Paper Review: Voice-ENHANCE: Speech Restoration using a Diffusion-based Voice Conversion Framework

이용준 · May 30, 2025

Introduction

  • Speech restoration is a complicated task that must handle multiple acoustic distortions, such as reverberation and band-limitation.
  • Such a task becomes simpler when it is separated into steps.

Proposed Methods

  • To address the task, the proposed GSR method first performs speaker-agnostic restoration, followed by mel-spectrogram restoration with speaker style injection.
  • The restored mel spectrogram is fed into HiFi-GAN to generate the restored speech waveform.
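
The order of the stages above can be sketched as a simple composition. This is only an illustration of the data flow: `restore_speech` and its arguments `stage1`, `stage2`, and `vocoder` are hypothetical names, the stages are passed in as plain callables rather than real models, and the shapes (80 mel bins, 192-dimensional speaker embedding, hop length 256) are assumptions, not values taken from the paper.

```python
import numpy as np

def restore_speech(degraded_mel, speaker_embedding, stage1, stage2, vocoder):
    """Chain the three stages described in the review (hypothetical interface).

    stage1:  speaker-agnostic restoration (a ResUNet-based model in the paper)
    stage2:  VC-inspired mel restoration with speaker style injection
    vocoder: mel-to-waveform model (HiFi-GAN in the paper)
    """
    coarse_mel = stage1(degraded_mel)
    restored_mel = stage2(coarse_mel, speaker_embedding)
    return vocoder(restored_mel)

# Stand-in callables so the sketch runs; none of them is a real model.
identity_stage1 = lambda mel: mel
identity_stage2 = lambda mel, spk: mel
silent_vocoder = lambda mel: np.zeros(mel.shape[1] * 256)  # assumed hop length

mel = np.ones((80, 10))        # assumed (mel bins, frames)
speaker = np.zeros(192)        # assumed embedding size
waveform = restore_speech(mel, speaker, identity_stage1, identity_stage2, silent_vocoder)
```

Passing the stages as arguments keeps the sketch neutral about how each model is implemented and about where the speaker embedding is extracted from, which the review does not specify.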

Speaker-agnostic restoration

  • A ResUNet-based model is used for speaker-agnostic restoration.

VC inspired restoration

  • A Diff-VC-inspired framework is used.

  • The speaker encoder is ECAPA-TDNN.

  • The content encoder is transformer-based and is improved by placing HuBERT-VQ before it.

  • Training uses a weighted sum of two loss functions. The first is a reconstruction loss that pulls $\hat{M}$ closer to the clean mel filterbank.

  • The second is a diffusion loss that pulls the final output mel spectrogram, generated from both the speaker embedding and the content embedding, closer to the clean mel filterbank.

    $$L_{total}=L_d+\alpha L_{enc}, \\ L_{enc}(M_0,\hat{M})=D(M_0,\hat{M}), \\ L_d(M_0)=\text{Diffusion loss}$$
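
To make the weighting concrete, here is a minimal numeric sketch of the combined objective. It rests on assumptions not stated in the review: the distance $D$ is taken to be mean squared error, and the diffusion loss is replaced by a simple noise-prediction MSE as a stand-in. The function names and the value of `alpha` are hypothetical.

```python
import numpy as np

def mse(a, b):
    """Mean squared error, used here as an assumed choice for the distance D."""
    return float(np.mean((a - b) ** 2))

def encoder_loss(m0, m_hat):
    """L_enc(M_0, M_hat) = D(M_0, M_hat), with D assumed to be MSE."""
    return mse(m0, m_hat)

def diffusion_loss_standin(predicted_noise, true_noise):
    """Stand-in for L_d: a plain noise-prediction MSE (assumption, not the paper's exact loss)."""
    return mse(predicted_noise, true_noise)

def total_loss(l_d, l_enc, alpha):
    """L_total = L_d + alpha * L_enc."""
    return l_d + alpha * l_enc

# Toy values only, chosen so the arithmetic is easy to follow.
m0 = np.zeros((2, 2))              # clean mel (toy)
m_hat = np.ones((2, 2))            # reconstruction (toy)
true_noise = np.zeros((2, 2))
predicted_noise = np.full((2, 2), 2.0)

l_enc = encoder_loss(m0, m_hat)                             # 1.0
l_d = diffusion_loss_standin(predicted_noise, true_noise)   # 4.0
loss = total_loss(l_d, l_enc, alpha=0.5)                    # 4.0 + 0.5 * 1.0 = 4.5
```

With these toy inputs, `alpha` only scales the reconstruction term, so raising it puts more weight on matching $\hat{M}$ to the clean mel relative to the diffusion term.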

Results & Experiments

Thoughts

  • The authors highlight that the VC-inspired step helps restore many of the damaged speech components, but they do not show that this step actually performs the style injection. In that case, is the speaker-agnostic restoration really speaker-agnostic? How can one tell that the two tasks are decoupled?
