CeFurkan OP t1_j7tob1u wrote
Reply to comment by logsinh in [D] Are there any AI model that I can use to improve very bad quality sound recording? Removing noise and improving overall quality by CeFurkan
>Nvidia RTX voice
Here is an example link; you can download it and extract the audio quickly if you wish: https://youtu.be/2zY1dQDGl3o
Also, here is a 5-minute speech example: https://sndup.net/stjs/
logsinh t1_j7tqjmm wrote
The audio is a bit distorted, possibly due to noise gating. I don't see much noise, so noise reduction may not be what you need. The audio has 8 kHz bandwidth (16 kHz sample rate), so you could try an audio super-resolution network such as https://github.com/mindslab-ai/nuwave2 to increase the audio bandwidth.
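To check the sample rate of a recording yourself, here is a minimal sketch using Python's standard `wave` module. The function name `wav_bandwidth_hz` is made up for illustration, and it assumes the audio has already been exported as an uncompressed WAV file. The value it returns is only the upper limit set by the sample rate (half of it); the actual speech content may stop lower.

```python
import wave


def wav_bandwidth_hz(path):
    """Return (sample_rate, maximum representable frequency) for a WAV file.

    Hypothetical helper: the maximum frequency is simply sample_rate / 2,
    which is an upper bound, not a measurement of the real content.
    """
    with wave.open(path, "rb") as wf:
        sample_rate = wf.getframerate()
    return sample_rate, sample_rate / 2
```

For a 16 kHz file this reports 8000.0 Hz as the upper limit, which matches the bandwidth mentioned above.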
CeFurkan OP t1_j7tr5at wrote
Yes, I tried some options in OBS a while back. It was probably the noise gate; I had even forgotten about it.
Thank you so much for the reply. I'm going to test that repo now.
CeFurkan OP t1_j7trlei wrote
Their example shows a really good improvement, but do I need to train the model for that?
I opened an issue thread, but I don't have much hope: https://github.com/mindslab-ai/nuwave2/issues/11
logsinh t1_j7tu0x1 wrote
Just download the checkpoint and use the command in the Inference section. sr should be 16000.
CeFurkan OP t1_j7tvbny wrote
Thanks, I made it work.
However, I got an out-of-memory error on an RTX 3060 with 12 GB of VRAM.
It is like a joke :/
logsinh t1_j7uw4wb wrote
Processing with a sliding window would solve your problem; see e.g. https://colab.research.google.com/github/asteroid-team/asteroid/blob/master/notebooks/04_ProcessLargeAudioFiles.ipynb
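The sliding-window idea can be sketched roughly as follows. This is not code from nuwave2 or from the linked notebook; `process_in_windows` and `process_fn` are hypothetical names, and the sketch assumes the model call can be wrapped in a function that returns as many samples as it receives. It uses plain Python lists to stay dependency-free; with real audio you would more likely use NumPy arrays or torch tensors.

```python
def process_in_windows(samples, process_fn, window=16000, hop=8000):
    """Run process_fn over overlapping chunks and cross-fade the results.

    Assumption: process_fn(chunk) returns a sequence with the same length
    as chunk. Only one chunk is passed to process_fn at a time, so peak
    memory depends on `window`, not on the length of the recording.
    """
    if hop <= 0 or hop > window:
        raise ValueError("hop must be between 1 and window")
    n = len(samples)
    out = [0.0] * n
    weight = [0.0] * n
    start = 0
    while start < n:
        chunk = samples[start:start + window]
        processed = process_fn(chunk)
        m = len(chunk)
        for i in range(m):
            # Triangular weight: small near the chunk edges, largest in the
            # middle, so overlapping chunks blend smoothly.
            w = min(i + 1, m - i)
            out[start + i] += w * processed[i]
            weight[start + i] += w
        if start + window >= n:
            break
        start += hop
    # Normalise by the total weight each sample received.
    return [o / w for o, w in zip(out, weight)]
```

In use, `process_fn` would wrap whatever call runs the model on one chunk, and `window` would be lowered until the chunk fits in GPU memory.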
CeFurkan OP t1_j7vouon wrote
Thanks.
I have no idea where to put this code in nuwave2, though.
logsinh t1_j7tsvku wrote
Anyway, here is the denoised audio of your example speech: https://www.sndup.net/pbxf/. There is no improvement, so your best bet is audio super-resolution.
Input: Speech MOS: 4.259 Noise MOS: 4.369 Overall MOS: 3.927
Output: Speech MOS: 4.263 Noise MOS: 4.403 Overall MOS: 3.947
CeFurkan OP t1_j7ttvbm wrote
>audio super-resolution
Thank you so much for the answers and the testing.
Any idea how to get super-resolution? Or is my only option mindslab-ai/nuwave2?