GandhisLittleHelper t1_j9xwlz9 wrote on February 25, 2023 at 10:31 AM

Reply to comment by JGoodle in [D] Simple Questions Thread by AutoModerator

The same approach, but feed frames from the videos into something like a CNN-RNN model, which keeps a memory of previous frames. It will obviously be a much bigger dataset, though.

GandhisLittleHelper t1_j9xwca7 wrote on February 25, 2023 at 10:27 AM

Reply to [D] Simple Questions Thread by AutoModerator

Has anyone made spectrogram-to-spectrogram models for music analysis, specifically demixing such as isolating vocals? I’m currently using a Mel spectrogram for both the input and the output but struggling to get good results. I'm using hop length=512, n_fft=2048, no_mels=128. My model is currently a bidirectional GRU with 3 layers and a hidden size of 256. Does anyone know a good model type to use and/or good audio transformations for this project?
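For reference, here is a rough sketch of the tensor shapes and time/frequency resolution those settings imply. The sample rate (22050 Hz) and the 10-second clip length are assumptions, not something I've fixed, and the frame count assumes a centre-padded STFT (1 + n_samples // hop_length). The function names are made up for illustration.

```python
# Values quoted above.
HOP_LENGTH = 512
N_FFT = 2048
N_MELS = 128


def num_frames(n_samples: int, hop_length: int) -> int:
    """Frame count for a centre-padded STFT: 1 + n_samples // hop_length."""
    return 1 + n_samples // hop_length


def describe(sample_rate: int, clip_seconds: float,
             hop_length: int = HOP_LENGTH, n_fft: int = N_FFT,
             n_mels: int = N_MELS) -> dict:
    """Summarise spectrogram shape and resolution for one clip."""
    n_samples = int(sample_rate * clip_seconds)
    frames = num_frames(n_samples, hop_length)
    return {
        "n_samples": n_samples,
        "frames": frames,
        "mel_shape": (n_mels, frames),        # (mel bands, time frames)
        "linear_bins": n_fft // 2 + 1,        # bins in the underlying STFT
        "hop_ms": 1000.0 * hop_length / sample_rate,
        "window_ms": 1000.0 * n_fft / sample_rate,
        "bin_hz": sample_rate / n_fft,        # spacing of linear STFT bins
    }


if __name__ == "__main__":
    # Assumed: 22050 Hz sample rate, hypothetical 10 s clip.
    for key, value in describe(22050, 10.0).items():
        print(f"{key}: {value}")
```

So each GRU step sees a 128-value mel frame, while the underlying STFT has n_fft // 2 + 1 = 1025 linear bins. The mel output is therefore a compressed view of the spectrum, which may matter when turning the predicted spectrogram back into audio.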

Cheers