kingfung1120 t1_iriks76 wrote on October 8, 2022 at 1:54 PM

What type of data are you inputting into the model?

perfopt OP t1_iril212 wrote on October 8, 2022 at 1:57 PM

The data is MFCCs created from audio files. Sort of like this - https://www.youtube.com/watch?v=szyGiObZymo

kingfung1120 t1_iripkba wrote on October 8, 2022 at 2:38 PM

I haven’t handled audio data before, but it seems like you are flattening data of shape [1723, 13] into a vector (correct me if I am wrong). That is definitely going to affect what the model can learn, since the data is sequential and 2-D.
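To make the flattening concrete, here is a minimal sketch in plain Python. It assumes a row-major flatten (the NumPy default), and the dummy `mfcc` matrix and the `flat_index` helper are made up for illustration, not taken from the OP's code:

```python
N_FRAMES, N_MFCC = 1723, 13  # shape mentioned in this thread

# Dummy stand-in for one MFCC matrix: each cell just records (frame, coefficient).
mfcc = [[(t, c) for c in range(N_MFCC)] for t in range(N_FRAMES)]

# Row-major flatten: all of frame 0's coefficients, then frame 1's, and so on.
flat = [value for frame in mfcc for value in frame]


def flat_index(t, c):
    """Position of (frame t, coefficient c) in the flattened vector (hypothetical helper)."""
    return t * N_MFCC + c


print(len(flat))                            # 22399 inputs to the MLP
print(flat[flat_index(5, 3)])               # (5, 3)
print(flat_index(1, 0) - flat_index(0, 0))  # 13: same coefficient, next frame
```

After flattening, the same coefficient in consecutive frames sits 13 positions apart, and nothing in the vector itself tells an MLP that those positions are related in time.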

Unfortunately, I haven’t studied or read anything on deep learning for audio data, so I can’t give you a more in-depth opinion. Based on my understanding, though, using a CNN or anything recurrent should improve model performance more than fine-tuning an MLP.
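As a rough illustration of why a convolution suits this kind of input, the sketch below slides one filter along the time axis of a (frames, coefficients) matrix. It is plain Python with random numbers rather than real MFCCs or a trained model, and the window length `k = 5` and the function name `conv1d_over_time` are arbitrary choices for the example:

```python
import random


def conv1d_over_time(x, kernel):
    """'Valid' cross-correlation along the time axis.

    x:      list of frames, each a list of n_mfcc floats
    kernel: list of k rows, each a list of n_mfcc weights
    Returns one value per window position (len(x) - k + 1 values).
    """
    k = len(kernel)
    out = []
    for start in range(len(x) - k + 1):
        total = 0.0
        for dt in range(k):
            frame = x[start + dt]
            weights = kernel[dt]
            total += sum(f * w for f, w in zip(frame, weights))
        out.append(total)
    return out


random.seed(0)
n_frames, n_mfcc, k = 1723, 13, 5  # k is an illustrative assumption

# Random stand-ins for an MFCC matrix and for one filter's weights.
x = [[random.gauss(0, 1) for _ in range(n_mfcc)] for _ in range(n_frames)]
kernel = [[random.gauss(0, 1) for _ in range(n_mfcc)] for _ in range(k)]

feature_map = conv1d_over_time(x, kernel)
print(len(feature_map))  # 1719 window positions
print(k * n_mfcc)        # 65 weights, reused at every position
```

The same 65 weights are applied at every time step, so the filter always sees neighbouring frames together. A dense layer on the flattened input would instead need 22399 separate weights per hidden unit.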

perfopt OP t1_iriqicz wrote on October 8, 2022 at 2:46 PM

Yes, you are correct. I am flattening data of shape (1723, 13).

I will try out CNN as well.

kingfung1120 t1_irirqor wrote on October 8, 2022 at 2:57 PM

Look forward to receiving updates from you ;)

perfopt OP t1_iritkyy wrote on October 8, 2022 at 3:13 PM

Certainly. I've got to travel for a couple of days, but I'll be back on this Tuesday after work.

perfopt OP t1_is5q78g wrote on October 13, 2022 at 2:27 PM

Got back to a totally crazy week at work. Finally got time to spend on my project. I think I need to simplify my inputs and give MFCC another try before jumping into CNNs.