Submitted by happyhammy t3_zm51z0 in MachineLearning
My theory:
- no good datasets, as opposed to image datasets like LAION
- harder/illegal to get music datasets. Large music datasets usually require shady methods (like torrenting). The only music datasets I've found are classical, and even those are very limited, as recordings of classical performances are still copyrighted.
Therefore, large companies like OpenAI/Google are unwilling to take the legal risk of building a good generative music AI. Startups have a better chance because they have less to lose and can more easily hide the fact that they trained their model on copyrighted material.
Other than that, I don't believe audio is more challenging to process than images, because a complete audio file can be converted to its spectrogram, which is essentially a 2D image.
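To make the spectrogram point concrete, here is a minimal sketch of turning a waveform into a 2D array with a short-time Fourier transform, using only NumPy. The function name `magnitude_spectrogram`, the frame size, the hop length, and the 440 Hz test tone are all illustrative assumptions, not anything from a specific music model. Note that a magnitude spectrogram like this one drops the phase, so getting audio back out of it needs an extra reconstruction step.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_size=1024, hop=256):
    """Return a 2D array (frequency bins x time frames) of STFT magnitudes.

    Hypothetical helper for illustration; frame_size and hop are arbitrary choices.
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies: frame_size // 2 + 1 bins.
    # Taking the absolute value discards the phase.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Assumed test input: one second of a 440 Hz sine at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

spec = magnitude_spectrogram(tone)
print(spec.shape)  # (frequency bins, time frames)
```

The result is an ordinary 2D array that can be fed to image-style models, with frequency on one axis and time on the other.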
TLDR: No good datasets
pucklermuskau t1_j0aacca wrote
Dadabots' procedural death metal is awesome.
https://youtu.be/MwtVkPKx3RA