Submitted by fourcornerclub t3_z9qx7a in MachineLearning
I was recently having this debate with a data engineering friend. My position was that as foundational models "eat the world," it will become more valuable to be good at sourcing high-quality training data for fine-tuning than at building new models. Would love to trigger a wider debate here!
alex_lite_21 t1_iyi9g5g wrote
I agree with this. It relates to the garbage-in, garbage-out concept. I am also not a fan of data augmentation, at least not in the way it is commonly done, without much thought. Getting high-quality data is paramount.
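To make the "augmentation without thinking" point concrete, here is a minimal, hypothetical sketch (not from the original thread): blindly applying a horizontal flip to a character dataset silently corrupts labels for asymmetric glyphs, since a flipped "b" reads as "d".

```python
# Hypothetical sketch: why blind augmentation can corrupt labels.
# Horizontally flipping an asymmetric glyph changes its semantics,
# so flip-augmenting such a dataset without filtering classes
# silently injects mislabeled examples.

def hflip(image):
    """Flip each row of a 2D image (list of lists) left-to-right."""
    return [row[::-1] for row in image]

# Toy 3x3 glyph loosely standing in for the character "b"
# (asymmetric under horizontal flip).
b_glyph = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 0],
]

flipped = hflip(b_glyph)
# The flipped glyph now resembles "d", but a naive pipeline would still
# label it "b" -- a careful augmenter must exclude classes like this.
print(flipped == b_glyph)  # False: the augmentation changed the glyph
```

The fix is not to abandon augmentation but to check, per class, that each transform actually preserves the label.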