fourcornerclub OP t1_iylpygx wrote
Reply to comment by no_witty_username in [Discussion] - "data sourcing will be more important than model building in the era of foundational model fine-tuning" by fourcornerclub
u/no_witty_username And yet the standard in data sourcing still seems to be "let me see what's open source and what I can scrape from the internet, and then I'll tune the model from there". Makes no sense to me!
FlattenLayer t1_iylz5rc wrote
A CTR model is built to predict click-through rates in recommendation systems like those at TikTok and Google, and it is fed tens of billions of samples from the exposure logs. In this case, the most important thing is keeping the exposure log clean. That's not easy, though, because there is a long and complex pipeline from the exposure log to the training samples.
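To make "keeping the exposure log clean" a bit more concrete, here is a rough sketch of one early stage of such a pipeline. Everything in it is an assumption for illustration: the `Exposure` record, its field names, and the two helper functions are hypothetical, and a real pipeline would have many more stages (sessionization, delayed click attribution, bot filtering, and so on).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exposure:
    # Hypothetical log schema, not taken from any real system.
    impression_id: str
    user_id: str
    item_id: str
    ts: int  # Unix seconds


def clean_exposures(rows):
    """Drop malformed rows and duplicate impression IDs (first one wins)."""
    seen = set()
    cleaned = []
    for row in rows:
        # Malformed: any empty ID or a non-positive timestamp.
        if not (row.impression_id and row.user_id and row.item_id) or row.ts <= 0:
            continue
        # Duplicate: the same impression logged more than once.
        if row.impression_id in seen:
            continue
        seen.add(row.impression_id)
        cleaned.append(row)
    return cleaned


def build_samples(exposures, clicked_impression_ids):
    """Turn cleaned exposures into (user_id, item_id, label) tuples.

    label is 1 if the impression was clicked, else 0.
    """
    clicked = set(clicked_impression_ids)
    return [
        (e.user_id, e.item_id, int(e.impression_id in clicked))
        for e in clean_exposures(exposures)
    ]
```

Even in this toy version, a duplicated impression that slips through would turn into an extra negative sample, which is the kind of silent label noise that gets expensive at tens of billions of rows.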