fourcornerclub OP t1_iylpygx wrote

u/no_witty_username, and yet the standard in data sourcing still seems to be "let me see what's open source and what I can scrape from the internet, and then I'll tune the model from there." Makes no sense to me!

FlattenLayer t1_iylz5rc wrote

CTR models are built to predict click-through rates in recommendation systems like those at TikTok and Google, and they are fed tens of billions of samples from exposure logs. In this case, the most important thing is keeping the exposure log clean. But that's not easy, because there is a long and complex pipeline from the exposure log to the training samples.
