Submitted by niclas_wue t3_10cgm8d in MachineLearning
ml-research t1_j4fpav0 wrote
Thanks for sharing!
> The website works by fetching new papers daily from arxiv.org, using PapersWithCode to filter out the most relevant ones.
What do you mean by "relevant"? What kinds of papers do you fetch?
niclas_wue OP t1_j4fqqy6 wrote
Thanks for asking! My first prototype collected all new arxiv papers in certain ML-related categories via the API, but I quickly realized that this would be way too costly. Right now, I collect all papers from PapersWithCode's "Top" (last 30 days) and "Social" tabs; the latter is based on Twitter likes and retweets. Finally, I filter using this formula:
p.number_of_likes + p.number_of_retweets > 20 or p.number_github_stars > 100
In rare cases, when a paper is really long or cannot be parsed with "grobid", I exclude it for now.
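A minimal sketch of that filter in Python, assuming a simple paper record with the field names from the formula above (the `Paper` class and its attributes are hypothetical, not the site's actual schema):

```python
# Hypothetical sketch of the relevance filter described above.
# Field names mirror the formula (number_of_likes, number_of_retweets,
# number_github_stars) and are assumptions, not a real API's schema.
from dataclasses import dataclass

@dataclass
class Paper:
    number_of_likes: int = 0
    number_of_retweets: int = 0
    number_github_stars: int = 0

def is_relevant(p: Paper) -> bool:
    # Keep a paper if its combined Twitter engagement exceeds 20
    # interactions, or its linked GitHub repo has more than 100 stars.
    return (p.number_of_likes + p.number_of_retweets > 20
            or p.number_github_stars > 100)

papers = [
    Paper(number_of_likes=15, number_of_retweets=10),  # 25 interactions -> kept
    Paper(number_github_stars=150),                    # star threshold -> kept
    Paper(number_of_likes=5),                          # filtered out
]
relevant = [p for p in papers if is_relevant(p)]  # keeps the first two
```

Note that both thresholds are strict: a paper with exactly 20 interactions or exactly 100 stars would be filtered out.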