Submitted by niclas_wue t3_10cgm8d in MachineLearning
ml-research t1_j4fpav0 wrote
Thanks for sharing!
> The website works by fetching new papers daily from arxiv.org, using PapersWithCode to filter out the most relevant ones.
What do you mean by "relevant"? What kinds of papers do you fetch?
niclas_wue OP t1_j4fqqy6 wrote
Thanks for asking! My first prototype collected all new arxiv papers in certain ML-related categories via the API, but I quickly realized that this would be way too costly. Right now, I collect all papers from PapersWithCode's "Top" (last 30 days) and "Social" tabs; the latter is based on Twitter likes and retweets. Finally, I filter using this formula:
p.number_of_likes + p.number_of_retweets > 20 or p.number_github_stars > 100
In rare cases, when a paper is really long or cannot be parsed with "grobid", I exclude it for now.
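A minimal sketch of that filter in Python, assuming a simple paper record with the field names from the formula above (the `Paper` class and its attributes are hypothetical, not the site's actual schema):

```python
# Hypothetical sketch of the relevance filter described above.
# Field names mirror the formula (number_of_likes, number_of_retweets,
# number_github_stars) and are assumptions, not a real API's schema.
from dataclasses import dataclass

@dataclass
class Paper:
    number_of_likes: int = 0
    number_of_retweets: int = 0
    number_github_stars: int = 0

def is_relevant(p: Paper) -> bool:
    # Keep a paper if its combined Twitter engagement exceeds 20
    # interactions, or its linked GitHub repo has more than 100 stars.
    return (p.number_of_likes + p.number_of_retweets > 20
            or p.number_github_stars > 100)

papers = [
    Paper(number_of_likes=15, number_of_retweets=10),  # 25 interactions -> kept
    Paper(number_github_stars=150),                    # star threshold -> kept
    Paper(number_of_likes=5),                          # filtered out
]
relevant = [p for p in papers if is_relevant(p)]  # keeps the first two
```

Note that both thresholds are strict: a paper with exactly 20 interactions or exactly 100 stars would be filtered out.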