Submitted by koyo4ever t3_10ugxmc in deeplearning
Appropriate_Ant_4629 t1_j7clc8s wrote
The LAION project (https://laion.ai/) is probably the closest thing to this.
They have a great track record on projects at this scale. They've partnered with /r/datahoarders and volunteers to build training sets, including LAION-5B, a 5.8-billion image/text-pair dataset they used to train a better open version of CLIP (OpenCLIP).
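If you want to try the result of that effort, here's a minimal sketch of loading one of those LAION-trained checkpoints with the open_clip package (the model name and pretrained tag follow OpenCLIP's published checkpoints; the "cat.jpg" path and labels are illustrative assumptions, not anything from this thread):

```python
import torch
from PIL import Image
import open_clip

# Load a ViT-B/32 CLIP model pretrained on LAION-2B (one of
# OpenCLIP's published checkpoint tags).
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# "cat.jpg" is a placeholder image path; the labels are examples.
image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # L2-normalize, then softmax over scaled cosine similarities.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # e.g. high probability on "a photo of a cat"
```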
Their actual model training tends to happen on some of the larger European supercomputers, though. If I recall correctly, their CLIP derivative was trained with compute time donated on JUWELS. Jobs like that are too hard to split into average-laptop-sized tasks.
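To put "too hard to split" in perspective, here's a rough back-of-envelope in Python (every number is an assumption picked for scale, not LAION's actual setup):

```python
# Back-of-envelope: what synchronous data-parallel training would
# cost a home volunteer per optimizer step. All figures are rough
# assumptions for illustration only.

params = 428e6           # ~ViT-L/14 CLIP parameter count (approximate)
bytes_per_grad = 2       # fp16 gradients
grad_bytes = params * bytes_per_grad  # gradient payload per step

uplink_bytes_per_s = 20e6 / 8  # assumed 20 Mbit/s home uplink
seconds_per_sync = grad_bytes / uplink_bytes_per_s

print(f"gradient payload per step: {grad_bytes / 1e9:.2f} GB")
print(f"upload time per step:      {seconds_per_sync:.0f} s")
# Roughly 340 s just to ship gradients once per step, versus
# milliseconds over the InfiniBand fabric of a machine like JUWELS.
```

That gap (seconds-to-minutes per gradient sync over home broadband, versus milliseconds on a supercomputer interconnect) is the core reason volunteer laptops work for dataset building but not for this kind of training.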