
Lawjarp2 t1_j9uj86z wrote

In some tasks the 7B model seems close enough to the original GPT-3 175B. With some optimization it could probably be run on a good laptop with a reasonable loss in accuracy.
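For a rough sense of why a laptop is plausible, here is a back-of-the-envelope sketch. It assumes "optimization" means quantizing the weights to fewer bits, and it counts only the weights (no activations or runtime overhead). The helper name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumption: 7B means 7e9 parameters; 16-bit is the unquantized baseline.
for bits in (16, 8, 4):
    print(f"7B at {bits}-bit: ~{weight_memory_gb(7e9, bits):.1f} GB")
```

Under those assumptions the weights come to about 14 GB at 16-bit, 7 GB at 8-bit, and 3.5 GB at 4-bit. How much accuracy is lost at each level is a separate question this doesn't answer.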

The 13B doesn't outperform GPT-3 on everything, but the 65B one does. Still, it's kinda weird to see their 13B model be nearly as good as their 65B one.

However, all their models are worse than the biggest Minerva model.
