AuspiciousApple OP t1_itwvq1f wrote
Reply to comment by _Arsenie_Boca_ in [D] What's the best open source model for GPT3-like text-to-text generation on local hardware? by AuspiciousApple
>I don't think any model you can run on a single commodity GPU will be on par with GPT-3.
That makes sense. I'm not an NLP person, so I don't have a good intuition on how these models scale or what the benchmark numbers actually mean.
In CV, the difference between a small and a large model might be a few percentage points of accuracy on ImageNet, but even the small models work reasonably well. FLAN-T5-XL, by contrast, seems to generate nonsense for roughly 90% of the prompts I've tried, whereas GPT-3 gives good output most of the time.
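For reference, a minimal sketch of how FLAN-T5-XL can be run locally with the Hugging Face transformers library (the model ID is the public checkpoint; the generation settings are just illustrative defaults, not necessarily what I used):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# FLAN-T5-XL is ~3B parameters; in fp16 it fits on a single ~12 GB GPU
model_id = "google/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

prompt = "Explain in one paragraph why the sky appears blue."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Sampling settings are illustrative; greedy decoding also works
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```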
Do you have any experience with these open models?
_Arsenie_Boca_ t1_ityccjh wrote
I don't think there is a fundamental difference between CV and NLP. However, we expect language models to be far more generalist than any vision model (have you ever seen a vision model that performs well on both discriminative and generative tasks across domains without fine-tuning?). I believe this is where scale is the enabling factor.