Submitted by Time_Key8052 t3_125pbbf in deeplearning
Comments
I_will_delete_myself t1_je6h8xr wrote
The domain name and the prefix also don't make it seem sketchy at all whatsoever. tistory.com and gpt4chat make me think it's trying to abuse SEO.
Orngog t1_jedpdzw wrote
Why tistory? I feel like I'm missing something.
sEi_ t1_je6yhig wrote
I do not know their model, but playing with a 13B model (Alpaca 13B), albeit a small one, is fun on my potato PC. Fun, but nothing more than that.
Praise_AI_Overlords t1_je7pw83 wrote
Curie is 6.7B and it is surprisingly strong.
I_will_delete_myself t1_jeewl7u wrote
Personally, I think the limit with those models is just that the amount of information each weight can hold is limited.
Praise_AI_Overlords t1_jeez5po wrote
That is very likely. I wonder how this works for multimodality. Weights would probably have to hold more.
artsybashev t1_je65qs7 wrote
A 13B model is quite small. Given that the company is focusing on AI hardware, the dataset and other parts of the model might be lagging a bit. The lack of comparison to other models also suggests that the performance is not that good.
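For a sense of scale on the parameter counts mentioned in this thread (13B and 6.7B), here is a rough back-of-the-envelope sketch of how much memory the weights alone would need at a few common precisions. It assumes memory is simply parameter count times bytes per parameter, and it ignores activations, KV cache, and runtime overhead. The function name and the precision table are illustrative, not taken from any particular library.

```python
# Rough weight-only memory estimate (assumption: params * bytes per param).
# Ignores activations, KV cache, and framework overhead, so real usage is higher.

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params: float, precision: str) -> float:
    """Return the approximate size of the weights in GiB (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


if __name__ == "__main__":
    # Parameter counts as quoted in the thread.
    for label, num_params in [("13B", 13e9), ("6.7B", 6.7e9)]:
        for precision in BYTES_PER_PARAM:
            size = weight_memory_gib(num_params, precision)
            print(f"{label} @ {precision}: ~{size:.1f} GiB")
```

Under these assumptions, a 13B model needs roughly 24 GiB for fp16 weights but only about 6 GiB at 4 bits per weight, which is one plausible reason it can be "fun on a potato PC" when quantized.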