UseNew5079 t1_jecefwx wrote on March 31, 2023 at 12:03 AM

Reply to [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679

Excellent quality responses from this model. This can be actually usable.

UseNew5079 t1_je9wrw6 wrote on March 30, 2023 at 2:05 PM

Reply to comment by Akimbo333 in How much smaller can a GPT-4-level model get? by Technologenesis

Check LLama paper: https://arxiv.org/pdf/2302.13971.pdf

Specifically this graph: https://paste.pics/6f817f0aa71065e155027d313d70f18c

They increase performance (reduce loss) with parameters or training time. More parameters just allow for faster and deeper initial drop in error/loss but later part looks the same for all model sizes. At least that is my interpretation.

UseNew5079 t1_je6lgyb wrote on March 29, 2023 at 7:52 PM

Reply to How much smaller can a GPT-4-level model get? by Technologenesis

Maybe 7b model can get GPT-4 level performance if trained for _very_ long. Facebook paper showed that performance increased until the end of training and it looks like there was no plateau. Maybe it's just very inefficient but possible? Or maybe there is another way.

UseNew5079 t1_jc2a9t1 wrote on March 13, 2023 at 3:00 PM

Reply to comment by sigoden in AIChat: A cli tool to chat with gpt-3.5/chatgpt in terminal. by sigoden

Oh i see. That makes sense. So if you try to send more than 4096 tokens the API will respond with error or will it charge you for >4096 tokens and just skip oldest?

UseNew5079 t1_jc204th wrote on March 13, 2023 at 1:47 PM

Reply to AIChat: A cli tool to chat with gpt-3.5/chatgpt in terminal. by sigoden

Really cool project! There should be a configuration option to limit size of input text to protect from mistakes like `cat big_big_file | aichat`

UseNew5079 t1_j8xiqa3 wrote on February 17, 2023 at 5:36 PM

Reply to Microsoft Killed Bing by Neurogence

At least they have shown what is possible. There is no going back.