Submitted by NLP2829 t3_yu8nna in MachineLearning
(I only want to do inference; I don't need to fine-tune it.)
I want to use a very large language model (#parameters > 100B) for some experiments. Is it true that the only very large language model we can get access to is the GPT-3 API? Is there any chance of getting access to PaLM or Flan-PaLM 540B at no cost?
I have searched the internet but can't find a definite answer. Since GPT-3 pricing for text-davinci-002 is not cheap, I am wondering if there's a chance to use other models.
Also, I can request up to 372GB of VRAM. Is there any large language model (#parameters > 100B) that I can actually download and run "locally"?
allwordsaremadeup t1_iw83d9j wrote
BLOOM not big enough for you? 176B parameters. Can be downloaded here: https://huggingface.co/bigscience/bloom
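For inference only, something like the following should work with the transformers + accelerate stack. This is an untested sketch, not an official recipe: bfloat16 and device_map="auto" are my assumptions, and the roughly 350GB of bf16 weights would need to fit in your 372GB of VRAM.

    # Untested sketch: load BLOOM for inference with Hugging Face
    # transformers + accelerate (pip install transformers accelerate).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "bigscience/bloom"  # 176B parameters

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,  # ~2 bytes per parameter, ~350GB of weights
        device_map="auto",           # shard layers across all visible GPUs
    )

    prompt = "The capital of France is"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

If the weights don't quite fit on the GPUs, accelerate can offload the remainder to CPU RAM or disk; it still runs, just slower.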