curiousshortguy t1_jad9s4t wrote
Reply to comment by Beli_Mawrr in [R] Microsoft introduce Kosmos-1, a Multimodal Large Language Model (MLLM) that can perceive general modalities, learn in context (i.e., few-shot), and follow instructions (i.e., zero-shot) by MysteryInc152
It is. You can probably do 2 to 8 billion parameters on your average gaming PC, and 16 billion on a high-end one.
AnOnlineHandle t1_jaeshwf wrote
Is there a way to convert parameter count into VRAM requirements? Presuming that's the main bottleneck?
metal079 t1_jaeuymi wrote
Rule of thumb is roughly 2 GB of VRAM per billion parameters (i.e., fp16 weights), though I recall Pygmalion, which is 6B, says it needs 16 GB of RAM, so it depends.
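A minimal sketch of that rule of thumb in Python (the 2 GB/billion figure assumes fp16 weights; the overhead multiplier for activations, KV cache, and framework bookkeeping is an assumed fudge factor, not a measured value):

```python
def estimate_vram_gb(n_params_billion, bytes_per_param=2, overhead=1.2):
    """Rough VRAM estimate: weight memory times an assumed
    overhead factor for activations, KV cache, etc."""
    weights_gb = n_params_billion * 1e9 * bytes_per_param / 1e9
    return weights_gb * overhead

# e.g. a 6B model in fp16: 12 GB of weights, ~14.4 GB with overhead
print(estimate_vram_gb(6))  # 14.4
```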
curiousshortguy t1_jaf3aab wrote
Yeah, about 2-3 GB per billion. You can easily offload layers of the network to disk and then load even larger models that don't fit in VRAM, but disk I/O will make inference painfully slow.
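For instance, the `accelerate` integration in Hugging Face `transformers` can spill layers to CPU RAM and then disk automatically. A minimal sketch, assuming a 6B checkpoint as a placeholder (the model name and offload folder are illustrative, not prescribed):

```python
import torch
from transformers import AutoModelForCausalLM

# device_map="auto" fills the GPU first, then CPU RAM, then the
# offload folder on disk; requires the `accelerate` package.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b",       # placeholder model id
    torch_dtype=torch.float16,   # halve memory vs fp32
    device_map="auto",
    offload_folder="offload",    # disk spillover for layers that fit nowhere else
)
```

Layers offloaded to disk get streamed back in on every forward pass, which is exactly why inference slows to a crawl.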
new_name_who_dis_ t1_jaf4lmy wrote
Each float32 is 4 bytes.
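So raw weight memory is just parameter count times element size; a quick sanity check in Python (the 6B parameter count and dtype choices here are only illustrative):

```python
import torch

for dtype in (torch.float32, torch.float16, torch.int8):
    bytes_per = torch.tensor([], dtype=dtype).element_size()
    gb = 6e9 * bytes_per / 1e9  # hypothetical 6B-parameter model
    print(f"{dtype}: {bytes_per} B/param -> {gb:.0f} GB of weights")
# float32: 4 B/param -> 24 GB; float16: 2 -> 12 GB; int8: 1 -> 6 GB
```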