gliptic t1_jd2bsc7 wrote
Reply to comment by lurkinginboston in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
In fact, GPT-3 is 175B. But GPT-3 is old now and doesn't make effective use of those parameters.
gliptic t1_jcjpy0h wrote
Reply to comment by cipri_tom in [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng
What's wrong with Arveycavey ;).
gliptic t1_j99y0cp wrote
Reply to [D] Large Language Models feasible to run on 32GB RAM / 8 GB VRAM / 24GB VRAM by head_robotics
RWKV can run on very little VRAM with rwkvstic streaming and 8-bit quantization. I haven't tested streaming, but I expect it's a lot slower. The 7B model sadly takes 8 GB with 8-bit quantization alone.
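Streaming trades speed for VRAM: the weights stay in CPU RAM and each layer gets copied to the GPU right before it runs, so every token pays the transfer cost. The 8 GB figure is just arithmetic: one byte per weight instead of two. A minimal sketch of per-row int8 quantization (not rwkvstic's actual code; the function names here are made up) looks like this:

```python
import torch

def quantize_per_row_int8(w: torch.Tensor):
    # One scale per output row; each weight is stored as a single byte.
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / 127.0
    q = torch.round(w / scale).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximate fp16 matrix right before the matmul.
    return (q.float() * scale).to(torch.float16)

# Rough memory math for a 7B-parameter model:
#   fp16: 7e9 params * 2 bytes ~= 14 GB
#   int8: 7e9 params * 1 byte  ~=  7 GB, plus scales and activations -> ~8 GB
w = torch.randn(4096, 4096)
q, s = quantize_per_row_int8(w)
print(w.nelement() * 2 / 2**20, "MiB as fp16")
print(q.nelement() * 1 / 2**20, "MiB as int8")
```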
gliptic t1_irmzeaq wrote
Reply to comment by RecklessRelentless99 in Enjoy the details. I work 16 hours edit and merge 380 RAW images of the moon and the final result was worth it by daryavaseum
The saturation is just turned up to reveal subtle differences in color. The moon is naturally almost monochromatic with any sensor.
gliptic t1_jee0fbk wrote
Reply to comment by yehiaserag in [P] Introducing Vicuna: An open-source language model based on LLaMA 13B by Business-Lead2679
Delta weights don't mean LoRA. They're just the difference (e.g. an XOR) between their new weights and the original weights, which you apply to the original LLaMA weights to recover the full model.
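As a rough sketch of the idea (hypothetical file names, assuming both checkpoints are plain fp16 state dicts; Vicuna ships its own conversion script, this is not that): publish only the difference from the base weights, and let anyone who already has LLaMA add it back.

```python
import torch

# Hypothetical paths; assumes plain fp16 state dicts.
base = torch.load("llama-13b.pth", map_location="cpu")
finetuned = torch.load("vicuna-13b.pth", map_location="cpu")

# Publisher side: ship only the difference, not the original weights.
delta = {k: finetuned[k] - base[k] for k in finetuned}
torch.save(delta, "vicuna-13b-delta.pth")

# User side: add the delta back onto your own copy of the LLaMA weights.
recovered = {k: base[k] + delta[k] for k in delta}

# An arithmetic delta is only exact up to fp16 rounding; XOR of the raw
# bit patterns is exactly reversible, which is why some releases use it.
delta_xor = {k: finetuned[k].view(torch.int16) ^ base[k].view(torch.int16)
             for k in finetuned}
recovered_xor = {k: (base[k].view(torch.int16) ^ delta_xor[k]).view(torch.float16)
                 for k in delta_xor}
```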