Submitted by lambolifeofficial t3_zzn35o in MachineLearning
ThatInternetGuy t1_j2d5nkm wrote
Reply to comment by 3deal in An Open-Source Version of ChatGPT is Coming [News] by lambolifeofficial
170 GB of VRAM minimum. So that's 8x RTX 4090 (8 × 24 GB = 192 GB).
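Rough back-of-the-envelope, if you want to sanity-check it (assuming fp16 weights at 2 bytes/param plus ~20% overhead for KV cache and activations; the model size is illustrative, not from the thread):

```python
# Rough VRAM estimate for serving a dense transformer LLM.
# Assumptions (not from the thread): fp16 weights (2 bytes/param)
# plus ~20% overhead for KV cache and activations.

def inference_vram_gb(n_params_billion: float,
                      bytes_per_param: int = 2,
                      overhead: float = 1.2) -> float:
    """Estimate inference VRAM in GB."""
    return n_params_billion * 1e9 * bytes_per_param * overhead / 1024**3

# A ~75B-param fp16 model lands near the 170 GB quoted above:
print(f"{inference_vram_gb(75):.0f} GB")          # ~168 GB
print(f"needs {168 / 24:.1f}+ RTX 4090s (24 GB each)")
```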
3deal t1_j2d8bj5 wrote
I mean, for a startup that's not very expensive given all the benefits it brings.
Disastrous_Elk_6375 t1_j2de4o2 wrote
Can 4090s pool their VRAM? I always thought that LLMs need datacenter GPUs from the A/V series (A100/V100) so that they can pool memory. Am I wrong in thinking that?
zaptrem t1_j2e2lvb wrote
You can do pipeline parallelism via FairScale or HF Accelerate on any set of identical (and sometimes non-identical) GPUs.
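With Accelerate it's basically just `device_map="auto"`, which splits the layers across all visible GPUs and moves activations between them. Minimal sketch (the checkpoint name is just a placeholder, swap in whatever you're serving):

```python
# Sketch: shard a causal LM across all visible GPUs with HF Accelerate.
# device_map="auto" (requires the `accelerate` package) assigns layers
# to devices and forwards activations between them at inference time.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-30b"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",         # spread layers over every available GPU
    torch_dtype=torch.float16, # fp16 halves the memory vs fp32
)

inputs = tokenizer("An open-source ChatGPT would", return_tensors="pt").to(0)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0]))
```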
ThatInternetGuy t1_j2deqmr wrote
You'd need to deploy the inference model with Colossal-AI.
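Roughly like this (going from memory of the 0.x-era Colossal-AI docs, so treat the config keys below as assumptions rather than a verified recipe):

```python
# Rough sketch, assuming Colossal-AI 0.x-era APIs; the config schema
# (parallel/tensor/mode/size) is from memory of its docs, not the thread.
import colossalai

config = dict(parallel=dict(tensor=dict(mode="1d", size=8)))  # 8-way tensor parallel
colossalai.launch_from_torch(config=config)  # run under `torchrun --nproc_per_node 8`

# ...build/load the model here; Colossal-AI shards supported layers
# across the 8 ranks so each GPU only holds a slice of those weights.
```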