Submitted by besabestin t3_10lp3g4 in MachineLearning
cdsmith t1_j60q0bs wrote
Reply to comment by Taenk in Few questions about scalability of chatGPT [D] by besabestin
I can only answer about Groq. I'm not trying to sell you Groq hardware, honestly... I just honestly don't know the answers for other accelerator chips.
Groq very likely increases inference speed and power efficiency over GPUs; that's actually its main purpose. How much depends on the model, though. I'm not in marketing so I probably don't have the best resources here, but there are some general performance numbers (unfortunately no comparisons) in this article, and this one talks about a very specific case where a Groq chip gets you a 1000x inference performance advantage over the A100.
To run a model on a Groq chip, you would typically start before CUDA enters the picture at all, and convert from PyTorch, TensorFlow, or a model in several other common formats into a Groq program using https://github.com/groq/groqflow. If you have custom-written CUDA code, then it's likely you've got some programming work ahead of you to run it on anything besides a GPU.
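For a concrete picture, here's a minimal sketch of that conversion path for a small PyTorch model, based on groqflow's `groqit` entry point as shown in the repo's examples; treat the exact arguments as assumptions, since they may differ by groqflow version:

```python
# Minimal sketch: compiling a small PyTorch model into a Groq program
# with groqflow. Assumes the `groqit(model, inputs)` entry point from
# the repo's examples; details may vary by groqflow version.
import torch
from groqflow import groqit

class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(128, 10)

    def forward(self, x):
        return self.fc(x)

pytorch_model = TinyModel()
inputs = {"x": torch.randn(1, 128)}  # keys match forward() argument names

# groqit traces and compiles the model into a Groq program; note that
# CUDA never enters the picture.
groq_model = groqit(pytorch_model, inputs)

# The returned object runs inference on the Groq chip.
outputs = groq_model(**inputs)
```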
lucidrage t1_j61so7l wrote
>convert from PyTorch, TensorFlow, or a model in several other common formats into a Groq program
Is there any effort to add a plugin for a high-level framework like Keras so it automatically uses Groq?
cdsmith t1_j62a3yv wrote
I'm not aware of any effort to build it into Keras, but Keras models are one of the things you can easily convert to run on Groq chips using groqflow.
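Roughly, the same `groqit` call should take a tf.keras model directly; this is a sketch under that assumption (the input-dict key matching the model's call argument is also an assumption):

```python
# Sketch: handing a Keras model straight to groqflow, assuming groqit
# accepts tf.keras models the same way it accepts PyTorch ones; the
# "inputs" dict key matching the model's call argument is an assumption.
import tensorflow as tf
from groqflow import groqit

keras_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
    tf.keras.layers.Dense(10),
])

sample = {"inputs": tf.random.uniform((1, 128))}

# No Keras plugin needed: the conversion is groqflow's job, not Keras's.
groq_model = groqit(keras_model, sample)
outputs = groq_model(**sample)
```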