Submitted by hapliniste t3_10g5r52 in MachineLearning
With ChatGPT going mainstream and the general push to turn LMs into products, one problem remains: the cost of running such models.
To me, it seems counterproductive to store both language-modelling ability and world knowledge in the model weights.
Is it time to shift to retrieval LMs like RETRO to keep costs down while offering the same products?
It could allow Google or others to offer a free assistant service, using embedding similarity search to retrieve results from the Internet, so the model itself might even run on edge devices.
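A rough sketch of what I mean (everything here is made up for illustration: `embed()` stands in for any real sentence-encoder, and the final prompt would go to a small LM rather than being printed):

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real sentence-encoder; a real system would call an
    # actual embedding model here. This just gives a deterministic-ish
    # unit vector per string so the rest of the sketch runs.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

# The knowledge base lives outside the model weights.
corpus = [
    "RETRO conditions generation on chunks retrieved from a text database.",
    "Embedding similarity search finds passages closest to the query vector.",
]
index = np.stack([embed(doc) for doc in corpus])  # precomputed offline

def retrieve(query: str, k: int = 1) -> list[str]:
    # Cosine similarity reduces to a dot product on unit-norm vectors.
    scores = index @ embed(query)
    return [corpus[i] for i in np.argsort(scores)[::-1][:k]]

query = "How does RETRO use retrieval?"
context = "\n".join(retrieve(query))
# Only this short prompt goes to the small LM; the knowledge stayed in the index.
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)
```

The point is that the heavy part (the corpus and its index) can sit server-side or on disk, while the model that actually generates stays small.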
What are your thoughts about that subject?
hapliniste OP t1_j50pe93 wrote
Also, I think this could improve the actual "logic" of the model by focusing the small LM on that task, while the search component serves as the knowledge base.
Another benefit could be the ability to cite its sources.
It really seems like a no-brainer to me.
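To make the sources point concrete, a toy sketch (`build_prompt` and the one-entry corpus are made up; the arXiv link is the actual RETRO paper): keep the source URL next to every indexed chunk and ask the model to cite chunks by number.

```python
# Hypothetical: store source metadata alongside each indexed chunk so the
# answer can point back to where a claim came from.
corpus = [
    {"text": "RETRO retrieves from a database of trillions of tokens.",
     "url": "https://arxiv.org/abs/2112.04426"},  # the RETRO paper
]

def build_prompt(query: str, retrieved: list[dict]) -> str:
    # Number each chunk so the LM can be asked to cite [1], [2], ... inline.
    context = "\n".join(
        f"[{i + 1}] {doc['text']} (source: {doc['url']})"
        for i, doc in enumerate(retrieved)
    )
    return (f"Context:\n{context}\n\n"
            f"Answer the question and cite sources by number.\n"
            f"Question: {query}\nAnswer:")

print(build_prompt("What does RETRO retrieve from?", corpus))
```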