Ok-Cartoonist8114 t1_j52mjrw wrote on January 19, 2023 at 11:33 PM

Here is a great paper from IBM following the retriever-reader paradigm. Love those "light" models that can be specialized by switching the index.

IMO the loss of ChatGPT is still interesting for retriever-reader approaches to generate either human-like or structured answers from input documents.

Here is a tool I made to create a retriever-reader pipeline in a minute: Cherche. I would also recommend Haystack on GitHub!

IntrepidTieKnot t1_j547lq2 wrote on January 20, 2023 at 7:23 AM

I made a tool that chops documents into chunks, creates embeddings for the chunks via GPT-3, and stores the embeddings in a Redis database. When I make a query, I create an embedding for it and look up my stored embeddings via cosine similarity.
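A minimal sketch of that chunk, embed, and look-up flow, under loud assumptions: `embed` is a toy bag-of-words stand-in for the GPT-3 embedding call, and a plain Python list stands in for the Redis store. All function names here are hypothetical and only illustrate the shape of the pipeline, not the actual tool.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """ASSUMPTION: toy stand-in for a GPT-3 embedding (word-count vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, size=8):
    """ASSUMPTION: an in-memory list of (chunk, embedding) replaces Redis."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc, size)]


def search(index, query, k=3):
    """Embed the query and return the k most similar chunks with scores."""
    q = embed(query)
    scored = [(cosine(q, e), c) for c, e in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


documents = [
    "Redis is an in-memory data store often used as a cache or a vector database.",
    "Sentence Transformers produce dense embeddings for sentences and paragraphs.",
]
index = build_index(documents)
results = search(index, "dense embeddings for sentences", k=1)
print(results)
```

Swapping `embed` for a real embedding model and the list for a vector store would keep the same structure: only the two assumed pieces change.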

My question is: isn't that the same as what your tool does? In other words: what can you do with Cherche that I cannot do with the setup I described? Is it that I don't need GPT-3 for the same result? Or what is it?

Ok-Cartoonist8114 t1_j54l5yh wrote on January 20, 2023 at 10:25 AM

Your pipeline is fine! Cherche is not fancy, it just allows you to create hybrid pipelines that rely on both language models and lexical matching, which can help a lot. Also, Cherche is primarily designed for computing embeddings with Sentence Transformers, which have a better ratio of precision to number of parameters.
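To illustrate the hybrid idea, here is a rough sketch in plain Python. It does not use the Cherche API. A lexical stage first filters documents by token overlap, then an embedding stage re-ranks the survivors by cosine similarity. The `toy_embed` function (character trigram counts) is a hypothetical stand-in for a Sentence Transformers encoder, and every name below is made up for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_retrieve(query, documents, k=10):
    """Lexical stage: keep documents sharing tokens with the query."""
    q = set(tokens(query))
    scored = [(len(q & set(tokens(d["text"]))), d) for d in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def toy_embed(text):
    """ASSUMPTION: trigram counts stand in for a real sentence encoder."""
    s = " ".join(tokens(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query, documents, k_retrieve=10, k_rank=3):
    """Retrieve lexically, then re-rank the candidates by embedding similarity."""
    candidates = lexical_retrieve(query, documents, k_retrieve)
    q = toy_embed(query)
    ranked = sorted(
        candidates,
        key=lambda d: cosine(q, toy_embed(d["text"])),
        reverse=True,
    )
    return ranked[:k_rank]


documents = [
    {"id": 1, "text": "Cherche builds neural search pipelines with retrievers and rankers."},
    {"id": 2, "text": "Haystack is a framework for building search pipelines."},
    {"id": 3, "text": "Paris is the capital of France."},
]
hits = hybrid_search("neural search pipelines", documents)
print([d["id"] for d in hits])
```

The lexical stage keeps the expensive embedding step limited to a handful of candidates, which is one reason this kind of two-stage setup works well with smaller encoders.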