Submitted by davidmezzetti t3_11bk12r in MachineLearning
visarga t1_j9y7624 wrote
Does it do only one round of retrieval?
davidmezzetti OP t1_j9y7tmq wrote
With the current version, yes, it runs one embeddings query for each message. I plan to handle threaded conversations shortly. In that scenario, the chat history will be provided to the prompt.
dancingnightly t1_ja2lfup wrote
Is this current version mostly RAG + WebGPT semantic search to GPT answer, then?
Big fan of your recent work.
davidmezzetti OP t1_ja345mn wrote
Thank you.
This application is RAG with a local vector index combined with an LLM from the FLAN-T5 series of models.
The whole solution can be locally hosted with no remote runtime API dependencies.
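For readers new to the pattern, here is a rough sketch of what single-round retrieval feeding a prompt can look like. It is not the project's code: `embed`, `VectorIndex`, and `build_prompt` are hypothetical names, and the bag-of-words "embedding" is a toy stand-in for a real embedding model. The final step, passing the prompt to a locally hosted model such as one from the FLAN-T5 series, is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): bag-of-words term counts, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorIndex:
    """Hypothetical in-memory index: stores documents alongside their vectors."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.vectors = [embed(d) for d in self.documents]

    def search(self, query, limit=1):
        """Return the top `limit` (document, score) pairs for the query."""
        q = embed(query)
        scored = zip(self.documents, (cosine(q, v) for v in self.vectors))
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


def build_prompt(question, index, limit=1):
    """One round of retrieval: run a single query, then place the hits in the prompt."""
    context = "\n".join(doc for doc, _ in index.search(question, limit))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


index = VectorIndex([
    "The index is stored on local disk.",
    "FLAN-T5 is a family of instruction-tuned language models.",
    "Bananas are yellow.",
])
prompt = build_prompt("Which language models are instruction-tuned?", index)
```

The resulting `prompt` string is what would be handed to the language model. Nothing in the sketch calls out over the network, which is the point of the local setup described above.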