visarga t1_j9y7624 wrote on February 25, 2023 at 12:48 PM

Does it do only one round of retrieval?

davidmezzetti OP t1_j9y7tmq wrote on February 25, 2023 at 12:55 PM

With the current version, yes: it runs one embeddings query per message. I plan to handle threaded conversations shortly. In that scenario, the chat history will be included in the prompt.
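
For illustration only, here is a rough sketch of how chat history might be folded into the prompt alongside the retrieved context. This is not the application's actual code: `build_prompt`, the template wording, and the `(speaker, text)` history format are all hypothetical.

```python
def build_prompt(question, context, history=()):
    """Assemble one prompt from retrieved passages, prior turns, and the new question.

    Hypothetical helper: the template text and argument names are assumptions.
    """
    lines = ["Answer the question using the context below.", "", "Context:"]
    lines.extend(f"- {passage}" for passage in context)

    # Only add the conversation block when there are earlier turns
    if history:
        lines.extend(["", "Conversation so far:"])
        lines.extend(f"{speaker}: {text}" for speaker, text in history)

    lines.extend(["", f"Question: {question}", "Answer:"])
    return "\n".join(lines)


# Example thread: the follow-up question only makes sense with the earlier turn
history = [("user", "Who wrote Dune?"), ("assistant", "Frank Herbert.")]
prompt = build_prompt("When was it published?", ["Dune was published in 1965."], history)
print(prompt)
```

Without the history block, a follow-up like "When was it published?" gives the model nothing to resolve "it" against, which is the gap a threaded mode would close.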

dancingnightly t1_ja2lfup wrote on February 26, 2023 at 11:00 AM

Is the current version mostly RAG + WebGPT-style semantic search feeding into a GPT answer, then?

Big fan of your recent work.

davidmezzetti OP t1_ja345mn wrote on February 26, 2023 at 2:20 PM

Thank you.

This application is RAG with a local vector index combined with an LLM from the FLAN-T5 series of models.

The whole solution can be locally hosted with no remote runtime API dependencies.
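
As a minimal sketch of that flow (one retrieval round against a local index, then one generation call), assuming toy bag-of-words vectors in place of a real embeddings model and a pluggable `generate` callable in place of FLAN-T5. `embed`, `LocalIndex`, and `answer` are hypothetical names, not the application's API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class LocalIndex:
    """In-memory vector index; nothing leaves the process."""

    def __init__(self, documents):
        self.rows = [(doc, embed(doc)) for doc in documents]

    def search(self, query, limit=1):
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [doc for doc, _ in ranked[:limit]]


def answer(question, index, generate, limit=1):
    """Single-round RAG: retrieve once, build a prompt, call the generator."""
    context = " ".join(index.search(question, limit))
    prompt = f"Answer the question using the context.\nContext: {context}\nQuestion: {question}"
    return generate(prompt)


index = LocalIndex([
    "FLAN-T5 is an instruction-tuned sequence-to-sequence model.",
    "A vector index stores embeddings for similarity search.",
])

# Echo the prompt instead of calling a model, to show what the LLM would receive
print(answer("What does a vector index store?", index, generate=lambda p: p))
```

In a real setup, `generate` would wrap a locally loaded model, for example a Hugging Face `transformers` text2text-generation pipeline with a FLAN-T5 checkpoint, and `embed` would be a sentence embeddings model.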