
MysteryInc152 OP t1_j95r8ni wrote

A much simpler approach than LangChain (and this one is self-supervised), but they attempt to do the same thing.


yoshiwaan t1_j96wt2g wrote

Really? As in, the order of operations is: token parsing => Toolformer => LLM?

Genuine question: is the text/token parsing for queries to an LLM (e.g. ChatGPT) performed separately, before the actual LLM is invoked, or is the text/token parsing part of the LLM itself? I figured it was the latter and that you couldn't just insert a tool there.

Edit: I think this is a new model built for this purpose, rather than a reuse of an existing LLM (e.g. ChatGPT) as I first assumed, which makes more sense.

Edit 2: I actually read the paper, and the LM itself is taught to reach out to tools as part of its response generation; it's not something separate.


MysteryInc152 OP t1_j96y474 wrote

It's not a new model. It's davinci-003.

Basically, the model begins generating. Once it emits an API request, the request is parsed and sent, the result is pasted back into the text, and that text is sent back to OpenAI to continue generation. GPT keeps generating until it hits another request, and the process repeats until generation is done.
