MysteryInc152 OP t1_j93saow wrote on February 19, 2023 at 12:49 AM

#1,864,272

In-context learning is overpowered lol. Really is implicit finetuning.

This is the prompt template

>You are an AI assistant with several tools available to you. The tools are the following:
>
>MATH: This allows you to evaluate mathematical expressions using the math.js library.
>
>NOW: This returns the current date and time. You must not pass any arguments to this tool!
>
>SEARCH: This returns a web search result for the given string argument.
>
>DO NOT USE TOOLS WITHIN TOOLS! KEEP ALL TOOL CALLS SEPARATE FROM EACH OTHER!
>
>User: What is 10 times 14?
>
>Assistant: 10 * 14 is [MATH(10 * 14) -> 140] 140.
>
>User: How old is Donald Trump?
>
>Assistant: Donald Trump is [SEARCH(Donald Trump age) -> 76] 76 years old.
>
>User: What day is it?
>
>Assistant: It is [NOW() -> 13/02/2023, 20:01:35] the 13th of February 2023.
>
>User: Has Avatar the way of water been released yet?
>
>Assistant: [SEARCH(Avatar the way of water release date) -> 22.11.2022] Avatar: The way of water was released on the 22nd of november 2022. Today is [NOW() -> DATE_TODAY] the 13th of February 2023. Therefore, [MATH(2023 > 2022) -> true] it was released last year.
>
>User: Where was the inventor of the light bulb born, and what is the last country his home country was at war with?
>
>Assistant: Thomas Edison, the inventor of the lightbulb, was born in [SEARCH(Thomas Edison birthplace) -> Milan, Ohio] Milan, Ohio. The last country the United States was at war with was [SEARCH(last country US at war with) -> Iraq] Iraq.
>
>User: USER_INPUT
>
>Assistant:

blueSGL t1_j94bno5 wrote on February 19, 2023 at 3:26 AM

#1,865,903

Replying to MysteryInc152 (#1,864,272)

Let me see if I get this right.

Toolformerzero is a layer between the LLM and the user.

That layer picks up keywords, performs the search and then returns a predefined chunk formatted from the search results

Then the LLM's prompt is stuffed with that chunk and asked the question again?

and it just works?

MysteryInc152 OP t1_j94ep4b wrote on February 19, 2023 at 3:52 AM

#1,866,108

Replying to blueSGL (#1,865,903)

Yup. That's pretty much it lol

HarryCHK t1_j94vaii wrote on February 19, 2023 at 6:39 AM

#1,867,377

The thread just says those external tools can be a source of memory, so that it will be Turing complete. How does this compare to embedding the memory tape into the architecture itself?

blueSGL t1_j94yv6s wrote on February 19, 2023 at 7:24 AM

#1,867,627

Replying to MysteryInc152 (#1,866,108)

Any idea how they format the search results? Out of all of them, that would seem to be the trickiest. No idea if the Google summary text preview contains the answer, or enough context to get the answer. If it needs to actually go to the website, the tool has no knowledge of how the website will be formatted or how long it is (potential context window issues).

_Minos t1_j95amf3 wrote on February 19, 2023 at 10:05 AM

#1,868,321

Replying to blueSGL (#1,867,627)

Hey, creator of above implementation here.

You're right that there are lots of ways accuracy could feasibly be improved: using more varied APIs, navigating to search results and creating embeddings of the resulting website, etc. Ultimately, a lot of this kind of more advanced chaining of LLM and API requests can be done with libraries like langchain.

For this one, I wanted to show how effective a much simpler approach can be. For search results, I simply chain together the returned Google "snippets" and inject the resulting string back into the prompt. Oftentimes, this means there can actually be conflicting information, for example dates for events adjacent to but ultimately irrelevant to the search query. However, this is where GPT generally does an excellent job of picking out the correct bit of info, so no more sophisticated filtering or parsing by the app is required. I just give the model a raw dump of the search results.
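
The snippet chaining described above can be sketched roughly as follows. This is an illustration under assumptions, not the app's code: the function names are hypothetical, the `max_chars` cut-off is an added guard for the context window concern raised earlier in the thread, and the closing `]` follows the `[TOOL(args) -> result]` format from the prompt template.

```python
def format_search_result(snippets: list[str], max_chars: int = 1000) -> str:
    """Join result snippets into one string, trimmed to a rough length budget."""
    # Drop empty snippets and collapse runs of whitespace inside each one.
    cleaned = [" ".join(s.split()) for s in snippets if s and s.strip()]
    joined = " ".join(cleaned)
    # Hypothetical guard: a hard character cap instead of real token counting.
    return joined[:max_chars]


def inject_result(text_so_far: str, result: str) -> str:
    """Paste a tool result after a pending call such as '[SEARCH(query) ->'."""
    return f"{text_so_far} {result}]"


# Placeholder snippets, not real search output.
snippets = ["First snippet about the query.", "", "  Second snippet,\n possibly off-topic. "]
dump = format_search_result(snippets)
print(inject_result("The answer is [SEARCH(some query) ->", dump))
```

No filtering or ranking happens here; picking the relevant detail out of the dump is left to the model, as the comment describes.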

pyepyepie t1_j95f3m2 wrote on February 19, 2023 at 11:10 AM

#1,868,600

Replying to _Minos (#1,868,321)

I actually think your approach shows the idea better than the original paper. However, the original paper can be implemented with smaller language models, which might be better for people who want to deploy it. Overall, I think the application is almost trivial and I am not surprised it worked well for you (due to the crazy power of LLMs).

Great work!

badabummbadabing t1_j95kmxk wrote on February 19, 2023 at 12:24 PM

#1,868,988

Replying to MysteryInc152 (#1,864,272)

This is absolutely wild.

ilovethrills t1_j95p6p7 wrote on February 19, 2023 at 1:15 PM

#1,869,366

Is this like langchain?

MysteryInc152 OP t1_j95r8ni wrote on February 19, 2023 at 1:36 PM

#1,869,548

Replying to ilovethrills (#1,869,366)

Much simpler approach compared to langchain (and this is self-supervised), but they attempt to do the same thing.

Taenk t1_j95rfg2 wrote on February 19, 2023 at 1:38 PM

#1,869,558

Can you please link the demo without going through twitter? It won’t load for me.

MysteryInc152 OP t1_j95rp8c wrote on February 19, 2023 at 1:40 PM

#1,869,586

Replying to Taenk (#1,869,558)

https://toolformerzero.com/

Professor_Entropy t1_j95txkb wrote on February 19, 2023 at 2:01 PM

#1,869,791

You still can't rely on its results. "What's the volume of 1000 KG of Ice?" doesn't work; the model asks for "1000 * 0.919" instead of "1000 / 0.919".
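
To make the difference concrete, here is the arithmetic for both expressions. It assumes the 0.919 in the comment is the density of ice in kg per litre, so that volume = mass / density; the variable names are only for illustration.

```python
ICE_DENSITY_KG_PER_L = 0.919  # value from the comment above; unit is an assumption

mass_kg = 1000

# Correct: volume = mass / density
volume_l = mass_kg / ICE_DENSITY_KG_PER_L

# What the model asked the MATH tool for: mass * density
wrong_value = mass_kg * ICE_DENSITY_KG_PER_L

print(f"mass / density = {volume_l:.1f} litres")
print(f"mass * density = {wrong_value:.1f} (not a volume)")
```

The division gives roughly 1088 litres, while the multiplication gives about 919. The MATH tool evaluates either expression faithfully, so the error is in choosing the formula rather than in the calculation.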

MysteryInc152 OP t1_j95u3t2 wrote on February 19, 2023 at 2:03 PM

#1,869,812

Replying to Professor_Entropy (#1,869,791)

Seems like something a chain-of-thought example in the pre-prompt would fix, rather than a deficiency in the approach.

Also eliminating arithmetic errors doesn't mean you'd eliminate logical/reasoning errors.

damc4 t1_j96888m wrote on February 19, 2023 at 3:54 PM

#1,871,030

By the way, I created a tool "CodeAssist" ( https://codeassist.tech ) that is based on a similar idea. It's a chatbot that can execute actions in the IDE (most importantly - write/read the code in your editor).

yoshiwaan t1_j96wt2g wrote on February 19, 2023 at 6:44 PM

#1,873,180

Replying to MysteryInc152 (#1,869,548)

Really? As in the order of operations is: token parsing => Toolformer => LLM?

Genuine question, is the text/token parsing for queries to an LLM (eg chatgpt) performed separately and beforehand to the actual LLM being leveraged, or is the text/token parsing a part of the LLM? I figured it was the latter and you couldn’t just insert a tool there

Edit: I think this is a new model for this purpose, rather than reusing an existing LLM (eg ChatGPT) as I first assumed, which makes more sense

Edit 2: I actually read the paper and the LM itself is taught to reach out to tools as a part of its response operations, it’s not something separate

MysteryInc152 OP t1_j96y474 wrote on February 19, 2023 at 6:53 PM

#1,873,274

Replying to yoshiwaan (#1,873,180)

It's not a new model. It's davinci-003.

Basically, the model begins generating. Once it hits an API request, the app receives the request and sends it out, and the result is pasted back into the text, which is sent back to OpenAI to generate again. GPT continues generating until it hits another request, and the process repeats until it's done generating.
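
The loop described above can be sketched like this. It is a simplified illustration under assumptions: `complete` stands in for the real completion API call and is faked here so the example runs offline, generation is assumed to pause right after the `->` of a tool call, and `math_tool` is a toy replacement for math.js. None of these names come from the actual app.

```python
import re
from typing import Callable, Dict

# Matches a call still waiting for its result at the end of the text,
# e.g. "... is [MATH(10 * 14) ->". Brackets are excluded from the argument
# so that earlier, already completed calls are not matched again.
PENDING_CALL = re.compile(r"\[(MATH|NOW|SEARCH)\(([^\[\]]*)\)\s*->\s*$")


def run_with_tools(
    prompt: str,
    complete: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_calls: int = 5,
) -> str:
    """Alternate between generation and tool execution until no call is pending."""
    answer = ""
    calls = 0
    while True:
        answer += complete(prompt + answer)
        match = PENDING_CALL.search(answer)
        if match is None or calls >= max_calls:
            return answer  # finished, or the call budget is used up
        calls += 1
        name, arg = match.group(1), match.group(2)
        result = tools[name](arg)
        answer += f" {result}]"  # paste the result back, then generate again


def fake_complete(text: str) -> str:
    """Scripted stand-in for the model: first a tool call, then the final words."""
    if text.endswith("Assistant:"):
        return " 10 * 14 is [MATH(10 * 14) ->"
    return " 140."


def math_tool(expression: str) -> str:
    """Toy stand-in for math.js: multiplies '*'-separated integers."""
    product = 1
    for factor in expression.split("*"):
        product *= int(factor)
    return str(product)


reply = run_with_tools(
    "User: What is 10 times 14?\n\nAssistant:",
    fake_complete,
    {"MATH": math_tool},
)
print(reply)
```

Because each tool result is appended to the text before the next request, the model sees its own earlier calls and their results as ordinary context, which is why no change to the model itself is needed.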