endless_sea_of_stars

endless_sea_of_stars t1_jdskiit wrote

The advantage of in-context learning is that adding and removing plug-ins is trivial.

Training with the plug-ins is more powerful, but you can't easily add or remove them after the fact. In theory, training against APIs should also yield a smaller model, since the base model no longer needs to memorize math or trivia.
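
A rough sketch of that difference: with in-context learning, "installing" a plug-in just means adding its description to the prompt, so enabling or disabling one is a one-line change. The plug-in names and descriptions below are invented for illustration.

```python
# "Installing" an in-context plug-in is just editing this dict.
# Plug-in names and descriptions are invented for illustration.
PLUGINS = {
    "calculator": "calculator(expression) -> result of evaluating the arithmetic.",
    "wolfram": "wolfram(query) -> answer from a math/science engine.",
}

def build_system_prompt(enabled):
    """Prepend the enabled plug-in descriptions to the model's instructions."""
    lines = ["You can call the following tools:"]
    lines += [PLUGINS[name] for name in enabled]
    return "\n".join(lines)

# Adding or removing a plug-in is a one-line change; no retraining needed.
with_both = build_system_prompt(["calculator", "wolfram"])
with_one = build_system_prompt(["calculator"])
```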

2

endless_sea_of_stars t1_jdhrar6 wrote

Reply to comment by Izzhov in [N] ChatGPT plugins by Singularian2501

Sort of. The default retrieval plug-in is more of a database lookup. It converts a question into an embedding vector (via the Ada API) and uses that to query a self-hosted vector database. The base version is geared toward question/answer scenarios.
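
Roughly what that lookup looks like, assuming the pre-1.0 OpenAI Python client; `vector_store.search` is a stand-in for whatever self-hosted vector database the plug-in is configured with.

```python
import openai  # pre-1.0 style OpenAI client

def retrieve(question, vector_store, top_k=3):
    # Embed the question with the Ada embeddings model.
    resp = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=question,
    )
    query_vector = resp["data"][0]["embedding"]

    # Query the self-hosted vector database for the closest chunks.
    # `vector_store.search` is a placeholder for the configured store's API.
    return vector_store.search(query_vector, top_k=top_k)
```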

That being said, I'm sure someone is already working on a novel-generator plug-in that would be better tailored to your use case.

1

endless_sea_of_stars t1_jdexqz3 wrote

  1. This massively increases the utility of ChatGPT. You can have it order food. You can have it query your data without paying for fine-tuning.

  2. This smooths over some of the base models' shortcomings. It can now call Wolfram for computations. It can look up facts instead of making them up.

35

endless_sea_of_stars t1_jde88qi wrote

Wonder how this compares to the Toolformer implementation.

https://arxiv.org/abs/2302.04761

Their technique was to use few-shot (in-context) learning to annotate a dataset with API calls. They took the annotated dataset and used it to fine-tune the model. During inference, the code would detect the API call, make the call, append the result to the text, and keep generating.
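
A rough sketch of that inference loop. The `model.generate` call, the bracketed call syntax, and the tool registry are stand-ins, not the paper's actual implementation.

```python
import re

# Hypothetical tool registry; the paper's tools include a calculator,
# a QA system, search, translation, and a calendar.
TOOLS = {"Calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

CALL_PATTERN = re.compile(r"\[(\w+)\((.*?)\)\]")

def generate_with_tools(model, prompt, max_rounds=5):
    text = prompt
    pos = len(prompt)  # only scan newly generated text for API calls
    for _ in range(max_rounds):
        text += model.generate(text)  # stand-in for the fine-tuned LM's decode step
        match = CALL_PATTERN.search(text, pos)
        if match is None:
            break  # no API call detected; generation is done
        tool, args = match.group(1), match.group(2)
        result = TOOLS.get(tool, lambda _: "")(args)
        # Splice the API result in right after the call, then keep generating.
        insertion = f" -> {result}"
        text = text[: match.end()] + insertion + text[match.end():]
        pos = match.end() + len(insertion)
    return text
```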

The limitation of that methodology is that you have to fine-tune the model for each new API. Wonder what OpenAI's approach is?

Edit:

I read through the documentation. Looks like it is done through in-context learning, as in they just prepend the API's description to your call and let the model figure it out. That also means you get charged for the tokens used in the API description, and those tokens count against the context window. Unclear whether any fine-tuning was done on the model to better support APIs, or if they are just relying on the base model's capabilities.
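
In other words, something along these lines; the plug-in spec and tool name are invented for illustration, and the chat call uses the pre-1.0 OpenAI Python client.

```python
import openai  # pre-1.0 style OpenAI client

# Invented plug-in description, roughly in the style the docs describe.
PLUGIN_SPEC = (
    "Tool: todo_api\n"
    "GET /todos -> list the user's todo items\n"
    "POST /todos {text} -> add a todo item\n"
)

messages = [
    # The plug-in description is simply prepended as context...
    {"role": "system", "content": "You can use this tool:\n" + PLUGIN_SPEC},
    {"role": "user", "content": "Add 'buy milk' to my list."},
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
# ...so its tokens are billed and count against the context window.
print(response["usage"]["prompt_tokens"])
```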

54

endless_sea_of_stars t1_jbmda5p wrote

> develop a method to separate knowledge retention and language pattern modeling. Think about learning the state capitals. A person quickly learns to say "the capital of X is Y" and then can substitute in different memorized facts. AI learns the facts and the sentence patterns all in the same manner.

This sounds like a problem Toolformer is supposed to address. Instead of memorizing all the state capitals, the model learns to emit a call: "The capital of Indiana is [QA(Indiana, capital)]."
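
A toy sketch of that split: the model only has to produce the sentence pattern with the `[QA(...)]` marker, and a separate lookup fills in the fact. The fact store and marker syntax here are illustrative, not the paper's exact format.

```python
import re

# Toy fact store standing in for the QA tool.
FACTS = {("Indiana", "capital"): "Indianapolis"}

def qa(entity, relation):
    return FACTS.get((entity, relation), "unknown")

def resolve_qa_calls(text):
    # Replace each [QA(entity, relation)] marker with the looked-up fact,
    # so the model only learns the sentence pattern, not the facts themselves.
    return re.sub(
        r"\[QA\((\w+),\s*(\w+)\)\]",
        lambda m: qa(m.group(1), m.group(2)),
        text,
    )

print(resolve_qa_calls("The capital of Indiana is [QA(Indiana, capital)]."))
# -> The capital of Indiana is Indianapolis.
```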

1

endless_sea_of_stars t1_j8ik6a6 wrote

Meta released a paper about Toolformer (yeah, probably need to workshop that name), which lets LLMs call out to APIs like a calculator. So instead of learning how to compute a square root, the model would simply call a calculator.

This is a pretty big deal but hasn't gotten much attention yet.

7

endless_sea_of_stars t1_j858dvn wrote

> abstract is meant is often a bit clickbaity.

Had a vision of a nightmare future where papers are written in clickbait fashion.

Top Ten Shocking Properties of Positive Solutions of Higher Order Differential Equations and Their Astounding Applications in Oscillation Theory. You won't believe number 7!

78