endless_sea_of_stars t1_jdtdiar wrote
Reply to comment by light24bulbs in [P] Using ChatGPT plugins with LLaMA by balthierwings
Here is what we know about OpenAI's plug-ins: a compact API description gets prepended to the prompt, i.e. it works in context. Technically it is few-shot learning, depending on which definition you use. We don't know what fine-tuning, if any, they did to the model to get plug-ins working.
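Roughly like this (a made-up sketch; the plug-in name, endpoints, and format are invented for illustration, not OpenAI's actual format):

```python
# Hypothetical sketch of in-context plug-in prompting. The plug-in
# description and endpoints are invented; OpenAI's real format differs.
PLUGIN_SPEC = """You can call the TodoPlugin API:
  GET /todos -> returns the user's todo list as JSON
  POST /todos {"item": str} -> adds an item
When the user asks about todos, respond with the appropriate call."""

def build_prompt(user_message: str) -> str:
    # The compact API description is simply prepended to the prompt,
    # so the model picks up the tool in context, with no fine-tuning.
    return f"{PLUGIN_SPEC}\n\nUser: {user_message}\nAssistant:"

print(build_prompt("Add 'buy milk' to my list."))
```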
endless_sea_of_stars t1_jdskiit wrote
Reply to comment by light24bulbs in [P] Using ChatGPT plugins with LLaMA by balthierwings
The advantage of in-context learning is that it is trivial to add and remove plug-ins.
Training with the plug-ins is more powerful, but then you can't easily add or remove them. In theory, training with APIs should also result in a smaller model, since the main model no longer needs to memorize math or trivia.
endless_sea_of_stars t1_jdhrar6 wrote
Reply to comment by Izzhov in [N] ChatGPT plugins by Singularian2501
Sort of. The default retrieval plug-in is more of a database lookup. It converts a question into an embedding vector (via the Ada API) and uses that to query a self-hosted vector database (sketched below). The base version is more for question/answer scenarios.
That being said, I'm sure someone is already working on a novel-generator plug-in that would be more tailored to your use case.
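A minimal sketch of that flow, using the 2023-era `openai` Python client; the `vector_db` object is a hypothetical stand-in for whatever self-hosted database you run:

```python
import openai  # 2023-era client; assumes OPENAI_API_KEY is set

def retrieve(question: str, vector_db, top_k: int = 3):
    # Embed the question with the Ada embeddings model.
    resp = openai.Embedding.create(
        model="text-embedding-ada-002",
        input=question,
    )
    embedding = resp["data"][0]["embedding"]  # a 1536-dim float vector
    # `vector_db` is a hypothetical stand-in for your self-hosted
    # database (Pinecone, Weaviate, etc.); query for the nearest chunks.
    return vector_db.query(embedding, top_k=top_k)
```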
endless_sea_of_stars t1_jdg0ouh wrote
Reply to comment by ZenDragon in [N] ChatGPT plugins by Singularian2501
I realize the Wolfram plug-in already has a leg up: the base model has been trained on the Wolfram language and its documentation, so it doesn't have to rely entirely on in-context learning.
endless_sea_of_stars t1_jdfttsl wrote
Reply to comment by GrowFreeFood in [N] ChatGPT plugins by Singularian2501
In computer science terms, a plug-in is a way to add functionality to an app without changing its core code. A mod for Minecraft is a type of plug-in.
For ChatGPT, a plug-in is a way to call programs that live outside its servers.
endless_sea_of_stars t1_jdfqgjz wrote
Reply to comment by GrowFreeFood in [N] ChatGPT plugins by Singularian2501
Read the link at the top of the thread.
endless_sea_of_stars t1_jdezatt wrote
Reply to comment by iamspro in [N] ChatGPT plugins by Singularian2501
I suspect future versions will do both: they will "bake in" some basic APIs (simple calculator, calendar, fact lookups) and use in-context learning for third-party APIs.
endless_sea_of_stars t1_jdexqz3 wrote
Reply to comment by Puzzleheaded_Acadia1 in [N] ChatGPT plugins by Singularian2501
- This massively increases the utility of ChatGPT. You can have it order food. You can have it query your own data without paying for fine-tuning.
- This smooths over some of the base model's shortcomings. It can now call Wolfram for computations. It can look up facts instead of making them up.
endless_sea_of_stars t1_jde88qi wrote
Reply to [N] ChatGPT plugins by Singularian2501
Wonder how this compares to the Toolformer implementation.
https://arxiv.org/abs/2302.04761
Their technique was to use few-shot (in-context) learning to annotate a dataset with API calls, then fine-tune the model on that annotated dataset. During inference, the code detects the API call, makes the call, appends the result to the text, and keeps generating, roughly as in the sketch below.
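A rough sketch of that inference loop (`model_generate` and `call_api` are hypothetical stand-ins, not code from the paper; the `[Tool(args)]` syntax mirrors the paper's examples):

```python
import re

# Matches Toolformer-style calls like "[Calculator(2+2)]".
CALL = re.compile(r"\[(\w+)\((.*?)\)\]")

def toolformer_decode(model_generate, call_api, prompt: str) -> str:
    text = model_generate(prompt)  # may contain e.g. "[QA(Indiana, capital)]"
    while (match := CALL.search(text)):
        result = call_api(match.group(1), match.group(2))
        # Splice the tool's answer in place of the call, then let the
        # model keep generating from the augmented text.
        text = text[:match.start()] + result + text[match.end():]
        text = model_generate(text)
    return text
```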
The limitation of that methodology is that you have to fine-tune the model for each new API. Wonder what OpenAI's approach is?
Edit:
I read through the documentation. Looks like it is done through in-context learning: they just prepend the API's description to your call and let the model figure it out. That also means you get charged for the tokens used in the API description, and those tokens count against the context window. It's unclear whether any fine-tuning was done on the model to better support APIs or whether they are just using the base model's capabilities.
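You can estimate the overhead with OpenAI's tiktoken tokenizer (the API description here is invented for illustration):

```python
import tiktoken  # OpenAI's open source tokenizer

# The plug-in description is ordinary prompt text, so it is tokenized,
# billed, and counted against the context window like everything else.
api_description = (
    "TodoPlugin: GET /todos returns the user's todo list as JSON; "
    "POST /todos adds an item."
)
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print(len(enc.encode(api_description)), "tokens spent before the user types a word")
```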
endless_sea_of_stars t1_jcrv26g wrote
Reply to comment by redpandabear77 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
Outside of criticizing government or religion, can you name a topic that is illegal anywhere?
endless_sea_of_stars t1_jbmda5p wrote
Reply to comment by bivouac0 in [D] Why are so many tokens needed to train large language models? by blacklemon67
> develop a method to separate knowledge retention and language pattern modeling. Think about learning the state capitals. A person quickly learns to say "the capital of X is Y" and then can substitute in different memorized facts. AI learns the facts and the sentence patterns all in the same manner.
This sounds like a problem Toolformer is supposed to address. Instead of learning all the state capitals, the model learns to make a call: "The capital of Indiana is [QA(Indiana, capital)]."
endless_sea_of_stars t1_j8ik6a6 wrote
Reply to comment by Lionfyst in ChatGPT Passed a Major Medical Exam, but Just Barely | Researchers say ChatGPT is the first AI to receive a passing score for the U.S. Medical Licensing Exam, but it's still bad at math. by chrisdh79
Meta released a paper about Toolformer (yeah, they probably need to workshop that name), which allows LLMs to call out to APIs like a calculator. So instead of learning how to calculate a square root, the model simply calls a calculator.
This is a pretty big deal but hasn't gotten a lot of attention yet.
endless_sea_of_stars t1_j858dvn wrote
Reply to comment by Iunaml in [P] Introducing arxivGPT: chrome extension that summarizes arxived research papers using chatGPT by _sshin_
> abstract is meant is often a bit clickbaity.
Had a vision of a nightmare future where papers are written in click bait fashion.
Top Ten Shocking Properties of Positive Solutions of Higher Order Differential Equations and Their Astounding Applications in Oscillation Theory. You won't believe number 7!
endless_sea_of_stars t1_j627a9m wrote
Reply to comment by DigThatData in [R] Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers by currentscurrents
Just rent out an AWS region for a month and you'll be good to go. Hold a couple bake sales to defray the cost.
endless_sea_of_stars t1_jdtik00 wrote
Reply to comment by light24bulbs in [P] Using ChatGPT plugins with LLaMA by balthierwings
It is mostly public information. The API developer is required to provide a specification document that describes the API, and that spec gets injected into the prompt. OpenAI may transform it from JSON into something the model understands better, and may also inject some other boilerplate text.
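Something like this, hypothetically (OpenAI hasn't published the actual transformation; this just illustrates the kind of JSON-to-text flattening involved):

```python
import json

def spec_to_prompt(openapi_json: str) -> str:
    # Hypothetical sketch: condense an OpenAPI spec into compact prompt
    # text. Not OpenAI's real logic, just the general idea.
    spec = json.loads(openapi_json)
    info = spec.get("info", {})
    lines = [f"API: {info.get('title', '')} - {info.get('description', '')}"]
    for path, methods in spec.get("paths", {}).items():
        for verb, op in methods.items():
            lines.append(f"  {verb.upper()} {path}: {op.get('summary', '')}")
    return "\n".join(lines)
```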