rya794 t1_jdrypjf wrote on March 26, 2023 at 6:51 PM

#2,374,977

Yea, it would be nice.

But what benefit does any LLM provider gain by implementing or adhering to an open protocol? OpenAI is trying to build a moat around their service; from their perspective, plugins are key to establishing a competitive advantage.

I can’t see this happening in reality.

ThirdMover t1_jdrzd7f wrote on March 26, 2023 at 6:55 PM

#2,375,061

Replying to rya794 (#2,374,977)

That depends on how well they will be able to keep their moat. There is a lot of hunger for running LLMs on your own: if not on your own hardware, then at least in software environments you control. People want to see what makes them tick rather than trust "Open"AI's black boxes.

Yeah they have a performance lead but time will tell how well they can stay ahead of the rest of the field trying to catch up.

rya794 t1_jds0xqs wrote on March 26, 2023 at 7:06 PM

#2,375,253

Replying to ThirdMover (#2,375,061)

I don’t think so, I suspect my argument holds no matter who is running the most advanced LLM. The market leader will never have an incentive to open source their “app store”.

The only way this breaks down is if by some miracle, an open source model takes and maintains the lead.

ThirdMover t1_jds1kid wrote on March 26, 2023 at 7:11 PM

#2,375,320

Replying to rya794 (#2,375,253)

The lead may not always be obvious and the trade off from transparency may be worth it. LLMs (or rather "foundation models") will continue to capture more and more areas of competence. If I want one that - for example - forms the front end chat bot to a store I have so that people can ask for product explanations, do I need then the 500 IQ GPT-7 that won two Nobel prizes last year?

I think it's most likely that there will always be black box huge models that form the peak of what is possible with machine intelligence but what people use and interact with in practice will simply be "good enough" smaller and open source models.

mcilrain t1_jds23vc wrote on March 26, 2023 at 7:15 PM

#2,375,381

Replying to rya794 (#2,374,977)

Once competition kicks in AIs are going to be accessing every API accessible on the web by default.

light24bulbs t1_jds3mdl wrote on March 26, 2023 at 7:26 PM

#2,375,567

What's the underlying approach here? Just prompt engineering right?

I really really want to apply the ToolFormer paper to llama. They're both Facebook systems, you can bet they've done it.

ToolFormer just seems like SUCH a good and thorough approach. There are quite a few gaps between the paper and building a working example, IMO, but it's clearly doable.

The way Facebook licensed the weights is frustrating me. We should all be passing around Alpaca-trained, GPTQ-quantized, SparseGPT-optimized, Llama-derived models by now. Is there some Telegram group I need to be in or something?

Dwanyelle t1_jds42hs wrote on March 26, 2023 at 7:29 PM

#2,375,619

Replying to ThirdMover (#2,375,320)

Exactly. It's not "what's the most impressive model possible?". It's "what's the most impressive model possible that can run on $1000 or less of hardware?"

beryugyo619 t1_jds9oz8 wrote on March 26, 2023 at 8:08 PM

#2,376,266

Replying to ThirdMover (#2,375,061)

Yeah, the only advantage they have seems to be just a couple of <500GB sets of model weights in their hands, solely by being the first mover, without much else to back it up.

rya794 t1_jdsev38 wrote on March 26, 2023 at 8:44 PM

#2,376,956

Replying to ThirdMover (#2,375,320)

Yea, I agree with this, but I still don’t see what advantage the state of the art providers receive by adhering to an open protocol. If anything doing so would (on the margin) push users towards open source models when they might have been willing to pay for a more advanced model just to access certain plugins.

That being said, I do think that a standardized approach to a plugin ecosystem will arise. I just think it’s silly to expect any of the foundation model providers to participate.

endless_sea_of_stars t1_jdskiit wrote on March 26, 2023 at 9:24 PM

#2,377,696

Replying to light24bulbs (#2,375,567)

The advantage of in context learning is that it is trivial to add and remove plug-ins.

Training with the plug-ins is more powerful, but you can't easily add or remove them afterwards. In theory, training with APIs should result in a smaller model, as the main model no longer needs to learn math or trivia.
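To make the "trivial to add and remove" point concrete, here is a minimal sketch of the in-context approach. Everything in it is an assumption for illustration: the `PluginRegistry` class, the prompt wording, and the example plugin names are hypothetical and are not taken from OpenAI, LLaMA, or any library. The idea is only that the plugin list lives in the prompt text, so changing it never touches the model weights.

```python
class PluginRegistry:
    """Hypothetical registry: plugins exist only as text placed in the prompt."""

    def __init__(self):
        self._plugins = {}

    def add(self, name, description):
        # Adding a plugin is just storing a short description.
        self._plugins[name] = description

    def remove(self, name):
        # Removing a plugin needs no retraining; it simply drops out of the prompt.
        self._plugins.pop(name, None)

    def build_prompt(self, user_message):
        # Assumed prompt layout: tool list first, then the user's message.
        lines = ["You can call these tools:"]
        for name, desc in sorted(self._plugins.items()):
            lines.append(f"- {name}: {desc}")
        lines.append("")
        lines.append(f"User: {user_message}")
        return "\n".join(lines)


registry = PluginRegistry()
registry.add("calculator", "evaluates arithmetic expressions")
registry.add("weather", "returns a forecast for a city")
print(registry.build_prompt("What is 2+2?"))

registry.remove("weather")  # hot swap: the next prompt no longer mentions it
print(registry.build_prompt("What is 2+2?"))
```

A model trained on tool use, by contrast, has the tool behaviour baked into its weights, which is why swapping tools there is harder.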

Yardanico t1_jdstn3v wrote on March 26, 2023 at 10:32 PM

#2,378,821

Has the author seen https://github.com/hwchase17/langchain? I think this is exactly the problem they're trying to solve.

light24bulbs t1_jdsulyn wrote on March 26, 2023 at 10:39 PM

#2,378,940

Replying to endless_sea_of_stars (#2,377,696)

By "in context learning" I take it you mean zero shot.

Yes, you can hot swap. I'd be unsurprised if what OpenAI did is fine-tune on how to use plugins in general by giving some examples, combined with a little bit of zero-shot primer.

Something trained with ToolFormer's technique and then told it can use a new, but similar, plugin is IMO going to generalize way better than something that's never used a plugin before.

sweatierorc t1_jdszzh4 wrote on March 26, 2023 at 11:20 PM

#2,379,680

Replying to rya794 (#2,375,253)

Firefox did. They only lost to another "open-source" project.

rya794 t1_jdt0dxe wrote on March 26, 2023 at 11:23 PM

#2,379,732

Replying to sweatierorc (#2,379,680)

That’s a really good counter argument. You may have moved me over to the other side.

endless_sea_of_stars t1_jdtdiar wrote on March 27, 2023 at 1:07 AM

#2,381,586

Replying to light24bulbs (#2,378,940)

Here is what we know about OpenAI's plug-ins. A compact API description gets prepended to the prompt (in context). Technically it is few shot, depending on which definitions you use. We don't know what fine-tuning of the model, if any, they did to get plug-ins working.

light24bulbs t1_jdtgrjb wrote on March 27, 2023 at 1:34 AM

#2,382,098

Replying to endless_sea_of_stars (#2,381,586)

Based on how much langchain struggles to use tools and gets confused by them, I'd bet on fine-tuning. I asked a contact to reveal what they're injecting into the prompt, but it's not public information yet so I couldn't get it.

endless_sea_of_stars t1_jdtik00 wrote on March 27, 2023 at 1:49 AM

#2,382,373

Replying to light24bulbs (#2,382,098)

It is mostly public information. The API developer is required to write a specification document that describes the API. This gets injected into the prompt. They may transform it from JSON to something the model understands better. They may also inject some other boilerplate text.
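As a rough illustration of "transform it from JSON to something the model understands better", here is a sketch that flattens a small OpenAPI-style document into one line per endpoint. This is a guess at the general shape only: the `compact_spec` function, the output format, and the `EXAMPLE_SPEC` document are all hypothetical, and nothing here reflects what OpenAI actually injects.

```python
import json

# Hypothetical, heavily trimmed OpenAPI-style spec used only for this example.
EXAMPLE_SPEC = """
{
  "info": {"title": "Todo"},
  "paths": {
    "/todos": {
      "get": {
        "summary": "List todos for a user",
        "parameters": [{"name": "user", "in": "query"}]
      }
    }
  }
}
"""


def compact_spec(spec_json: str) -> str:
    """Flatten a JSON API spec into short text lines suitable for a prompt."""
    spec = json.loads(spec_json)
    lines = [f"API: {spec['info']['title']}"]
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            # Keep only parameter names; types and schemas are dropped for brevity.
            params = ", ".join(p["name"] for p in op.get("parameters", []))
            lines.append(f"{method.upper()} {path}({params}): {op.get('summary', '')}")
    return "\n".join(lines)


print(compact_spec(EXAMPLE_SPEC))
```

The resulting text is much shorter than the raw JSON, which matters because it has to fit in the context window alongside the conversation.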

light24bulbs t1_jdtiq9w wrote on March 27, 2023 at 1:51 AM

#2,382,398

Replying to endless_sea_of_stars (#2,382,373)

I'm aware of that part. The wording of the text that's injected is not public. If it was, I'd use it in my langchain scripts.

Again, I really expect there's fine-tuning. We will see eventually, maybe.

AngusDHelloWorld t1_jdtq232 wrote on March 27, 2023 at 2:54 AM

#2,383,443

Replying to rya794 (#2,379,732)

And not everyone cares about open source. At least for non-technical people, as long as they can get things done, it's good enough for them.

alexmin93 t1_jduoxj4 wrote on March 27, 2023 at 10:04 AM

#2,387,668

Replying to ThirdMover (#2,375,320)

The problem is not the model but the training dataset. That's the thing that costs millions for OpenAI. Alpaca performs rather poorly, mostly due to the fact that it's trained on GPT-3-generated text.

alexmin93 t1_jdup63s wrote on March 27, 2023 at 10:08 AM

#2,387,694

Replying to light24bulbs (#2,382,098)

Do you have GPT-4 API access? Afaik plugins run on GPT-4, which even in its current state is way better at following formal rules. But it's likely that they've indeed fine-tuned it to make decisions about using tools.

light24bulbs t1_jduuuep wrote on March 27, 2023 at 11:19 AM

#2,388,439

Replying to alexmin93 (#2,387,694)

I do, still struggling with it

[deleted] t1_jduysn7 wrote on March 27, 2023 at 12:00 PM

#2,388,998

Replying to light24bulbs (#2,388,439)

[removed]

nuke-from-orbit t1_je7wkvr wrote on March 30, 2023 at 1:27 AM

#2,469,008

Replying to light24bulbs (#2,375,567)

That is exactly what is happening now.

[P] Using ChatGPT plugins with LLaMA
