Comments


Dankbubbles123 t1_j9hb2j2 wrote

Eh, doesn’t GPT-4 have like 1.4 trillion parameters? That would dwarf this by almost 5 times.

Edit: turns out, I was wrong! :D

−20

GPT-5entient t1_j9hk7td wrote

32k tokens would mean approximately 150 kB of text. That’s a decent-sized code base! Also, with this much context memory the known context-saving tricks would work much better, so this could theoretically be used to create code bases of virtually unlimited size.
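A quick back-of-the-envelope check on that figure (assuming roughly 4-5 bytes of English text per token; ~4.7 is a commonly cited average, and real tokenizers vary):

```python
# Rough estimate: how much raw text fits in a 32k-token context window,
# assuming ~4.7 characters (bytes) per token for typical English text.
TOKENS = 32_000
BYTES_PER_TOKEN = 4.7  # assumption; code tends to pack fewer chars per token

approx_kb = TOKENS * BYTES_PER_TOKEN / 1000
print(f"~{approx_kb:.0f} kB of text")
```

At ~4 bytes per token you'd get closer to 130 kB, so "approximately 150 kB" is in the right ballpark either way.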

This amazes me and (being a software dev) also scares me...

But, as they say, what a time to be alive!

16

GPT-5entient t1_j9hkupz wrote

There was that very popular but completely unfounded rumor about 100T param count. It was debunked by Sam Altman himself.

If you think about it for just one second, you'd realize that a 100T-param model would need at least 200 TB of VRAM, or 2,560 Nvidia A100s...
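That arithmetic sketched out, assuming fp16 weights (2 bytes per parameter) and 80 GB per A100, counting weights only:

```python
# Rough VRAM estimate for serving a hypothetical 100T-parameter model in fp16.
# Weights only -- activations, KV cache, etc. would push this even higher.
PARAMS = 100e12          # 100 trillion parameters
BYTES_PER_PARAM = 2      # fp16
A100_VRAM_BYTES = 80e9   # 80 GB per Nvidia A100

total_bytes = PARAMS * BYTES_PER_PARAM
print(f"{total_bytes / 1e12:.0f} TB of VRAM")        # 200 TB
print(f"{total_bytes / A100_VRAM_BYTES:.0f} A100s")  # 2500
```

This gives ~2,500 GPUs with decimal units; the 2,560 figure above falls out if you treat the 200 TB as binary (200 × 1024 GB ÷ 80 GB). Either way, it's an absurd amount of hardware.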

1

Midnight-Movie t1_j9hv0t7 wrote

>Is anything about gpt4 known? It seems like just a bunch of rumors and not even a release date

I work with someone who has Beta access to GPT-4. He won't tell me much other than it's mind-blowing & that software development will never be the same. He confirms the rumors that it indeed can write an entire piece of software.

6

drekmonger t1_j9hvs1w wrote

Number of parameters is not the whole story. Quality of training material and training time and training techniques matter as much or more.

The larger models require more resources for inference, as well. I'd be more impressed by a model smaller than GPT-3 that performed just as well.

110

ninadpathak t1_j9hxyog wrote

True, we've seen models a tad bigger than GPT-3 that are so bad even GPT-2 would blow them out of the water.

Think AI21's Jurassic, or whatever they call their largest model. I hate how stupid it is.

6

challengethegods t1_j9i1lk3 wrote

>I'd be more impressed by a model smaller than GPT-3 that performed just as well.

From the article: "Aleph Alpha’s model is on par with OpenAI’s GPT-3 davinci model, despite having fewer parameters." So... you're saying you would be even more impressed if it used even fewer parameters? Anyway, I think anyone could guess GPT-3 is poorly optimized, so it shouldn't be surprising that plenty of models have matched its performance on some benchmarks with fewer parameters.

13

Ylsid t1_j9i9tr4 wrote

Great, and is it open source?

15

drekmonger t1_j9iios3 wrote

Heh. I tried their rationalization step with ChatGPT, just with prompting. For their question about the fries and crackers, it said the problem is flawed, because there's such a thing as crackers with low or no salt. It also correctly inferred that fries are usually salted, but don't have to be. (Of course, it didn't have the picture to go by, which was the point of the research.)

Great paper though. Thanks for sharing.

8

amplex1337 t1_j9ir5dt wrote

You know, the closer we get to AGI, the more that will happen. Every government will want to be the first to control an ASI, which would basically make them the dominant superpower of the world. It will be as dystopian as it sounds.

2

Kafke t1_j9is0o6 wrote

Paywalled, though, and likely just as censored. It's also currently not available. So... who cares?

7

Destiny_Knight t1_j9iupzk wrote

What the actual fuck is that paper? The thing performed better than a human at several different question classes.

At fucking less than one billion parameters. 100x fewer than GPT-3.5.

Edit: For clarity, I am impressed not angry lol.

12

ironborn123 t1_j9j2512 wrote

All else being equal, the number of model parameters does matter. Well-funded startups can acquire the needed data, compute resources, and human talent to build the models. Just like OpenAI beat Google at this game.

1

IluvBsissa t1_j9j5rld wrote

Germany saving Europe again! No wait...

3

ddeeppiixx t1_j9j9d79 wrote

Of course not. Unless the research is done within a university context (or is publicly funded), you won't get the model open source. SD is maybe the exception, and it seems to me like they regret releasing it and are now doing whatever they can to regain control.

8

ddeeppiixx t1_j9jav1p wrote

First they tried to take control of the DF subreddit (Source). Apparently it was solved on good terms.

Also, newer versions are much more controlled in terms of what you can generate: no more NSFW allowed, no more "famous artists" based models. There were also rumors about new license terms (not sure if they actually happened) that essentially give them the legal power to force users to update to a newer version (as crazy as it sounds). There's a reason the community is still using the 1.5 version over the 2.0 version.

Honestly, the way I see it, Stability AI aren't doing this with bad intentions (at least I hope), and are kind of forced into it, since they're a legal entity and have to address all the threats of legislative action regarding explicit sexual content and living artists.

7

No_Ninja3309_NoNoYes t1_j9jtiy0 wrote

Static parameters are meaningless. Human brains are not static until after death. Besides, modeling reality requires more than a bit of algebra.

1

Villad_rock t1_j9k4oz6 wrote

I'm from Germany, and I know Germany is incompetent at anything related to IT; it's all about the old economy. Don't get your hopes up.

0

beachmike t1_j9k5205 wrote

There will be both good and bad as we get closer to AGI and then attain it, just like any other technological revolution. To paint it as "either" dystopian or utopian is naive.

1

datsmamail12 t1_j9klf3w wrote

I'm going to ask a genuine question, because no one has ever given me a clear answer: when will these language models start to be useful?

0

kermunnist t1_j9kqsaw wrote

That's because the smaller models are less useful. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same data as GPT-3, they would perform that much worse on these metrics (maybe not exactly, because in this case the model was multimodal, which definitely gave it a performance advantage). The big reason this model performed so well is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. This means this model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.

5

Artanthos t1_j9l3i0n wrote

It depends. We cannot see the other side of a singularity.

We could have an alignment issue and end up as paper clips.

AI could solve everything from climate change to social inequality by reducing the human race to 50 million Stone Age hunter-gatherers.

Or, you could have the top 1% living in a utopia while everyone else is living in a dystopia.

1

Ortus14 t1_j9lhcci wrote

The singularity is approaching fast.

People might not realize that a sufficiently advanced LLM can simulate AI researchers and programmers. For example: "simulate a thousand of the top AI researchers discussing and then programming an AGI".

3

Ylsid t1_j9mil4h wrote

I didn't know China was doing it too! I know Russia recently open-sourced one. If their tactic is to undermine Western power by open-sourcing their next big products, they can keep doing it.

1