Comments

drekmonger t1_j9hvs1w wrote

Number of parameters is not the whole story. Quality of training data, training time, and training techniques matter as much or more.

The larger models require more resources for inference, as well. I'd be more impressed by a model smaller than GPT-3 that performed just as well.

110

Hands0L0 t1_j9i277j wrote

I for one welcome competition in the race to AGI

43

xott t1_j9i2zg3 wrote

It's the new Space Race

9

phoenixmusicman t1_j9imtgj wrote

It's not a space race until Governments start pouring massive amounts of their GDP into it.

1

Artanthos t1_j9l2n4a wrote

China is pouring money into AI research.

1

spreadlove5683 t1_j9i58z2 wrote

I sure don't. We need to get it right. Not barrel ahead in an arms race.

9

amplex1337 t1_j9ir5dt wrote

You know, the closer we get to AGI, the more that will happen. Every government will want to be the first in control of an ASI, which would basically make them the dominant superpower of the world. It will be as dystopian as it sounds.

2

beachmike t1_j9k5205 wrote

There will be both good and bad that comes out of it as we get closer to AGI, and attain AGI, just like any other technological revolution. To paint it as "either" dystopian or utopian is naive.

1

Artanthos t1_j9l3i0n wrote

It depends. We cannot see the other side of a singularity.

We could have an alignment issue and end up as paper clips.

AI could solve everything from climate change to social inequality by reducing the human race to 50 million Stone Age hunter gatherers.

Or, you could have the top 1% living in a utopia while everyone else is living in a dystopia.

1

dangeratio t1_j9igzb4 wrote

Check out Amazon’s multimodal chain-of-thought model, only 738 million parameters, and it scores better on all question classes than ChatGPT. See Table 4 on page 7 here - https://arxiv.org/pdf/2302.00923.pdf

20

Destiny_Knight t1_j9iupzk wrote

What the actual fuck is that paper? The thing performed better than a human at several different question classes.

At fucking less than one billion parameters. 100x less than GPT 3.5.

Edit: For clarity, I am impressed not angry lol.

12

IluvBsissa t1_j9j5t08 wrote

Are you angry or impressed ?

3

Destiny_Knight t1_j9j6iq0 wrote

impressed lol

2

IluvBsissa t1_j9j6v5v wrote

If these models are so smol and efficient, why are they not released? I just don't get it. I thought PaLM was kept private because it was too costly to run to be profitable...

3

kermunnist t1_j9kqsaw wrote

That's because the smaller models are less useful. With neural networks (likely including biological ones) there's a hard trade-off between specialized performance and general performance. If these 100+x smaller models were trained on the same data as GPT-3, they would perform 100+x worse on these metrics (maybe not exactly, because in this case the model was multimodal, which definitely gave a performance advantage). The big reason this model performed so much better is that it was fine-tuned on problems similar to the ones on this exam, whereas GPT-3 was fine-tuned on anything and everything. This means that this model would likely not be a great conversationalist and would probably flounder at most other tasks GPT-3.5 does well on.

5

drekmonger t1_j9iios3 wrote

Heh. I tried their rationalization step with ChatGPT, just with prompting. For their question about the fries and crackers, it said the problem is flawed because there's such a thing as crackers with low or no salt. It also correctly inferred that fries are usually salted, but don't have to be. (Of course, it didn't have the picture to go by, which was the point of the research.)

Great paper though. Thanks for sharing.

8

challengethegods t1_j9i1lk3 wrote

>I'd be more impressed by a model smaller than GPT-3 that performed just as well.

From the article: "Aleph Alpha’s model is on par with OpenAI’s GPT-3 davinci model, despite having fewer parameters." So... you're saying you would be even more impressed if it used even fewer parameters? Anyway, I think anyone could guess GPT-3 is poorly optimized, so it shouldn't be surprising to anyone that plenty of models have matched its performance on some benchmarks with fewer parameters.

13

ninadpathak t1_j9hxyog wrote

True, we've seen models a tad bit bigger than GPT-3 which are so bad, even GPT-2 would blow them out of the water.

Think AI21 Jurassic park or whatever they call their largest model. I hate how stupid it is

6

Professional-Song216 t1_j9hwijh wrote

Great way to look at it, it’s much more important to squeeze the maximum out of your system. Efficiency over excess

2

burnt_umber_ciera t1_j9iqusp wrote

Are you aware of either the "training material" or "training time" or "training techniques" utilized?

1

Zer0D0wn83 t1_j9iwxqu wrote

I'm sure they've read those papers too, you know.

1

ironborn123 t1_j9j2512 wrote

All else being equal, number of model parameters does matter. Well funded startups can acquire the needed data, compute resources, and human talent to build the models. Just like how OpenAI beat Google at this game.

1

sonderlingg t1_j9itgq3 wrote

Artificial German Intelligence

29

H0sh1z0r4 t1_j9jjn56 wrote

never ask an AGI what it was doing in 1939

12

Ylsid t1_j9i9tr4 wrote

Great, and is it open source?

15

ddeeppiixx t1_j9j9d79 wrote

Of course not. Unless the research is done within a university context (or publicly funded), you won't have the model open source. SD is maybe the exception, and it seems to me like they regret releasing it and are now doing whatever they can to regain control.

8

needle1 t1_j9j9nyr wrote

Hm? Care to elaborate on what they’re doing to “regain control?”

3

ddeeppiixx t1_j9jav1p wrote

First they tried to take control of the SD subreddit (Source). Apparently it was resolved on good terms.

Also, newer versions are much more controlled in terms of what you can generate. No more NSFW allowed, no more "famous artists" based models. There were also rumors about new license terms (not sure if that actually happened) that would essentially give them the legal power to force users to update to a newer version (as crazy as it sounds). There is a reason the community is still using version 1.5 over 2.0.

Honestly, the way I see it, Stability AI are not doing this with bad intentions (at least I hope), and are kind of forced into it, as they are a legal entity and have to address all the threats of legal action regarding explicit sexual content and living artists.

7

Ylsid t1_j9jamyy wrote

Unfortunate, but I figured. Something's up when the Russians are the only ones releasing LLMs

1

MysteryInc152 t1_j9lj5ef wrote

The GLM models are from China and open sourced.

2

Ylsid t1_j9mil4h wrote

I didn't know China was doing it too! I know Russia recently open sourced one. If their tactic is to undermine western power by open sourcing their next big products, they can keep doing it

1

WeedWacker25 t1_j9jgt2d wrote

I had to do some research about this. I found that the transformer architectures and code are mostly open source, but the trained models are not.

1

Kafke t1_j9is0o6 wrote

Paywalled, though, and likely will be just as censored. It's also currently not available. So... who cares?

7

Private_Island_Saver t1_j9j73ot wrote

Like what would happen if like 4% of global GDP was put into this?

4

IluvBsissa t1_j9j5rld wrote

Germany saving Europe again! No wait...

3

Ortus14 t1_j9lhcci wrote

Singularity is approaching fast.

People might not realize that a sufficiently advanced LLM can simulate AI researchers and programmers. For example: "simulate a thousand of the top AI researchers, discussing and then programming an AGI".

3

Thorusss t1_j9nrlcs wrote

Any sufficiently advanced LLM is indistinguishable from true AGI™?

2

No_Ninja3309_NoNoYes t1_j9jtiy0 wrote

Static parameters are meaningless. Human brains are not static until after death. Besides, modeling reality requires more than a bit of algebra.

1

Villad_rock t1_j9k4oz6 wrote

I'm from Germany, and I know Germany is incompetent at anything related to IT; it's all about the old economy. Don't get your hopes up.

0

Honest_Science t1_j9kx3c4 wrote

That is really a nice one

1

Honest_Science t1_j9kx92l wrote

I tried to use their system on their playground. It takes a lot of prompting to get anything sensible out of it.

2

Thorusss t1_j9nro02 wrote

The German CovidApp was surprisingly solid

1

datsmamail12 t1_j9klf3w wrote

I'm going to ask a genuine question, because no one has ever given me a clear answer: when will these language models actually start to be useful?

0

Thorusss t1_j9nrssl wrote

They are useful now.

Two friends of mine use ChatGPT for work.

2

Gold-and-Glory t1_j9hyg48 wrote

And no bias?

−1

Thorusss t1_j9nrqp1 wrote

Oh, there are many biases.

These neural networks mostly consist of weights and biases.

2

Gold-and-Glory t1_j9o32jz wrote

Not that kind of bias, the other kind that Reddit agrees with and downvotes you for mentioning, like a religious dogma.

0

Akimbo333 t1_j9hutcu wrote

Interesting

−2

Dankbubbles123 t1_j9hb2j2 wrote

Eh, doesn’t GPT-4 have like 1.4 trillion parameters? That would dwarf this by almost 5 times.

Edit: turns out, I was wrong! :D

−20

Buck-Nasty t1_j9hb9az wrote

GPT-4's parameter count isn't known yet.

26

Dankbubbles123 t1_j9hbbp1 wrote

Ah okay, nvm then. Sorry

13

Buck-Nasty t1_j9hch2c wrote

The context window is apparently massive though, more than 10 times the size of GPT-3's; it could potentially write whole novels at that scale.

https://mobile.twitter.com/transitive_bs/status/1628118163874516992?s=46&t=Biiqy66Cy9oPH8c1BL6_JQ

17

hydraofwar t1_j9hgim5 wrote

A credible researcher had commented that ChatGPT can write code, and GPT-4 could write entire programs.

12

GPT-5entient t1_j9hk7td wrote

32k tokens would mean approximately 150 kB of text. That is a decent-sized code base! Also, with this much context memory, the known context-saving tricks would work much better, so this could theoretically be used to create code bases of virtually unlimited size.
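
For anyone wondering where the ~150 kB figure comes from, here's a rough back-of-the-envelope sketch (my own assumptions, not from OpenAI): it just takes an average of roughly 4-5 characters per token, a common rule of thumb for English text with GPT-style tokenizers.

```python
# Rough estimate only: assumes ~4.7 characters per token on average,
# a common rule of thumb for English with GPT-style BPE tokenizers.
CONTEXT_TOKENS = 32_000
CHARS_PER_TOKEN = 4.7            # assumed average; varies a lot with content

approx_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN
approx_kb = approx_chars / 1000  # plain ASCII text: ~1 byte per character

print(f"~{approx_kb:.0f} kB of text")  # -> ~150 kB with these assumptions
```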

This amazes me and (being a software dev) also scares me...

But, as they say, what a time to be alive!

16

GPT-5entient t1_j9hji5i wrote

Wow, yeah, this looks amazing. My biggest issue with GPT-3 is the relatively small context window. This will open so many new possibilities.

7

Practical-Mix-4332 t1_j9hg2cr wrote

Is anything about gpt4 known? It seems like just a bunch of rumors and not even a release date

2

Midnight-Movie t1_j9hv0t7 wrote

>Is anything about gpt4 known? It seems like just a bunch of rumors and not even a release date

I work with someone who has Beta access to GPT-4. He won't tell me much other than it's mind-blowing & that software development will never be the same. He confirms the rumors that it indeed can write an entire piece of software.

6

farcetragedy t1_j9hzfq1 wrote

That’s exciting. Would be amazing if the next one didn’t just make shit up when it doesn’t know the answer

3

Practical-Mix-4332 t1_j9hxkf3 wrote

Oh great another rumor

1

Midnight-Movie t1_j9hy4uz wrote

Well... You asked if anything was known. I gave you info from a coworker with beta access. My apologies if my info didn't come with a bouquet of roses and a handwritten card.

7

Practical-Mix-4332 t1_j9i0ctk wrote

I understand you’re trying to help, but this being Reddit and all there’s no way we can trust what you are saying or take it officially as something “known”. No offense though.

6

GPT-5entient t1_j9hkupz wrote

There was that very popular but completely unfounded rumor about 100T param count. It was debunked by Sam Altman himself.

If you think about it for just one second, you'd realize that a 100T-parameter model would need at least 200 TB of VRAM, or 2,560 Nvidia A100s...
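
The arithmetic behind those numbers, as a rough sketch (assuming fp16 weights at 2 bytes per parameter and 80 GB A100s, ignoring activation and KV-cache memory entirely):

```python
import math

# Back-of-the-envelope check: fp16 weights only, no activation/KV-cache memory.
params = 100e12              # the rumored (and debunked) 100 trillion parameters
bytes_per_param = 2          # fp16
a100_vram_bytes = 80e9       # 80 GB per A100, counting 1 GB = 1e9 bytes

weight_bytes = params * bytes_per_param               # 2e14 bytes = 200 TB
num_gpus = math.ceil(weight_bytes / a100_vram_bytes)  # ~2,500 GPUs

print(f"{weight_bytes / 1e12:.0f} TB of weights, ~{num_gpus} A100s")
# The 2,560 figure above comes out if you count 1 TB as 1,024 GB instead.
```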

1

bass6c t1_j9hfy9w wrote

ChatGPT is based on GPT-3, a 175 billion parameter model.

1