FermiAnyon t1_jegiycj wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Pretty neat stuff. Fits well with the conversation we were having. I guess a salient question is how large an embedding space you need before performance on any given task plateaus.
Except that they're not random vectors in the original context.
FermiAnyon t1_jegh3hd wrote
Reply to comment by monks-cat in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
In this case, I'm using a fuzzy word "concept" to refer to anything that's differentiable from another thing. That includes things like context and semantics and whether a word is polysemantic and even whether things fit a rhyme scheme. Basically anything observable.
But again, I'm shooting from the hip
FermiAnyon t1_jegfn2j wrote
Reply to comment by Ricenaros in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
I think you should spend more time talking because you've lost me and I don't know what we're talking about. My point has nothing to do with this. Is this a new conversation?
FermiAnyon t1_jeetqrm wrote
Reply to comment by derpderp3200 in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
I did say "basically". The point is it's finite and then we do lots of filtering and interpreting. But based on those inputs, we develop some kind of representation of the world and how we do that is completely mysterious to me, but I heard someone mention that maybe we use our senses to kind of "fact check" each other to develop more accurate models of our surroundings.
I figure multimodal models are really going to be interesting...
FermiAnyon t1_jee3nc1 wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
What did you prompt it with? And what do you think of its answer?
FermiAnyon t1_jee34lx wrote
Reply to comment by mattsverstaps in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Glad you're here. This would be a really interesting chat for like a bar or a meetup or stunting ;)
But yeah, I'm just giving my impressions. I don't want to make any claims of authority or anything as I'm self taught with this stuff...
But yeah, I have no idea how our brains do it, but when you're building a model, whether it's a neural net or you're just factoring a matrix, you'll end up with a high-dimensional representation that gets used as an input to another layer or just gets used straight away for classification. It may be overly broad, but I think of all of those high-dimensional representations as embeddings, and the dimensionality available for encoding an embedding as the embedding space.
Like if you were into sports and you organized your room so that distance represents how related pieces of equipment are. Maybe the baseball is right next to the softball, and the tennis racket is close to the table tennis paddle but a little farther from the baseball stuff. Then you've got some golf clubs kind of in one area of the room, because all of those involve hitting things with another thing. And your kite flying stuff, your fishing stuff, and your street luge stuff end up about as far from everything else as possible, because it's not obvious (to me anyway) that they're related. Your room is a two-dimensional embedding space.
When models do it, they just do it with more dimensions and more concepts, but they learn where to put things so that the relationships are properly represented and they just learn all that from lots of cleverly crafted examples.
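To make the analogy concrete, here's a minimal sketch of the room-as-embedding-space idea. The items and coordinates are made up purely for illustration; the only point is that Euclidean distance stands in for how related two things are.

```python
import numpy as np

# Made-up 2D "room" coordinates; the numbers only illustrate the idea
# that distance stands in for how related two items are.
room = {
    "baseball":     np.array([1.0, 1.0]),
    "softball":     np.array([1.2, 1.1]),
    "tennis":       np.array([2.0, 1.5]),
    "table_tennis": np.array([2.2, 1.6]),
    "golf_clubs":   np.array([3.0, 0.5]),
    "kite":         np.array([8.0, 8.0]),
    "fishing_rod":  np.array([0.5, 9.0]),
}

def distance(a, b):
    # Euclidean distance between two items in the room
    return float(np.linalg.norm(room[a] - room[b]))

print(distance("baseball", "softball"))   # small: closely related
print(distance("baseball", "kite"))       # large: basically unrelated
```

A real model does the same thing, just with many more dimensions and with the coordinates learned from data instead of placed by hand.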
FermiAnyon t1_jee0e86 wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Sounds legit :)
FermiAnyon t1_jee03oc wrote
Reply to comment by turnip_burrito in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
My pretty tenuous grasp of the idea makes me think of stuff like: if you're measuring Euclidean distance or cosine similarity between two points that represent completely unrelated concepts, what would that distance or that angle be? Ideally, every pair of completely unrelated things, compared pairwise, would have that same distance or angle, and the embedding space would be large enough to accommodate that. It sounds to me like a kind of limit property that may only be possible to approximate, because there are lots of ideas and only so many dimensions to fit them in...
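To make that limit property concrete: in d dimensions only d directions can be exactly orthogonal, but random unit vectors get closer and closer to mutually orthogonal as the dimension grows. This is just a toy sketch with random vectors (not real concept embeddings), but it shows why a bigger space makes "all unrelated things are equidistant" easier to approximate:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_cosine(n_vectors, dim):
    # Average |cosine similarity| over all distinct pairs of random unit vectors
    v = rng.normal(size=(n_vectors, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    sims = v @ v.T
    off_diag = sims[~np.eye(n_vectors, dtype=bool)]
    return float(np.abs(off_diag).mean())

# Far more "concepts" than dimensions: they can't all be exactly orthogonal,
# but the higher the dimension, the closer they get to mutual ~90 degree angles.
for dim in (2, 16, 128, 1024):
    print(dim, mean_abs_cosine(n_vectors=500, dim=dim))
```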
FermiAnyon t1_jebpsg3 wrote
Reply to comment by mattsverstaps in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Yeah, isotropic as in being the same in all directions. We're probably all familiar with embedding space and the fact that the positional relationships between concepts in embedding space basically encode information about those relationships. Isotropy in language models refers to the extent to which concepts that are actually unrelated also appear unrelated in embedding space.
In other words, a model without this property might have an embedding space that isn't large enough. You're still teaching it things, so you end up cramming concepts into an embedding space that's too small. Unrelated concepts are no longer equidistant from other unrelated concepts, which implies relationships that don't really exist, and the result is that the language model confuses things that shouldn't be confused.
Case in point: I asked chatgpt to give me an example build order for terrans in Broodwar and it proceeded to give me a reasonable sounding build order, except that it was mixing in units from Starcraft 2. Now no human familiar with the games would confuse units like that. I chalk that up to a lack of relevant training data, possibly mixed with an embedding space that's not large enough for the model to be isotropic.
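One rough way to eyeball the kind of cramming I'm describing is to take whatever (n_concepts, dim) embedding matrix a model gives you and look at the average pairwise cosine similarity. This is only a crude proxy, and the function name and setup here are my own, not anything from a particular library:

```python
import numpy as np

def mean_pairwise_cosine(embeddings):
    # embeddings: (n_concepts, dim) array pulled from whatever model you're probing.
    # A value near 0 suggests unrelated directions are spread fairly evenly (more isotropic);
    # a large value suggests everything is crammed into a narrow cone.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit.T
    off_diag = sims[~np.eye(len(unit), dtype=bool)]
    return float(off_diag.mean())
```

If that number sits well above zero, most directions are bunched together, which is the anisotropic, "too crammed" situation I mean.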
That's my take anyway. I'm still learning ;) please someone chime in and fact check me :D
FermiAnyon t1_jeauumy wrote
This topic in general is super interesting...
So the big difference between humans and these large transformers, on paper, is that humans learn to model things in their environment, whether it's tools or people or whatever, and it's on that basis that we use analogy and make predictions about things. But we ultimately interact through a small number of inputs, basically our five senses... so the thing I find super interesting is whether these models, even ones that only interact with text, are learning to model just the text itself, or whether they're actually learning models of the underlying things, which, with more data/compute, would let them model even more.
I guess the question at hand is whether this ability to model things and make analogies and abstract things is some totally separate process that we haven't started working with yet, or whether it's an emergent property of just having enough weights to basically be properly isotropic with regard to the actual complexity of the world we live in
FermiAnyon t1_je9ygm5 wrote
Yeah, I've had bilingual conversations with chatgpt.
Also, ask it a question and when it gives you a few paragraphs, say "can you rewrite that in Japanese" or whatever you speak.
Pretty cool
FermiAnyon t1_jduvxst wrote
Reply to comment by Matthew2229 in Have deepfakes become so realistic that they can fool people into thinking they are genuine? [D] by [deleted]
Ooh, text... that's a really good point. Okay okay. I'm happy to jog that back then.
FermiAnyon t1_jdtvv5w wrote
Reply to comment by kduyehj in Have deepfakes become so realistic that they can fool people into thinking they are genuine? [D] by [deleted]
Yeah, I'm not gonna hang my hat on a year. The most interesting and significant part about all this is that nobody seems to disagree with the claim that it's going to happen eventually and I just find that kind of amazing that we're messing with AI and having this conversation at all. I couldn't have imagined anything like this, well, like you said... 15 years ago.
Who knows what'll happen in the next 15
FermiAnyon t1_jdswreq wrote
Reply to Have deepfakes become so realistic that they can fool people into thinking they are genuine? [D] by [deleted]
Yeah, even if it's not literally the case now, give it another year or two. I reckon video evidence in court has maybe another decade of legs.
FermiAnyon t1_jeh5jr0 wrote
Reply to comment by Ricenaros in [D] Turns out, Othello-GPT does have a world model. by Desi___Gigachad
Yeah, I was just saying it's a limited number and that the specific number isn't important. The important thing is that there's a limited number. That doesn't imply anything about infinity except that infinity is off the table as an option.