ThePhantomPhoton t1_jdsyzhn wrote

It’s easier to gauge the effectiveness of these large language models within the context of what they are actually doing: predicting likely continuations of text they’ve seen during training, conditioned on the prompt provided by the user. They are not “reasoning,” although the language they produce can lead us to believe that is the case. If you’re disappointed by their coding, you will certainly be disappointed by their mathematics.
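Concretely, that “repeating language” is just a next-token loop. Here’s a rough sketch (using the Hugging Face transformers library and the small public gpt2 checkpoint purely for illustration, not any particular model under discussion):

```python
# Minimal sketch of what an LLM actually does: repeatedly predict the
# next token given the prompt plus everything generated so far.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

ids = tokenizer.encode("The proof follows by", return_tensors="pt")
with torch.no_grad():
    for _ in range(20):
        logits = model(ids).logits        # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()  # greedily take the most likely token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

print(tokenizer.decode(ids[0]))
```

No step in that loop checks the output against the world; it only checks it against the statistics of the training text, which is the whole point.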

2

ThePhantomPhoton t1_iywi3sq wrote

That’s a very good story! The biggest challenge in building these chatbots is that, as the generated text grows longer, they tend toward “untruths” once the earlier text falls outside their context windows and the story moves from one “scene” to another. For instance, if this story continued, it’s possible we would start to see Batman using X-ray vision and saving Lois Lane. It’s a tough nut to crack given finite memory.
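To make the context-window point concrete, here’s a toy sketch (made-up window size, no real model involved):

```python
# Toy illustration of the finite-context problem: the model conditions
# only on the last `CONTEXT_WINDOW` tokens, so facts established early
# in a long story eventually fall out of view entirely.
story = ["Batman", "has", "no", "superpowers", "."]
story += ["filler"] * 2000  # the story keeps going, scene after scene

CONTEXT_WINDOW = 1024  # hypothetical window size

def visible_context(tokens, window=CONTEXT_WINDOW):
    # This slice is everything the model can condition on when
    # predicting the next word; anything earlier is simply gone.
    return tokens[-window:]

# False: the early fact has dropped out, so nothing in the conditioning
# context stops the model from handing Batman X-ray vision next scene.
print("superpowers" in visible_context(story))
```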

18

ThePhantomPhoton t1_iyk2wnq wrote

I think you have a good argument for images, but language is more challenging because we rely on positional encodings (a kind of “time”) to provide contextual clues that beat out a purely statistical language model of the form Pr(x_{t+1} | x_0, x_1, ..., x_t). (Edit: that is, predicting the next word in the sequence given all preceding words in the sequence.)
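For the curious, here’s a quick sketch of the standard sinusoidal positional encoding from the original Transformer paper (one common scheme among several), just to show what that “time” signal looks like:

```python
# Sinusoidal positional encodings from "Attention Is All You Need":
# each position gets a unique vector of sines and cosines, which is
# how the model recovers the ordering of the sequence.
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]  # (seq_len, 1)
    i = np.arange(d_model)[None, :]    # (1, d_model)
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dimensions get sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dimensions get cosine
    return pe

pe = positional_encoding(seq_len=50, d_model=512)
print(pe.shape)  # (50, 512): one encoding vector per position
```

These vectors get added to the token embeddings, so the next-word distribution above is conditioned not just on which words appeared but on where they appeared.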

2

ThePhantomPhoton t1_iw92cxj wrote

This is very interesting! I’m a fat neckbeard who works in medicine, but one of my colleagues who was interested in baseball went on to work for the Boston Red Sox. If you’re interested in these analyses, maybe ping a baseball team or two and see if they’d want this kind of work. Very cool topic for ML!

2