Submitted by blacklemon67 t3_11misax in MachineLearning
currentscurrents t1_jbnandw wrote
Reply to comment by harharveryfunny in [D] Why are so many tokens needed to train large language models? by blacklemon67
I think this is the wrong way to think about what LLMs are doing. They aren't modeling the world; they're modeling human intelligence.
The point of generative AI is to model the function that created the data. For language, that's us. You need all these tokens and parameters because modeling how humans think is very hard.
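To make "model the function that created the data" concrete, here is a minimal sketch (my illustration, not from the thread): at its simplest, a language model is just an estimate of P(next token | context) fit to human-written text, shown here with toy bigram counts over a made-up corpus.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for "text written by humans".
corpus = "the cat sat on the mat the cat ate".split()

# Count bigrams: context token -> frequencies of the token that follows it.
counts = defaultdict(Counter)
for ctx, nxt in zip(corpus, corpus[1:]):
    counts[ctx][nxt] += 1

def p_next(ctx, token):
    """Estimated probability of `token` following `ctx` under the fitted model."""
    total = sum(counts[ctx].values())
    return counts[ctx][token] / total if total else 0.0

# The model has absorbed the habits of whoever wrote the corpus:
# "the" is followed by "cat" in 2 of its 3 occurrences.
print(p_next("the", "cat"))  # -> 0.666...
```

An LLM does the same thing with billions of parameters instead of a count table, which is why, on this view, it ends up approximating the humans behind the text rather than the world directly.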
As LLMs get bigger, they can model us more accurately, and that's where their human-like emergent abilities come from. They build a world model because it's useful for predicting text written by humans who have a world model. The same goes for why they're good at RL and task decomposition, can convincingly fake emotions, and inherit our biases.