jagedlion t1_j8kbruo wrote

Part of model building is that it compresses well and doesn't need to store the original data. It consumed 45TB of internet text and stores what it learned in about 700GB of working memory (the inference engine can be stored in less space, but I can't pin down a specific minimal number).
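A quick back-of-envelope check of that compression, as a plain-Python sketch (the 45TB and 700GB figures come from above; the ratio is just arithmetic):

```python
# Rough compression ratio implied by the figures above.
training_data_gb = 45_000   # ~45TB of internet text consumed
model_size_gb = 700         # working memory of the trained model

print(f"~{training_data_gb / model_size_gb:.0f}x compression")  # ~64x
```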

It has to figure out what's worth remembering (and how to remember it) without access to the test. It studied general knowledge, but it didn't study for this particular exam.

jagedlion t1_j8k0uqa wrote

I mean, humans can't give you information they haven't been exposed to either. We just acquire more data during our normal day-to-day lives. People also do their best to infer from what they know. Sure, they're more willing to encode their certainty in their language, but humans can likewise only work from the knowledge they have and the connections they can find within it.

jagedlion t1_j8jy2ud wrote

So it does many of the things you listed.

It greatly compresses the training database into a tiny (by comparison) model. It runs without access to either the internet or the original training data. Its ability to run 'cheaply' is directly tied to the complexity of the model being built: keeping the system efficient is important, and that's a major limit on how much it can store.

It was trained on 45TB of internet data, compressed and filtered down to around 500GB. That's already a very limited-size database. But then it goes further and 'learns' the meaning, so what's actually stored is 175 billion 'weights', which comes to about 700GB (each weight is 4 bytes). Still, that's a pretty 'limited' inference size. Not run-it-on-your-own-computer size, but not terrible. They say it costs a few cents per question, so pretty cheap compared to the cost of hiring even a poor-quality professional.
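To make that last step concrete, here's the weight arithmetic as a tiny sketch (assuming decimal gigabytes and 4-byte float32 weights, as in the comment):

```python
# Storage needed for the weights alone, using the figures above.
n_weights = 175_000_000_000   # 175 billion parameters
bytes_per_weight = 4          # one 32-bit float each

size_gb = n_weights * bytes_per_weight / 1e9
print(f"{size_gb:.0f} GB")    # 700 GB
```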

It does therefore have to 'study' ahead of time.

The only thing from your list that it doesn't do is read a single source: it reads many. But the rest? It already does all of it.

jagedlion t1_j8jxe3s wrote

Common misconception. It doesn't memorize the data; it forms connections in its model. It's not really memorization in that sense, as it doesn't store any of the raw information it was trained on. It only stores the predictive model.
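A toy illustration of that point, as a hypothetical numpy sketch (a two-weight line fit, nothing like GPT's real internals): once the model is fit, the raw data can be thrown away and predictions still work, because everything needed lives in the weights.

```python
import numpy as np

# Toy "training data": 10,000 noisy points along y = 3x + 1.
rng = np.random.default_rng(0)
x = rng.uniform(-10, 10, 10_000)
y = 3 * x + 1 + rng.normal(0, 0.5, 10_000)

# Fit a line. The whole "model" is just two numbers.
slope, intercept = np.polyfit(x, y, 1)

del x, y  # discard the raw data entirely

# Predictions now come from the 2-number model, not the 10k points.
print(slope * 5.0 + intercept)  # ~16.0
```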

This is also why you can implement AI vision algorithms on primitive microcontrollers. They don't have the computational power to train the model, but once a powerful computer has calculated it, a much simpler machine can run it.
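And a sketch of why inference is so much cheaper than training (made-up weights, plain Python standing in for the C you'd run on a small chip): using a trained network is just multiply-adds over stored constants, with no training machinery in sight.

```python
import math

# Weights computed offline on a powerful machine and baked into the
# firmware; these particular values are invented for illustration.
W1 = [[0.5, -0.2], [0.1, 0.8]]   # hidden layer (2x2)
b1 = [0.0, 0.1]
W2 = [0.7, -0.3]                 # output layer
b2 = 0.05

def infer(x):
    # One hidden layer (sigmoid) plus a weighted sum: nothing here but
    # multiply-adds and a cheap nonlinearity.
    h = [1 / (1 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
         for row, b in zip(W1, b1)]
    return sum(w * hi for w, hi in zip(W2, h)) + b2

print(infer([1.0, 2.0]))  # a single forward pass, no training step
```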
