Submitted by RadioFreeAmerika t3_122ilav in singularity
masonw32 t1_jdsyi4v wrote
Reply to comment by ArcticWinterZzZ in Why is maths so hard for LLMs? by RadioFreeAmerika
This is only an issue for insanely large numbers, though. GPT-4 already performs a ton of multiplications and additions in every layer of every forward pass. You can overfit a much smaller network on multiplication with full numbers as tokens, and a GPT-4-like architecture can learn to multiply full numbers for all practical purposes.
It's true that GPT-4 only does a constant number of operations per input, though, and asymptotically the number of operations required to generate the output scales as O(n log n), where n is proportional to the input length. But this is not why it's failing.
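As a rough illustration of that first point, here's a minimal sketch (assuming PyTorch; the vocabulary size, network dimensions, and training loop are made up for illustration) of overfitting a tiny model on multiplication where each full number is a single token:

```python
# Minimal sketch (hypothetical sizes): memorize a * b with full numbers as tokens.
import torch
import torch.nn as nn

N = 100  # toy "vocabulary" of full-number tokens 0..99
pairs = torch.cartesian_prod(torch.arange(N), torch.arange(N))   # all (a, b) pairs
targets = (pairs[:, 0] * pairs[:, 1]).float().unsqueeze(1)
targets = targets / targets.max()  # normalize so the toy regression trains sanely

class TinyMultiplier(nn.Module):
    def __init__(self, vocab=N, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, ab):
        e = self.emb(ab)                # (batch, 2, dim)
        return self.mlp(e.flatten(1))   # (batch, 1)

model = TinyMultiplier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(pairs), targets)
    loss.backward()
    opt.step()
```

This sketch just overfits the training pairs, which is the sense in which a larger model trained the same way could cover practically useful ranges.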
ArcticWinterZzZ t1_jdt1h3m wrote
Yes, but we are interested in its general-purpose multiplication abilities. If it remembers the results, that's nice, but we can't expect it to do that for every single pair of numbers. And then, what about multiplication with 3 factors? We should start thinking of ways around this limitation.
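(As a back-of-the-envelope sense of why memorized results can't cover the general case, here's a quick, order-of-magnitude count of how many operand pairs and triples exist for a few operand lengths:)

```python
# Rough, order-of-magnitude count of distinct operand combinations per length.
for digits in (4, 8, 12):
    operands = 10 ** digits  # roughly how many d-digit numbers exist
    print(f"{digits} digits: ~{operands**2:.0e} pairs, ~{operands**3:.0e} triples")
```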
liqui_date_me t1_jdt48m5 wrote
You would think that GPT would have discovered a general-purpose way to multiply numbers, but it really hasn't, and it isn't accurate even with chain-of-thought prompting.
I just asked GPT-4 to solve this: 87176363 times 198364
The right answer should be 17,292,652,070,132 according to Wolfram Alpha.
According to GPT4 the answer is 17,309,868,626,012.
This is the prompt I used:
What is 87176363 times 198364? Think of the problem step by step and give me an exact answer.
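(For reference, the exact product is easy to verify locally, since Python integers are arbitrary precision:)

```python
a, b = 87176363, 198364
exact = a * b                        # 17292652070132, matching Wolfram Alpha
gpt4_answer = 17_309_868_626_012     # the answer GPT-4 gave above
print(exact, gpt4_answer - exact)    # GPT-4 is off by roughly 1.7e10
```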