Submitted by enryu42 t3_122ppu0 in MachineLearning
enryu42 OP t1_jdrzcc9 wrote
Reply to comment by ghostfaceschiller in [D] GPT4 and coding problems by enryu42
Do you mean re-prompt it, asking it to correct its mistakes? It's hard to try with the current tight limits on GPT-4 message counts; I'll try once the API is properly available. But I strongly doubt it'll help much: it's not that the solutions have minor bugs, they're usually just completely wrong, i.e. the model doesn't "get" the idea behind the correct solution.
(It might help with some of the problems in the "Beginner" category, but those aren't that interesting.)
ghostfaceschiller t1_jds202e wrote
Yeah, it's essentially that at an automated level. Tbh, based on the results so far it's powerful enough that I'd actually be really surprised if it did not yield very significant gains on these tests.
I'm sure there will be a paper out doing it in like the next few days, so we'll see
Jeffy29 t1_jdsm90r wrote
>But I strongly doubt it'll help much: it's not that the solutions have minor bugs, they're usually just completely wrong
I strongly doubt that it wouldn't help. I haven't tested GPT-4 on coding, but from what I've seen GPT-3 makes a number of simple errors, and in longer, more complex code that's almost inevitable. Yet it's able to identify and correct them quickly when you point them out. GPT-4 not being able to compile and test its own code is a big limitation that humans don't have. It also can't really do the math; it's essentially guessing at the calculation. But both could be addressed with an external compiler and a calculator like Wolfram, something humans also have access to. There would need to be some time limit imposed so it can't brute-force the solution by guessing for a few days, but even so, I think the improvements would be quite large.
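A minimal sketch of what that external compile-and-test feedback loop could look like. `ask_model` is a hypothetical stand-in for whatever chat-completion API is used (not a real library call); the tests, prompt wording, and round limit are assumptions for illustration. The bounded number of rounds is the "time limit" mentioned above, so the model can't just keep guessing indefinitely.

```python
import os
import subprocess
import sys
import tempfile


def ask_model(prompt: str) -> str:
    """Hypothetical wrapper around a chat-completion API; returns a Python code string."""
    raise NotImplementedError


def run_candidate(code: str, stdin_data: str, timeout: float = 5.0):
    """Run the candidate solution in a subprocess and capture its output and errors."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        return -1, "", "timed out"
    finally:
        os.unlink(path)


def solve_with_feedback(problem: str, tests, max_rounds: int = 3):
    """Re-prompt the model with concrete failures until all tests pass or rounds run out."""
    prompt = f"Solve this problem in Python, reading from stdin:\n{problem}"
    for _ in range(max_rounds):
        code = ask_model(prompt)
        failures = []
        for stdin_data, expected in tests:
            rc, out, err = run_candidate(code, stdin_data)
            if rc != 0:
                failures.append(f"input:\n{stdin_data}\nerror:\n{err}")
            elif out.strip() != expected.strip():
                failures.append(f"input:\n{stdin_data}\nexpected {expected!r}, got {out!r}")
        if not failures:
            return code  # all tests pass
        # Feed the concrete failures back instead of a vague "try again".
        prompt = (
            "Your previous solution failed these cases:\n"
            + "\n\n".join(failures)
            + "\n\nHere was your code:\n"
            + code
            + "\n\nPlease fix it."
        )
    return None  # gave up within the round limit, so it can't brute-force forever
```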
sdmat t1_jdt9ik9 wrote
> There would need to be some time limit imposed so it can't brute force the solution after guessing for a few days
Not exactly unheard of for junior programmers, to be fair.
farmingvillein t1_jdsdalw wrote
> Do you mean re-prompt it asking to correct its mistakes?
Well, re-prompt + ask it to bake test cases upfront and continuously analyze how the failures line up with those test cases.
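Continuing the sketch above, the "bake test cases upfront" step might look like this. `ask_model` and `solve_with_feedback` are the hypothetical helpers from the earlier sketch, and the JSON output format is purely an assumption for illustration.

```python
import json


def generate_tests(problem: str, n: int = 5):
    """Ask the model for test cases before it writes any solution code."""
    # ask_model is the hypothetical chat-completion wrapper from the sketch above.
    reply = ask_model(
        f"Write {n} test cases for this problem as a JSON list of "
        f'objects with "input" and "expected" fields:\n{problem}'
    )
    cases = json.loads(reply)
    return [(c["input"], c["expected"]) for c in cases]


# Usage idea: generate the tests first, then run the feedback loop against them,
# so each re-prompt can say exactly which of the model's own test cases failed.
# tests = generate_tests(problem)
# solution = solve_with_feedback(problem, tests)
```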