TiredOldCrow t1_iuzp1y3 wrote
I appreciate that the legendary "fast inverse square root" code from Quake 3 gets produced verbatim, comments and all, if you start with "float Q_rsqrt".
float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y = number;
    i = * ( long * ) &y; // evil floating point bit level hacking
    i = 0x5f3759df - ( i >> 1 ); // what the fuck?
    y = * ( float * ) &i;
    y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
    // y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed

    return y;
}
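For anyone who wants to try the trick with a current compiler, here is a minimal sketch of the same arithmetic. It is not part of the Quake 3 source: the names rsqrt_sketch and rsqrt_residual are made up for illustration, and it assumes a 32-bit IEEE 754 float. It swaps the pointer casts for memcpy and the long for a fixed-width integer, since long is 64 bits on most 64-bit Unix-like systems and the casts break C's aliasing rules.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 value, as the magic constant expects. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Hypothetical name. Same steps as Q_rsqrt, without pointer casts. */
float rsqrt_sketch(float number)
{
    const float threehalfs = 1.5F;
    float x2 = number * 0.5F;
    float y = number;
    uint32_t i;

    memcpy(&i, &y, sizeof i);            /* copy the float's bits into an integer */
    i = 0x5f3759dfu - (i >> 1);          /* initial guess from the magic constant */
    memcpy(&y, &i, sizeof y);            /* copy the bits back into a float */
    y = y * (threehalfs - (x2 * y * y)); /* one Newton-Raphson refinement step */

    return y;
}

/* Hypothetical helper: returns |y*y*x - 1|, which is 0 when y is exactly
   1/sqrt(x). It avoids calling sqrtf so no math library is needed. */
float rsqrt_residual(float x)
{
    float y = rsqrt_sketch(x);
    float r = y * y * x - 1.0F;
    return r < 0.0F ? -r : r;
}
```

Calling rsqrt_residual on a few positive inputs such as 0.25F, 1.0F, or 100.0F should give values well under 0.01, which is one way to check that the approximation behaves as expected on a given platform.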
I'm interested in how practical it will be for a motivated attacker to poison code generation models with vulnerable code. Also curious to what extent these models produce code that only works with outdated and vulnerable dependencies -- a problem you'll also run into if you naively copy old StackOverflow posts. I've recently been working on threat models in natural language generation, but it seems like threat models in code generation are also going to be interesting.
Edit: Not John Carmack!
ClearlyCylindrical t1_iv0d7hp wrote
The Q_rsqrt code being produced verbatim is probably due to identical copies of it existing in many places in the training data.
dojoteef t1_iv0hfoe wrote
Slightly off-topic: I'm a huge John Carmack fan, but he isn't the author of that code. It's just part of engine code that his company released for the game Quake 3 Arena. For details, check out:
TiredOldCrow t1_iv0s1bg wrote
Great read, thanks for that. Updated the comment.