nonotan t1_jc53wlz wrote on March 14, 2023 at 2:30 AM

Reply to comment by rePAN6517 in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef

"Smart character" would seem to be an awfully generous description of what you could realistically do with this, especially when mentioned alongside games like GTA, which very much do not revolve around text-based interactions. You can't really do a cutscene with an LLM today (you could have it generate a script, but how are you going to translate that to the screen automatically? That's highly non-trivial), never mind leverage it to have individual characters actually behave smartly within the game world.

If you're a game developer, do you want to dedicate the bulk of the user's VRAM/GPU time to text inference to... add some mildly dynamic textual descriptions to NPCs you encounter? Or would you rather use those resources to, y'know, actually render the game world?

rePAN6517 t1_jc585bd wrote on March 14, 2023 at 3:03 AM

> If you're a game developer, do you want to dedicate the bulk of the user's VRAM/GPU time to text inference to... add some mildly dynamic textual descriptions to NPCs you encounter? Or would you rather use those resources to, y'know, actually render the game world?

When you're interacting with an NPC, you're usually not moving around much and not paying attention to FPS either. LLM inference would only happen at interaction time, and only for a brief second or so per interaction.
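
A rough way to sanity-check the "brief second or so" figure: token generation is typically limited by memory bandwidth, since each new token requires reading roughly all of the model weights once. The sketch below computes that upper bound. The parameter count, quantization level, and bandwidth are illustrative assumptions rather than measurements, and the function names are made up for this example.

```python
def model_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), ignoring any overhead."""
    return n_params * bits_per_weight / 8 / 1e9


def max_tokens_per_second(size_gb: float, mem_bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed if every token reads all weights once."""
    return mem_bandwidth_gb_s / size_gb


# Assumed: 7B parameters at 4 bits per weight, GPU with 448 GB/s memory bandwidth.
size = model_size_gb(7e9, 4)
print(f"weights: {size:.1f} GB")
print(f"upper bound: {max_tokens_per_second(size, 448):.0f} tokens/s")
```

This ignores prompt processing, the KV cache, and compute limits, and real throughput would be lower if the GPU is also rendering frames at the same time.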

Jepacor t1_jc698s6 wrote on March 14, 2023 at 10:37 AM

You can't just snap your fingers and instantly load and start up a multi-GB LLM in VRAM while the game is running, though.
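
To put a rough number on that, here is a back-of-the-envelope estimate of how long it takes just to move the weights into VRAM. It assumes the transfer is limited by the slowest stage on the path (disk, then PCIe). The bandwidth values are assumed round numbers for a typical NVMe SSD and a PCIe 4.0 x16 link, not benchmarks, and `load_seconds` is a hypothetical helper.

```python
def load_seconds(size_gb: float, *stage_bandwidths_gb_s: float) -> float:
    """Lower bound on load time: size divided by the slowest stage's bandwidth."""
    return size_gb / min(stage_bandwidths_gb_s)


# Assumed sizes for a 7B model: 2 bytes/weight (fp16) and 0.5 bytes/weight (4-bit).
sizes_gb = {"fp16": 14.0, "4-bit": 3.5}

# Assumed bandwidths in GB/s: NVMe SSD read, PCIe 4.0 x16 (theoretical).
SSD_GB_S = 3.5
PCIE_GB_S = 32.0

for name, size in sizes_gb.items():
    cold = load_seconds(size, SSD_GB_S, PCIE_GB_S)  # weights read from disk
    warm = load_seconds(size, PCIE_GB_S)            # weights already in system RAM
    print(f"{name}: cold ~{cold:.1f} s, warm ~{warm:.2f} s")
```

Under these assumptions, keeping the weights resident in system RAM cuts the transfer time considerably, but that trades VRAM pressure for RAM pressure, and none of this accounts for the VRAM the game itself would have to give up.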

zackline t1_jc69d50 wrote on March 14, 2023 at 10:39 AM

I am not sure about this, but I have heard that it’s currently not possible to use CUDA while a game is running, supposedly because the GPU needs to enter a different mode or something like that.

If that is indeed the case, it might even be a hardware limitation that prevents this use case on current GPUs.