Freed4ever t1_j3e66lo wrote on January 7, 2023 at 10:51 PM

Reply to comment by suflaj in [D] Will NLP Researchers Lose Our Jobs after ChatGPT? by singularpanda

Thanks. I'm not a researcher, and I'm more curious about the practical side of the technology. So the problem is too wide to prove anything formally, which is fair. However, if I'm interested in the practicality of the tech, I don't necessarily need a formal proof; I just need it to be good enough. Taking code generation as an example, it is conceivable that it generates a piece of code, then actually executes the code and learns about its accuracy, performance, etc. Hence it is self-taught. Taking another example, say poetry generation, it is conceivable that it generates a poem, publishes it, and then crowdsources feedback to teach itself as well?

suflaj t1_j3e9yim wrote on January 7, 2023 at 11:17 PM

Well, my first paragraph covers that.

> Taking code generation as an example, it is conceivable that it generates a piece of code, then actually executes the code and learns about its accuracy, performance, etc. Hence it is self-taught.

It doesn't do that. It learns how to have a conversation. The rest is mostly a result of learning things through learning how to model language. Don't give it too much credit. As said previously, it cannot extrapolate.

Think_Olive_1000 t1_j3toxf1 wrote on January 11, 2023 at 12:12 AM

I think they meant: it is conceivable that in the future it could, i.e. you hook an LLM up to a REPL. https://youtu.be/pdSfgRYy8Ao take a look at 15 minutes in. I could easily see how you could fine-tune using self-appraisal by executing code.
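A rough sketch of what that self-appraisal loop could look like, under heavy assumptions: the candidate strings below are hard-coded stand-ins for model samples (no LLM is called), the test cases and the names `score` and `select_for_finetuning` are made up for illustration, and the `exec` call is unsandboxed, which a real setup could not get away with.

```python
# Hypothetical stand-ins for samples an LLM might produce for the prompt
# "write add(a, b) that returns the sum". No model is called here.
CANDIDATES = [
    "def add(a, b):\n    return a - b\n",  # wrong logic
    "def add(a, b):\n    return a + b\n",  # correct
    "def add(a, b)\n    return a + b\n",   # syntax error
]

# Test cases used as the reward signal: (arguments, expected result).
TESTS = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]


def score(source, func_name="add"):
    """Execute one candidate and return the fraction of tests it passes."""
    namespace = {}
    try:
        # NOTE: unsandboxed execution; a real system must isolate this step.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0.0  # code that does not even load gets zero reward
    passed = 0
    for args, expected in TESTS:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(TESTS)


def select_for_finetuning(candidates, threshold=1.0):
    """Keep only the candidates whose reward meets the threshold."""
    return [c for c in candidates if score(c) >= threshold]


if __name__ == "__main__":
    for candidate in CANDIDATES:
        print(round(score(candidate), 2), repr(candidate))
```

The surviving candidates would then be the examples you fine-tune on. The hard part is everything this sketch leaves out: sandboxing, writing tests that actually capture correctness, and paying for the sampling.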

suflaj t1_j3tq0u2 wrote on January 11, 2023 at 12:19 AM

Sure you could. But the cost is so high that it probably outweighs the benefits. And that is even if you made training stable (we already know from recurrent networks, GANs and even transformers that they're not particularly stable). Hooking it up to a REPL would essentially turn the task into reinforcement learning. And if you know something about reinforcement learning, you know that it generally doesn't work, because the environment the agent has to traverse is too difficult to learn anything from - what DeepMind managed to achieve with their chess and Go engines is truly remarkable, but these are THEIR achievements despite the hardships RL introduces. They are not achievements of RL. Meanwhile, ChatGPT is mostly an achievement of a nice dataset, a clever task and deep learning. It is not that impressive from an engineering standpoint (other than syncing up all the hardware to preprocess the data and train it).

Unless LLMs are extremely optimized with regard to latency and cost, or unless compute becomes even cheaper (not likely), they have no practical future for the consumer.

So far, it's still a dick measuring contest, as if a larger model and dataset will make much of a difference. I do not see much interest in making them more usable or accessible, I see only effort in beating last year's paper and getting investors to dump more money into a bigger model for next year. I also see ChatGPT as being a cheap marketing scheme all the while it's being used for some pretty nefarious things, some of them being botted Russian or Ukrainian war propaganda.

So you can forget the REPL idea. Who would it serve? Programmers have shown they are not willing to pay for something like GitHub Copilot. Large companies can always find people to hire and do programming for them. Unless these are strides in something very expensive, like formal verification, it's not something a large company, the one that has the resources to research LLMs, would go into.

Maybe the next step is training it on WolframAlpha. But at that point you're just catching up to almost 15-year-old software. Maybe that "almost 15-year-old" shows you how overhyped ChatGPT really is for commercial use.

Think_Olive_1000 t1_j3tqojo wrote on January 11, 2023 at 12:24 AM

Nah, there's already work that can cut generic LLM size in half without losing any performance. And I think LLMs will be great as foundation models for training smaller, more niche models for narrower tasks - people already use OpenAI's API to generate data to fine-tune their own niche models. I think we'll look back at current LLMs and realise just how inefficient they were - though a necessary evil to prove that something like this CAN be done.

suflaj t1_j3twskh wrote on January 11, 2023 at 1:06 AM

Half is not enough. We're thinking on the order of 100x or even more. Do not forget that even ordinary BERT is not really commercially viable as-is.

I mean sure you can use them to get a nicer distribution for your dataset. But at the end of the day the API is too slow to train any "real" model, and you can already probably collect and generate data for smaller models yourself. So as a replacement for lazy people - sure, I think ChatGPT by itself probably has the potential to solve most repetitive questions people have on the internet. But it won't be used like that at scale so ultimately it is not useful.

If it wasn't clear enough by now, I'm skeptical not because of what LLMs are, but because of how they simply do not scale to real-world requirements. Ultimately, people do not have datacenters at home, and OpenAI and other vendors do not have the hardware for any actual volume of demand other than a niche, hobbyist one. And the investment to develop something like ChatGPT is too big to justify for that use.

All of this was ignoring the obvious legal risks from using ChatGPT generations commercially!

Think_Olive_1000 t1_j3u3k7w wrote on January 11, 2023 at 1:53 AM

BERT is being used by Google for search under the hood. It's how they've got that fancy instant extractive answers box. I don't disagree that LLMs are large. So was Saturn V.

suflaj t1_j3u4smq wrote on January 11, 2023 at 2:02 AM

Google's BERT use is not a commercial, consumer product; it is an enterprise one (Google uses it and runs it on their own hardware). They presumably use the large version, or something even larger than the pretrained weights available on the internet, and to achieve the latencies they have, they are using datacentres and non-trivial distribution schemes for it, not just consumer hardware.

Meanwhile, your average CPU will need anywhere from 1-4 seconds to do one inference pass in ONNX Runtime. It's of course much less on a GPU, but to be truly cross-platform you're targeting JS in most cases, which means CPU and a stack that is not as mature as what Python/C++/CUDA have.
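If you want to check a per-pass latency figure like that on your own machine, a minimal timing harness could look like the sketch below. Everything in it is an assumption for illustration: `fake_infer` is a made-up stand-in workload, not a real ONNX Runtime session, and you would pass in your own inference callable instead.

```python
import statistics
import time


def measure_latency(infer, warmup=2, runs=10):
    """Time repeated calls to infer() and summarize per-call latency in seconds."""
    for _ in range(warmup):
        infer()  # warm-up calls are not timed (caches, lazy initialization)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "max_s": max(samples),
        "runs": len(samples),
    }


def fake_infer():
    """Hypothetical stand-in for one forward pass; replace with a real model call."""
    total = 0.0
    for i in range(50_000):
        total += i * 0.5
    return total


if __name__ == "__main__":
    print(measure_latency(fake_infer))
```

The median is reported rather than the mean because a single slow call (garbage collection, a busy core) can otherwise dominate a small sample.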

What I'm saying is:

- people have said no to paid services, they want free products
- consumer hardware has not scaled nearly as fast as DL
- even ancient models are still too slow to run on consumer hardware after years of improvement
- distilling, quantizing and optimizing them seems to get them to run just fast enough to not be a nuisance, but is often too tedious to work out for a free product
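
For anyone unfamiliar with the quantizing step in that last point, here is a toy sketch of symmetric int8 quantization in plain Python. It is an illustration under assumptions, not how any particular library does it: the function names and the example weights are made up, and real toolchains add per-channel scales, calibration and fused kernels.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]  # made-up example values
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(quantized, scale, worst)
```

Storing one byte per weight instead of four (float32) is roughly a 4x saving in size, which is useful but nowhere near the 100x discussed above.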