Cheap_Meeting
Cheap_Meeting t1_itjeomy wrote
As far as I know, there is no language model that runs in 4 seconds on a CPU and has competitive results.
Cheap_Meeting t1_itb39jn wrote
Reply to comment by rehrev in [R] Scaling Instruction-Finetuned Language Models - Flan-PaLM- Google 2022 - 75.2% on five-shot MMLU / Forecasters expected this SOTA would need until 2024! - Public checkpoints! by Singularian2501
I was wondering the same. Here is the answer:
https://bounded-regret.ghost.io/forecasting-math-and-mmlu-in-2023
Cheap_Meeting t1_ist2j62 wrote
Sometimes places that are not as good are just as difficult to get into as the good ones. They are under the illusion that they only hire the best people, but because they don't know what they are doing, their hiring criteria are more or less arbitrary.
Cheap_Meeting t1_iqq8oku wrote
Reply to [Discussion] If we had enough memory to always do full batch gradient descent, would we still need rmsprop/momentum/adam? by 029187
Adding to other answers: even if you had enough memory, it would still be computationally inefficient. There are diminishing returns from increasing the batch size in terms of how much the loss improves per step.
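Not part of the original comment, but here is a minimal toy sketch of that point, assuming plain mini-batch gradient descent on a synthetic linear-regression problem with a fixed budget of processed examples (all names and hyperparameters are made up for illustration). With the same total computation, the large-batch runs take far fewer steps, and each step does not improve the loss proportionally more, so they tend to finish with a higher loss.

```python
# Toy illustration (assumed setup, not from the thread): diminishing returns
# from larger batch sizes under a fixed compute budget.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def full_loss(w):
    """Mean squared error over the full dataset."""
    return np.mean((X @ w - y) ** 2)

def run_sgd(batch_size, example_budget=100_000, lr=0.05):
    """Mini-batch SGD with a fixed budget of processed examples."""
    w = np.zeros(d)
    steps = example_budget // batch_size  # bigger batches => fewer steps
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch_size)
        grad = 2.0 / batch_size * X[idx].T @ (X[idx] @ w - y[idx])
        w -= lr * grad
    return full_loss(w), steps

for bs in (32, 512, 10_000):  # 10_000 == "full batch" here
    final_loss, steps = run_sgd(bs)
    print(f"batch_size={bs:>6}  steps={steps:>5}  final_loss={final_loss:.4f}")
```

With these (arbitrary) settings, the full-batch run gets only a handful of updates within the budget, so it typically ends with a noticeably worse loss than the small-batch runs, which is the diminishing-returns effect described above.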
Cheap_Meeting t1_iqq7gcz wrote
Reply to comment by zergling103 in [Discussion] Near term prospects of AGI? by morecoffeemore
People have slightly different definitions, but generally it means an AI which is capable of doing most tasks which a human can perform.
Cheap_Meeting t1_iqptinm wrote
According to the community survey in the link below, 58% of NLP researchers agree that AGI should be an important concern for the field, and 57% agree that recent research has advanced toward AGI in some significant way.
https://arxiv.org/pdf/2208.12852.pdf
I personally don't think that we will have AGI within a decade; however, I do believe that AI will create a technological shift that will have a huge impact on society.
Cheap_Meeting t1_ix4l30k wrote
Reply to comment by AlexGrinch in [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd
I don't think this is a good answer. Modeling the probability distribution of language is not a worthwhile goal by itself, which is why language modeling was a niche topic for a very long time. The reason there has been so much interest in large language models in the last couple of years is that they do "learn" language.