Cheap_Meeting
Cheap_Meeting t1_itjeomy wrote
As far as I know, there is no language model that runs in 4 seconds on a CPU and has competitive results.
Cheap_Meeting t1_itb39jn wrote
Reply to comment by rehrev in [R] Scaling Instruction-Finetuned Language Models - Flan-PaLM- Google 2022 - 75.2% on five-shot MMLU / Forecasters expected this SOTA would need until 2024! - Public checkpoints! by Singularian2501
I was wondering the same. Here is the answer:
https://bounded-regret.ghost.io/forecasting-math-and-mmlu-in-2023
Cheap_Meeting t1_ist2j62 wrote
Sometimes places that are not as good are just as difficult to get into as the good ones. They are under the illusion that they only hire the best people, but because they don't know what they are doing, their hiring criteria are more or less arbitrary.
Cheap_Meeting t1_iqq8oku wrote
Reply to [Discussion] If we had enough memory to always do full batch gradient descent, would we still need rmsprop/momentum/adam? by 029187
Adding to other answers: even if you had enough memory, it would still be computationally inefficient. There are diminishing returns from increasing the batch size in terms of how much the loss improves per step.
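Not part of the original comment, but here is a minimal toy sketch of that point, assuming plain mini-batch gradient descent on a synthetic linear-regression problem with a fixed budget of processed examples (all names and hyperparameters are made up for illustration). With the same total computation, the large-batch runs take far fewer steps, and each step does not improve the loss proportionally more, so they tend to finish with a higher loss.

```python
# Toy illustration (assumed setup, not from the thread): diminishing returns
# from larger batch sizes under a fixed compute budget.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def full_loss(w):
    """Mean squared error over the full dataset."""
    return np.mean((X @ w - y) ** 2)

def run_sgd(batch_size, example_budget=100_000, lr=0.05):
    """Mini-batch SGD with a fixed budget of processed examples."""
    w = np.zeros(d)
    steps = example_budget // batch_size  # bigger batches => fewer steps
    for _ in range(steps):
        idx = rng.integers(0, n, size=batch_size)
        grad = 2.0 / batch_size * X[idx].T @ (X[idx] @ w - y[idx])
        w -= lr * grad
    return full_loss(w), steps

for bs in (32, 512, 10_000):  # 10_000 == "full batch" here
    final_loss, steps = run_sgd(bs)
    print(f"batch_size={bs:>6}  steps={steps:>5}  final_loss={final_loss:.4f}")
```

With these (arbitrary) settings, the full-batch run gets only a handful of updates within the budget, so it typically ends with a noticeably worse loss than the small-batch runs, which is the diminishing-returns effect described above.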
Cheap_Meeting t1_iqq7gcz wrote
Reply to comment by zergling103 in [Discussion] Near term prospects of AGI? by morecoffeemore
People have slightly different definitions, but generally it means an AI which is capable of doing most tasks which a human can perform.
Cheap_Meeting t1_iqptinm wrote
According to the community survey in the link below, 58% of NLP researchers agree that AGI should be an important concern for the field, and 57% agree that recent research has advanced toward AGI in some significant way.
https://arxiv.org/pdf/2208.12852.pdf
I personally don't think that we will have AGI within a decade; however, I do believe that AI will create a technological shift that will have a huge impact on society.
Cheap_Meeting t1_ix4l30k wrote
Reply to comment by AlexGrinch in [D] Why do we train language models with next word prediction instead of some kind of reinforcement learning-like setup? by blazejd
I don't think this is a good answer. Modeling the probability distribution of language is not a worthwhile goal by itself, which is why language modeling was a niche topic for a very long time. The reason there has been so much interest in large language models in the last couple of years is that they do "learn" language.