Submitted by naequs t3_yon48p in MachineLearning
Increasingly large/deep models for Sound/Image/Language/Games are all the rage (you know what I'm talking about). This is concerning on some level:
- Focus shifts to the amount of data instead of its curation
- Models require more (expensive) hardware to train, putting them out of reach for many
- API-ization of functionality leads to large-scale monitoring by centralized providers
Let's take OpenAI Codex / GitHub Copilot as an example. Setting the licensing questions aside for a moment, amazing as this model is, there are some drawbacks observed when using it:
- It can generate outdated code or API calls, especially for evolving languages
- Known vulnerabilities have been observed in generated code, e.g. MITRE CWE weaknesses (see the sketch after this list)
- No local use of the service, unless it is replicated and self-hosted (expensive)
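
To make the vulnerability point concrete, here is a small sketch of my own (not actual Copilot output). It contrasts the string-formatted SQL that such models have been observed to emit, which falls under MITRE CWE-89 (SQL injection), with the parameterized form you would want instead:

```python
import sqlite3

def get_user_insecure(conn: sqlite3.Connection, username: str):
    # Pattern often seen in generated completions: the query is built by
    # string formatting, so a crafted username can inject SQL (CWE-89).
    cursor = conn.execute(f"SELECT * FROM users WHERE name = '{username}'")
    return cursor.fetchone()

def get_user_secure(conn: sqlite3.Connection, username: str):
    # Parameterized query: the driver binds the value safely instead.
    cursor = conn.execute("SELECT * FROM users WHERE name = ?", (username,))
    return cursor.fetchone()
```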
Now my questions are these:
Do you think there is a case to be made for smaller models fed with higher-quality data? Can we substantially reduce the number of parameters if we do better with the input?
For example, a Codex-like model for a single language only.
Or do you think that pre-training large models and then refining them to a task (e.g. GPT, or maybe general programmer -> specific language) will continue to dominate because we need that many parameters for the tasks at hand anyway? An AGI that we just teach "courses", if you like.
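
To make the single-language idea concrete, here is a minimal, hedged sketch of the kind of setup I have in mind: take a small pretrained causal LM and fine-tune it on a curated, single-language corpus with Hugging Face transformers. The base model, file name, and hyperparameters below are placeholders, not recommendations:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Small base model (~124M params) as a stand-in; any compact causal LM works.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "curated_python.txt" is a placeholder for a high-quality, single-language corpus.
dataset = load_dataset("text", data_files={"train": "curated_python.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="codex-lite-python",   # hypothetical output name
        per_device_train_batch_size=4,
        num_train_epochs=1,
        learning_rate=5e-5,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM loss
)
trainer.train()
```

Whether a curated corpus lets a model this small compete with Codex on a single language is exactly the open question.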
Naive_Piglet_III t1_ivevmim wrote
Could you provide a bit more context? Are you referring specifically to language processing and similar use cases, or to general ML use cases?