xEdwin23x
xEdwin23x t1_jcjfnlj wrote
Reply to comment by FallUpJV in [R] RWKV 14B ctx8192 is a zero-shot instruction-follower without finetuning, 23 token/s on 3090 after latest optimization (16G VRAM is enough, and you can stream layers to save more VRAM) by bo_peng
First, this is not a "small" model, so size DOES matter. It may not be hundreds of billions of parameters, but it's definitely not small imo.
Second, it has always been about the data (astronaut pointing gun meme).
xEdwin23x t1_jca9kr0 wrote
Reply to comment by NoScallion2450 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
OpenAI is not a small company either. It may be a "startup", but it's clearly backed by Microsoft, and between the two of them there are probably quite a few patents that Google has used in the past too.
xEdwin23x t1_jca8d58 wrote
Reply to comment by NoScallion2450 in [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
It's probably not in their interest, as they both know they'd end up worse off if they went down that path.
xEdwin23x t1_jca7kxx wrote
Reply to [D] What do people think about OpenAI not releasing its research but benefiting from others’ research? Should google meta enforce its patents against them? by [deleted]
Google has probably used stuff from OpenAI too (decoder-only GPT-style training, or CLIP, Diffusion, or Dall-E ideas maybe?). Anyways, it's clear they (and probably every large tech company with a big AI team) are in an arms race at this point. It's definitely not a coincidence that Google and OpenAI / Microsoft released on the same day, and we also heard Baidu is releasing sometime these days. Meta and others will probably be following suit. The publicity (and the market share for these new technologies) is worth too much.
xEdwin23x t1_jaq2p2q wrote
Reply to comment by Fuehnix in [N] EleutherAI has formed a non-profit by StellaAthena
They have a list of projects and / or ideas pinned to some of their channels. If you want something to happen, then you're expected to be proactive and lead (or follow someone else who is leading); it's the only way this kind of collaboration can work. Tbf it's very hard to collaborate among people in different time zones with their own schedules, but they somehow make it work.
xEdwin23x t1_jals78j wrote
Reply to comment by bjergerk1ng in [D] What are the most known architectures of Text To Image models ? by AImSamy
I'm guessing he refers to this one: https://parti.research.google/
xEdwin23x t1_irv8zfs wrote
Reply to [D] Looking for some critiques on recent development of machine learning by fromnighttilldawn
These question the "progress" or rather the illusion of progress in the field:
Are GANs Created Equal? A Large-Scale Study: https://arxiv.org/abs/1711.10337
Do Transformer Modifications Transfer Across Implementations and Applications?
xEdwin23x t1_jd0pc6v wrote
Reply to [D] Determining quality of training images with some metrics by i_sanitize_my_hands
Active learning deals with selecting a small, representative subset of images that should perform about as well as a larger set of uncurated images. You can consider looking into that.
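To make the idea concrete, here's a minimal, hypothetical sketch of one common active-learning heuristic (uncertainty sampling): rank unlabeled images by the predictive entropy of an existing model's softmax outputs and keep the top-k most uncertain ones for labeling. The `probs` array and function names are illustrative, not from any particular library.

```python
import numpy as np

def entropy(probs: np.ndarray) -> np.ndarray:
    """Predictive entropy per sample; higher means the model is less certain."""
    eps = 1e-12  # avoid log(0)
    return -np.sum(probs * np.log(probs + eps), axis=1)

def select_most_informative(probs: np.ndarray, k: int) -> np.ndarray:
    """Return indices of the k samples with the highest predictive entropy."""
    scores = entropy(probs)
    return np.argsort(scores)[::-1][:k]

# Toy example: softmax outputs for 4 images over 3 classes.
probs = np.array([
    [0.98, 0.01, 0.01],  # confident -> low labeling value
    [0.34, 0.33, 0.33],  # near-uniform -> high labeling value
    [0.60, 0.25, 0.15],
    [0.90, 0.05, 0.05],
])
picked = select_most_informative(probs, k=2)
```

Real pipelines usually iterate this: label the selected batch, retrain, re-score the remaining pool, and repeat until the labeling budget runs out.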