gopher9
gopher9 t1_jbb9f2l wrote
RWKV works rather well on a 4090.
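For reference, here's roughly how you'd run it locally with the `rwkv` pip package from ChatRWKV (the model path, strategy string, and sampling settings below are just illustrative):

```python
# Minimal sketch: run RWKV on a single GPU with the `rwkv` pip package
# (pip install rwkv). Model path and sampling settings are illustrative.
import os
os.environ["RWKV_JIT_ON"] = "1"  # enable the JIT kernels

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Weights downloaded from https://huggingface.co/BlinkDL
# 'cuda fp16' keeps the whole model on the GPU in half precision.
model = RWKV(model="RWKV-4-Pile-7B-20230109-ctx4096", strategy="cuda fp16")
pipeline = PIPELINE(model, "20B_tokenizer.json")  # tokenizer file from ChatRWKV

prompt = "The quick brown fox"
args = PIPELINE_ARGS(temperature=1.0, top_p=0.85)
print(pipeline.generate(prompt, token_count=100, args=args))
```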
gopher9 t1_j95hafv wrote
Reply to [D] Lack of influence in modern AI by I_like_sources
Neural networks are black boxes by design: you trade explainability for performance. That doesn't mean, though, that you have no control over the result.
> Example Stable Diffusion. You don't like what the eyes look like, yet you don't know how to make them more realistic.
ControlNet lets you guide image generation: https://github.com/lllyasviel/ControlNet.
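For example, roughly how this looks with the diffusers integration (model ids, the canny conditioning, and file names are just illustrative; the linked repo also ships its own scripts):

```python
# Sketch: constrain Stable Diffusion with a ControlNet conditioned on edges.
# pip install diffusers transformers accelerate opencv-python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Conditioning image: edges extracted from a reference picture.
reference = load_image("reference.png")
edges = cv2.Canny(np.array(reference), 100, 200)
edges = Image.fromarray(np.stack([edges] * 3, axis=-1))

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The edge map fixes the layout; the prompt controls everything else.
result = pipe("a portrait photo, detailed realistic eyes", image=edges).images[0]
result.save("out.png")
```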
> Example NLP. The chatbot does not give you logical answers? Try another random model.
Or give it a few examples and ask it to reason step by step. Alternatively, fine-tune it on examples. You can also teach the LLM to use external tools, so you don't rely on the LLM itself for the reasoning.
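A rough sketch of the prompting approach (the worked example is made up, and the OpenAI chat call is just one way to do it; any chat model works the same way):

```python
# Few-shot + "reason step by step" prompting, using the openai library's
# chat completion call (reads the OPENAI_API_KEY environment variable).
import openai  # pip install openai

few_shot_example = (
    "Q: A pack has 12 pens. I buy 3 packs and give away 5 pens. How many are left?\n"
    "A: Let's reason step by step. 3 packs contain 3 * 12 = 36 pens. "
    "Giving away 5 leaves 36 - 5 = 31. The answer is 31.\n"
)
question = "Q: A box holds 8 apples. I buy 4 boxes and eat 6 apples. How many are left?\n"

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Answer the question, reasoning step by step."},
        {"role": "user", "content": few_shot_example + question + "A: Let's reason step by step."},
    ],
)
print(response.choices[0].message.content)
```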
gopher9 t1_j8hl55i wrote
There's a paper that does that, along with other transformations: https://arxiv.org/pdf/2208.09392.pdf
gopher9 t1_j8d1odf wrote
Reply to comment by MrAcurite in [D] Quality of posts in this sub going down by MurlocXYZ
Did you take a look at Mathstodon? There are some actual mathematicians and computer scientists there, so maybe it's a better place to look.
gopher9 t1_j8d1ce0 wrote
Reply to comment by dustintran in [D] Quality of posts in this sub going down by MurlocXYZ
/r/math uses extensive moderation to deal with this kind of problem. Low-effort posts just get removed.
gopher9 t1_j7fq4kw wrote
Reply to comment by m98789 in [D] List of Large Language Models to play with. by sinavski
With RWKV-4-Pile-14B-20230204-7324.pth released 2 hours ago, as you can see at https://huggingface.co/BlinkDL/rwkv-4-pile-14b/tree/main.
But yeah, it's still WIP.
gopher9 t1_jdlq1jy wrote
Reply to comment by michaelthwan_ai in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
Add RWKV.