step21 t1_jabzt1w wrote on February 28, 2023 at 9:49 AM

If you say you had a good understanding until then, what changed? As far as I know, the GPT architecture didn't change completely in newer versions; they made smaller changes and spent a lot of time on better data, better curation/guidelines, etc.

professorlust t1_jacfxvl wrote on February 28, 2023 at 1:05 PM

Regarding ChatGPT, I believe OP is frustrated not by the Transformer architecture but by the improvements made in the inference functionality.

That’s the real “black box” of GPT-style LLMs and the least open part.

jamesj t1_jac05ai wrote on February 28, 2023 at 9:54 AM

Look up Andrej Karpathy's YouTube videos on building makemore from scratch.

Borky_ t1_jac1gy9 wrote on February 28, 2023 at 10:13 AM

He also had videos on building a mini ChatGPT, man's a treasure.

RingoCatKeeper t1_jac20hg wrote on February 28, 2023 at 10:21 AM

Vote for Midjourney. I don't know how they improved their performance; there's no paper or other publication.

Magnesus t1_jac83a7 wrote on February 28, 2023 at 11:43 AM

There was a discovery made recently involving offset noise during training; people are speculating that MJ did that while others didn't. Here is a video explaining how it works: https://m.youtube.com/watch?v=cVxQmbf3q7Q

On the other hand, if that was it, MJ would be better at generating dark images, so maybe not? Shame they don't share how they do it.

RingoCatKeeper t1_jac8qzr wrote on February 28, 2023 at 11:51 AM

Thanks for the link, this method sounds workable. The earliest version of MJ's results was somewhat blurry and noisy; I wonder if it was because of this method.

Far-Butterscotch-436 t1_jad1g2v wrote on February 28, 2023 at 3:46 PM

Just about every discussion I get notifications for is deleted. What's up with that?

[D] What is the most "opaque" popular machine learning model in 2023?

Comments