astrange t1_jdujlcf wrote
Reply to comment by was_der_Fall_ist in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
This is why people are wrong when they say GPT "just outputs the most probable next word". It's the most probable /according to itself/, and the model has been trained to "lie" - to shift its probabilities so that the most useful word is the most probable one.
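Not part of the original comment, but a toy sketch of what "most probable according to itself" means: the distribution is whatever the model's own logits say, not the corpus frequency of words. This assumes the Hugging Face transformers library, with the small public gpt2 checkpoint standing in for any causal LM:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Any causal LM works here; gpt2 is just a small public stand-in.
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("The most probable next word is", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # scores for the next token only

    # "Most probable" is defined by the model's own distribution over its vocabulary.
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k=5)
    for p, idx in zip(top.values, top.indices):
        print(repr(tok.decode(idx.item())), float(p))

Fine-tuning/RLHF reshapes exactly this distribution, which is the "lying" the comment refers to.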
astrange t1_jb6hn1a wrote
Reply to comment by AuspiciousApple in [R] We found nearly half a billion duplicated images on LAION-2B-en. by von-hust
StableDiffusion claims they also dedupe following this, in SD 2.x at least.
Though deduplicating images feels incomplete to me - what if the same thing appears in different images? More examples of a concept are kind of what you want, but near-identical shots of the same subject are kind of not what you want.
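Not from the original comment, but for concreteness: exact-match dedup only catches byte-identical files, so near-duplicate detection usually looks something like this rough sketch (it assumes the third-party imagehash and Pillow packages, and a made-up list of file paths):

    from PIL import Image
    import imagehash

    # Made-up file paths; in practice this would be a crawl of the dataset.
    paths = ["img_0001.jpg", "img_0002.jpg", "img_0003.jpg"]

    seen = {}        # path -> perceptual hash
    duplicates = []  # (new path, earlier near-duplicate)
    for path in paths:
        # Perceptual hash: visually similar images get nearby hashes, so
        # re-encoded or lightly cropped copies still collide.
        h = imagehash.phash(Image.open(path))
        match = next((p for p, other in seen.items() if h - other <= 4), None)
        if match is not None:
            duplicates.append((path, match))
        else:
            seen[path] = h

    print(duplicates)

Even this only catches near-identical pictures, which is the point above: the same subject photographed twice still sails through.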
astrange t1_jajpps3 wrote
Reply to comment by VertexMachine in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
"They're just gathering data" is literally never true. That kind of data isn't good for anything.
astrange t1_j9se7ri wrote
Reply to [D] To the ML researchers and practitioners here, do you worry about AI safety/alignment of the type Eliezer Yudkowsky describes? by SchmidhuberDidIt
Yud is a millenarian street preacher; his concept of an evil superintelligent AGI is half religion and half the old SF books he read. It has no resemblance to current research, and the field isn't heading in the directions he imagines it is.
(There's not even much reason to believe "superintelligence" is possible, that it would be helpful on any given task, or even that humans are generally intelligent.)
astrange t1_j7oduw3 wrote
Reply to comment by MysteryInc152 in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
This is wishful thinking. ChatGPT, being a computer program, doesn't have features it's not designed to have, and it's not designed to have this one.
(By "designed", I mean it has engineering and regression testing behind it, so you can trust it'll still work tomorrow when they redo the model.)
I agree a fine-tuned LLM can be a large part of it, but virtual assistants already have language models and obviously don't always work that well.
astrange t1_j7juabz wrote
Reply to comment by drooobie in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
No, they're not. ChatGPT doesn't do anything; it just responds to you. It's not even clear that getting it to reliably do things (or even reliably return true responses) can use the same technology.
astrange t1_j7ju8m8 wrote
Reply to comment by reditum in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
* with your attention span to look at ads
astrange t1_j7jtrfh wrote
Reply to comment by ktpr in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
ChatGPT's a website and any website can show you ads. Of course, it has the same issue as Gmail where users aren't going to like ads being targeted based on what they say to it.
astrange t1_j0z2ea3 wrote
Reply to Sarcasm Detection model [R]. by Business-Ad6451
This seems like an extremely difficult problem. Humans fail to recognize sarcastic journalism all the time; for some of it, I expect only the original authors could tell.
(For instance, famous alleged-fraudster SBF has a lot of articles in places like the NYT which most readers think are "good press" for him, but I'm fairly sure are actually the journalists lowkey making fun of him.)
astrange t1_iz7dqpi wrote
Reply to comment by Drooflandia in [D] Stable Diffusion 1 vs 2 - What you need to know by SleekEagle
That's Midjourney. You can download SD all you want.
astrange t1_iy5mm2j wrote
Reply to comment by sam__izdat in [P] Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results (+ Colab Notebooks + Negative Embedding) by minimaxir
Yeah, "bad anatomy" and things like that come from NovelAI because its dataset has images literally tagged with that. It doesn't work on other models.
SD's dataset is scraped off the internet, so something that might work is negative keywords associated with websites whose images you don't like - like "zillow", "clipart", "coindesk", etc.
Or try clip-interrogator or textual inversion against bad-looking images (but IMO clip-interrogator doesn't work very well yet either).
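Not from the original comment, but a minimal sketch of how negative prompts are actually passed, assuming the Hugging Face diffusers library and the stabilityai/stable-diffusion-2-1 checkpoint as an example:

    import torch
    from diffusers import StableDiffusionPipeline

    # Example checkpoint; any diffusers-compatible SD model id works the same way.
    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    ).to("cuda")

    image = pipe(
        prompt="a watercolor painting of a lighthouse at dawn",
        # Website-style negative keywords, per the comment above.
        negative_prompt="zillow, clipart, coindesk, watermark, low quality",
        num_inference_steps=30,
        guidance_scale=7.5,
    ).images[0]

    image.save("lighthouse.png")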
astrange t1_iwyimtj wrote
Reply to comment by nickstatus in Meta has withdrawn its Galactica AI, only 3 days after its release, following intense criticism. Meta’s misstep—and its hubris—show once again that Big Tech has a blind spot about the severe limitations of large language models in AI. by lughnasadh
Humans do have some instinctive knowledge. The instinctive fear of snakes and spiders, sexual attraction, etc, all rely on recognizing sense data without learning anything first.
astrange t1_irr0xth wrote
Reply to comment by RBUexiste-RBUya in What counts as "observation" in quantumn theory? by Iron_Rod_Stewart
The “spooky action” (instantaneous collapse of the wavefunction) is part of the Copenhagen interpretation of quantum physics, but it isn't proven to exist; that's just one interpretation.
There are other interpretations that are still valid (many-worlds, superdeterminism, pilot wave) and don't include it, but of course many of those can't be falsified.
astrange t1_irpk8s2 wrote
Reply to comment by scraper01 in [D] What kind of mental framework/thought process the researchers have when working on solving/proving the math of the new algorithms? by aviisu
Do you have specific examples?
It's obviously true that diffusion models don't work for the reasons they were originally thought to; the cold diffusion paper shows this. Also, the Stable Diffusion explainers I've seen describe it as pixel diffusion even though it's actually latent diffusion. And I'm not sure I understand why latent diffusion works.
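Not from the original comment, but to make the pixel-vs-latent distinction concrete, here's a rough sketch; it assumes the diffusers library, and the stabilityai/sd-vae-ft-mse checkpoint and photo.png are only illustrative stand-ins:

    import numpy as np
    import torch
    from PIL import Image
    from diffusers import AutoencoderKL

    # Standalone SD VAE; the exact checkpoint id is only an example.
    vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

    img = Image.open("photo.png").convert("RGB").resize((512, 512))
    x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0  # scale to [-1, 1]
    x = x.permute(2, 0, 1).unsqueeze(0)                         # (1, 3, 512, 512)

    # Pixel diffusion would add noise to x itself. Latent diffusion (what SD does)
    # first encodes to a much smaller latent and runs the denoising process there.
    with torch.no_grad():
        z = vae.encode(x).latent_dist.sample() * 0.18215  # (1, 4, 64, 64)

    alpha_bar = 0.5  # stand-in for the scheduler's cumulative alpha at some timestep
    noise = torch.randn_like(z)
    z_noisy = alpha_bar ** 0.5 * z + (1 - alpha_bar) ** 0.5 * noise

    print(x.shape, z.shape)

The denoising UNet only ever sees z_noisy; pixels come back at the end through vae.decode.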
astrange t1_jdy6d4f wrote
Reply to comment by Rioghasarig in [D]GPT-4 might be able to tell you if it hallucinated by Cool_Abbreviations_9
But nobody uses the base model, and when they did use it, it was only interesting because it fails to predict the next word and therefore generates new text. A model that always successfully predicted the next word of existing text would be overfitting, since it would only reproduce things you already have.