BitterAd9531

BitterAd9531 t1_j5idapl wrote

>trivially obvious that AI should never be open-source

Wow. Trivially obvious? I'd very much like to know how that statement is trivially obvious, because it goes against what pretty much every single expert in this field advocates.

Obviously open-source AI brings problems, but what is the alternative? A single entity controlling one of the most disruptive technologies ever? And ignoring for a second the obvious problems with that, how would you enforce it? Criminalize open-sourcing software? Can't say I'm a fan of this line of thinking.

5

BitterAd9531 t1_j5g52os wrote

>Besides that, OP stated that he wants to use a llm for this, not me.

Actually, I didn't. If you read my comment, you'd understand I would need the LLM to demonstrate the model that does the actual combining (which obviously wouldn't be an LLM). Seeing as there are currently no models that have watermarking, I'd have to write one myself to test the actual model that does the combining to circumvent the watermark. Either you didn't understand this, or you're once again taking single sentences out of context and making semi-valid points that have no relevance to the original discussion.

But honestly, I feel like this is completely beside the point. I've given you a high-level explanation of how these watermarks can be defeated, and you seem to be the only one who does not understand how they work.

4

BitterAd9531 t1_j5fcby3 wrote

>If you think, you can take two watermarked LLMs and 'trivially" combine their output as you stated, explain in detail how you do that in an automated way.

No thank you, I'm not going to write an LLM from scratch for a Reddit argument. And FWIW, I suspect that even if I did, you'd find some way to convince yourself that you're not wrong. You not understanding how this works doesn't impact me nearly enough to care that much. Have a good one.

18

BitterAd9531 t1_j5fal5s wrote

>no one seems to be even considering dealing with it in a serious way

Everyone has considered dealing with it, but everyone who understands the technology behind these models also knows that it's futile in the long term. The whole point of these LLMs is to mimic human writing as closely as possible, and the more they succeed, the more difficult detection becomes. They can be used to output text that is both more precise and more varied.

Countermeasures like watermarks will be trivial to circumvent while at the same time restricting the capabilities and performance of these models. And that's ignoring the elephant in the room, which is that once open-source models come out, it won't matter at all.

>this is the most pressing ethical issue in AI safety today

Why? It's long been known that the gap between AI and human capabilities will shrink over time. This is simply the direction we're going. Maybe it's time to adapt instead of trying to fight something inevitable. Fighting technological progress has never worked before.

People banking on being able to distinguish between AI and humans will be in for a bad time in the coming few years.

42

BitterAd9531 t1_j5f5olr wrote

I think you're misunderstanding how these watermarks work. The watermark is encoded in the choice of tokens, so combining or rewriting will weaken it to the point where it can no longer be used for accurate detection. "Robust" means a few tokens can be changed, but changing enough tokens will have an impact eventually.

The semantics don't change because in language, there are multiple ways to describe the same thing without using the same (order of) words. That's literally what "rewriting" means.
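To make that concrete, here's a minimal, hypothetical sketch of the kind of token-level watermark being discussed (loosely in the spirit of published "green-list" schemes; the hashing and vocabulary are my own toy stand-ins, not any real implementation). A detector counts how many tokens fall on a pseudorandom "green" list; rewriting swaps green tokens for off-list synonyms and drags the score back toward chance.

```python
import hashlib

# Toy "green-list" watermark detector (illustrative only; real schemes
# seed the green list from preceding tokens and use a statistical test).
def is_green(token: str) -> bool:
    # Deterministically assign ~half of all tokens to the green list.
    return int(hashlib.sha256(token.encode()).hexdigest(), 16) % 2 == 0

def green_fraction(text: str) -> float:
    tokens = text.split()
    return sum(is_green(t) for t in tokens) / len(tokens)

# A watermarking sampler prefers green tokens; a human rewrite swaps in
# arbitrary synonyms, pulling the score back toward the ~0.5 expected
# by chance, below any detection threshold.
vocab = [f"word{i}" for i in range(100)]      # stand-in vocabulary
green = [w for w in vocab if is_green(w)]
red = [w for w in vocab if not is_green(w)]

watermarked = " ".join(green[:20])            # every token on the list
rewritten = " ".join(green[:10] + red[:10])   # half the tokens replaced
```

Here `green_fraction(watermarked)` is 1.0 while the half-rewritten text scores 0.5 — in a real rewrite the semantics would be unchanged, but the statistical signal is gone.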

21

BitterAd9531 t1_j5erse4 wrote

Won't work in the long term. OpenAI might have been the first to release, but we know other companies have better LLMs and others will catch up soon. When that happens, models without watermarks will be released, and people who want output without a watermark will use those models.

And even if you somehow force all of them to implement a watermark, it would be trivial to combine outputs of different models to circumvent it. Not to mention that slight rewrites by a human would probably break most watermarks, the same way they break the current GPT detectors.
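As a hypothetical illustration of that combining attack (the function and splitting logic are my own sketch, not an actual tool): interleaving sentences from two differently watermarked outputs means neither model's statistical signature dominates the final text.

```python
import re

def combine(output_a: str, output_b: str) -> str:
    """Interleave sentences from two models' outputs so that no single
    model's watermark dominates the combined text."""
    # Split on whitespace that follows sentence-ending punctuation.
    sents_a = re.split(r"(?<=[.!?])\s+", output_a.strip())
    sents_b = re.split(r"(?<=[.!?])\s+", output_b.strip())
    merged = []
    for i in range(max(len(sents_a), len(sents_b))):
        if i < len(sents_a):
            merged.append(sents_a[i])
        if i < len(sents_b):
            merged.append(sents_b[i])
    return " ".join(merged)

# combine("A one. A two.", "B one. B two.")
# -> "A one. B one. A two. B two."
```

Each model's watermark now only covers half the tokens, which (as with a human rewrite) pushes its detection score toward chance.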

158

BitterAd9531 t1_j57z7zd wrote

The Chinese Room is once again one of those experiments that sounds really good in theory but has no practical use whatsoever. It doesn't matter whether the AI "understands" or not if you can no longer tell the difference.

It's similar to the "feeling emotions vs emulating emotions" or "being conscious vs acting conscious" discussion. As long as we don't have a proper definition for them, much less a way to test them, the difference doesn't matter in practice.

10

BitterAd9531 t1_j41gjo4 wrote

Ah, my bad. I think you could make it a bit clearer in your post, but it's definitely on me for misunderstanding. If the information about the residence is given in the document itself, then it becomes a lot more doable.

I still see quite a few problems, such as the neighbourhood and similar factors influencing the price, which means you'd need an absolutely huge dataset with very detailed features. And even then, I think the accuracy still won't be optimal. Then there's still the issue of scraping competitors' data from their websites, which I doubt is legal.

It really depends on what this will be used for. Want to use it to recommend houses to potential buyers in a certain price range? Absolutely doable, but it seems like complete overkill for an application like that. Want to use it to replace humans whose job it is to give price estimations? Probably not a good idea.

1

BitterAd9531 t1_j411ihw wrote

I'm not even convinced it's possible given the requirements. You're not going to get structured data, just pictures of the outside and inside of the house, I assume. How are you going to reliably estimate livable space, current state, or even the number of rooms when not all rooms might even be properly pictured? You're banking on extracting these features from what I assume to be suboptimal images with high accuracy (very doubtful, tbh) and then estimating the price based on those features, which is useless if the features aren't extracted properly in the first place.
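The compounding-error problem can be made concrete with some hypothetical numbers (these are illustrative, not measured): even if each individual feature is extracted correctly 90% of the time, the chance that every feature feeding the price model is right drops off fast.

```python
# Hypothetical per-feature extraction accuracy; not measured numbers.
per_feature_accuracy = 0.90
n_features = 5  # e.g. rooms, livable space, state, garden, garage

# Assuming roughly independent errors, the probability that all
# features for a given listing are extracted correctly:
p_all_correct = per_feature_accuracy ** n_features
print(round(p_all_correct, 3))  # about 0.59
```

So even a fairly strong vision pipeline would feed the price model fully correct features for only around 6 in 10 listings, and the regression inherits every one of those errors.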

Even if this were possible with high enough accuracy, the dataset you'd need for it has to be absolutely huge. I really don't believe someone can gather enough data in 6 months while simultaneously developing the NN.

And then we're not even talking about the legality of scraping competitors' websites to compare against.

I'm not convinced I could do this in 6 months and I wouldn't do it for that price.

2