bairbs t1_j9agwyq wrote on February 20, 2023 at 2:25 PM

Reply to comment by gurenkagurenda in OpenAI Is Faulted by Media for Using Articles to Train ChatGPT by Tough_Gadfly

Exactly. This is what big tech has been doing already to create legal and ethical data.

The training data is the bottleneck. OpenAI is trying to see if they can pull a fast one by releasing models trained on copyrighted material.

gurenkagurenda t1_j9amk4h wrote on February 20, 2023 at 3:08 PM

They’re not “pulling a fast one”. There’s no precedent here, and there’s a boatload of lawyers who agree that this is fair use. There are also a number who believe that it won’t be. The courts will have to figure it out, but until then, nobody knows how it will play out.

bairbs t1_j9ao00n wrote on February 20, 2023 at 3:18 PM

They actually are. The precedent has been to use public domain material (which is why there are so many fine art style GANs), create your own data, pay for data to be created, pay for existing data, or keep the models private. There are far more artists and other workers than lawyers who know this isn't fair use and who will be negatively impacted if these companies are allowed to continue this practice.

gurenkagurenda t1_j9av16n wrote on February 20, 2023 at 4:07 PM

That's not what I mean by precedent. I mean that there is no legal precedent.

bairbs t1_j9aygil wrote on February 20, 2023 at 4:30 PM

Lol, if you think these huge companies don't have teams of lawyers advising them on how to legally create models, you're nuts. OpenAI has everything to gain and nothing to lose by trying to challenge the precedents that are already set.

But keep doing your own research. Maybe they'll hire you (or maybe they already have).

gurenkagurenda t1_j9b8gzz wrote on February 20, 2023 at 5:35 PM

> OpenAI has everything to gain and nothing to lose by trying to challenge the precedents that are already set.

Please cite the case that you're talking about which you claim sets this precedent. Thanks.