ThatInternetGuy t1_jcs253z wrote

You know the ChatGPT and GPT-4 licenses forbid using their output data to train competing AI models. What Stanford did was show a proof of concept for their paper, not open-source the model at all.

25

frownGuy12 t1_jcsfnh7 wrote

If OpenAI wants people to respect their IP they should take the word “open” out of their name. They scraped our data to train their models after all, it’s not like OpenAI themselves aren’t pushing the boundaries of what’s acceptable when it comes to copyright law.

Legally it’s questionable, but ethically speaking I think it’s a fine idea.

53

throwaway957280 t1_jcsjj07 wrote

Is OpenAI actually legally allowed to do that? How is using their model for training different from training on copyrighted data which all these models do?

19

Anjz t1_jcsktsf wrote

It's probably untested in court. There are so many loopholes and variables too: what's considered a competing AI model? Companies usually just spew a bunch of stuff in their terms of use, some of which has no legal basis.

19

kex t1_jcsm7kh wrote

I'd say enjoy it while it lasts, at the very least

6

hughperman t1_jcswzfh wrote

Train a model that's designated as non-competing but open, then train another model from the output of that that's competing.

4

starstruckmon t1_jct0s11 wrote

They are. It's less to do with copyright and more to do with the fact that you signed the T&C before using their system (and then broke them). It's similar to the LinkedIn data scraping case, where the court ruled that it wasn't illegal for them to scrape (nor did it violate copyright), but they still got in trouble (and had to settle) because of violating the T&C.

One way around this is to have two parties: one generating and publishing the dataset (which doesn't violate the T&C) and another independent party (who didn't sign the T&C) fine-tuning a model on the dataset.

6

RoyalCities t1_jctcu1m wrote

Couldn't it be possible to set up a large community Q/A repository then? Just crowdsource whatever it outputs and document it collectively.

2

bitchslayer78 t1_jcsz4s3 wrote

No they aren’t. They have no claim on transformers; that would be Google Brain, but you don’t see Alphabet throwing a sissy fit.

1

yaosio t1_jcsob5z wrote

The output of AI can't be copyrighted so OpenAI has no say in what somebody does with the output.

10

lxe t1_jcsqk7t wrote

Copyright and license terms are different things.

3

yaosio t1_jcsqxwf wrote

It doesn't matter what the license terms say if they can't be enforced.

9

Uptown-Dog t1_jct32n7 wrote

I think you'd be dismayed at how easy it is to enforce these things when you have OpenAI money.

1

objectdisorienting t1_jcsu3xk wrote

Will be interesting to see where lawmakers and courts ultimately land on this, but the current status quo is that AI generated text and images (or any other works) cannot be copyrighted. In other words for now all output is public domain and OpenAI can kick rocks on this. A TOS violation just means you might get banned from using their service lol.

1

VertexMachine t1_jct3b51 wrote

It's most likely enforceable, but even if it's not, they can simply ban OP for doing that. If OP is using their API in any way that's important to him, it's something to consider.

1