13metalmilitia t1_j6fldte wrote on January 30, 2023 at 12:00 AM

Those are all Microsoft basically. Lol. That’s like saying gm, Chevrolet, Cadillac.

HeavensCriedBlood t1_j6fvw9h wrote on January 30, 2023 at 1:12 AM

Microsoft isn't the parent organization of OpenAI

tylerr514 t1_j6fz9l3 wrote on January 30, 2023 at 1:36 AM

not yet, at least on paper.

t4ct1c4l_j0k3r t1_j6gueuc wrote on January 30, 2023 at 5:35 AM

There is certainly enough financial stake

AadamAtomic t1_j6gvglz wrote on January 30, 2023 at 5:46 AM

Oh no! Microsoft's unlimited wallet that runs half the digital world!

There is no financial stake. Lol

Giblet15 t1_j6hhr11 wrote on January 30, 2023 at 10:41 AM

I'll just leave this here.

https://blogs.microsoft.com/blog/2023/01/23/microsoftandopenaiextendpartnership/

[deleted] t1_j6ilvr0 wrote on January 30, 2023 at 4:28 PM

[removed]

AadamAtomic t1_j6hi1n6 wrote on January 30, 2023 at 10:45 AM

>I'll just leave this here.

That's exactly my point. That's literally pocket lint in the eyes of Microsoft, and you're all calling it a "financial stake." Lol

It just goes to show that people have no fucking clue the amount of money these corporations have, because it's literally unfathomable to your human brain.

Giblet15 t1_j6hj1l6 wrote on January 30, 2023 at 10:59 AM

Why does it matter how much they have when defining whether they have a financial stake or not?

Whether it's $10 or $10B, Microsoft has invested in OpenAI. They by definition have a financial stake.

$10B is also not a negligible amount to Microsoft. On their most recent 10-Q they had about $100B in cash. This investment was 10% of the cash on hand.

AadamAtomic t1_j6hjhxe wrote on January 30, 2023 at 11:05 AM

>Why does it matter how much they have when defining whether they have a financial stake or not?

Because nothing is at stake. It's literally an investment.

If OpenA.I failed miserably, Microsoft would lose practically nothing and forget it ever happened just like their cell phones. Business as usual.

>$10B is also not a negligible amount to Microsoft. On their most recent 10-Q they had about $100B in cash.

My dude.... $100B in cash IS WORTH WAYYYY MORE than credit. This is not the amount of money they have or made. This is simply their Liquidity for throwing at random bullshit like Open.AI.

To Microsoft, it's the equivalent of buying a happy meal from McDonald's. Do you consider that a financial Stake?

Loud-Path t1_j6jfax2 wrote on January 30, 2023 at 7:32 PM

The amount is not relevant. The stake is, if you bothered to read the agreement, that 75% of all profits MUST go back to Microsoft until it is paid back, and then Microsoft owns 49% of the company. I.e. they are de facto a subdivision of Microsoft and are going to be at MS's beck and call. The bank I work for has 49% owned by a single person, guess what, even though he is not a majority stock holder he is considered the owner as he can easily get 2% of other people's votes to go his way.

AadamAtomic t1_j6kev0h wrote on January 30, 2023 at 11:16 PM

>The bank I work for has 49% owned by a single person, guess what, even though he is not a majority stock holder he is considered the owner as he can easily get 2% of other people's votes to go his way.

Sounds like a shit bank where the guy literally cannot leave without destroying them. If he dies, and the inheritance is split up, that bank is out of business. Lol

>The amount is not relevant.

*Goes on a random tangent about how much some rando person has in a bank*

nyaaaa t1_j6i44zq wrote on January 30, 2023 at 2:27 PM

One billion a few years ago, 10 billion a few weeks ago.

And the one with first dibs on $$

> Microsoft will reportedly get a 75% share of OpenAI’s profits until it makes back the money on its investment, after which the company would assume a 49% stake in OpenAI.

fd4e56bc1f2d5c01653c t1_j6hx92w wrote on January 30, 2023 at 1:32 PM

Point is that Microsoft is an OpenAI investor

CarlMarcks t1_j6hmebu wrote on January 30, 2023 at 11:42 AM

It’s the Spider-Man meme all over again

13metalmilitia t1_j6htupa wrote on January 30, 2023 at 1:01 PM

Bahahaha. Thank you for the chuckle!

iKnowNoBetter t1_j6fmcyy wrote on January 30, 2023 at 12:07 AM

Companies with billions invested in AI want no legal cases against AI, and thus their investments.

I'm shocked.

HeavensCriedBlood t1_j6fw2xd wrote on January 30, 2023 at 1:13 AM

Most people shouldn't be shocked by this, but it's still newsworthy nonetheless.

Malbranch t1_j6inoqu wrote on January 30, 2023 at 4:40 PM

Worse than that, this lawsuit is basically bullshit. Instead of manually going through publicly available, OPEN SOURCE code, they automated it, and taught an AI to suggest code snippets that you could, with more difficulty, just research and put together yourself.

Like, I've written a fair amount of code, I've pieced together bits of other code from open source that does what I need into code I've written. According to them, what I've done is piracy. According to open source, that's impossible, you can't pirate open source code. You publish open source knowing that the source code is free game for anyone to use, and you have no commercial claim to it, nor does anyone that uses it.

These asshats are trying to outlaw code snippets. It's idiotic.

SarahVeraVicky t1_j6j470q wrote on January 30, 2023 at 6:23 PM

> According to open source, that's impossible, you can't pirate open source code

I would assume pirating open source code would be using the code against its licensing. Yeah, I know, it's weird, but open source code in some cases (like GPL licensed code) can't just be added to a product and compiled without additional steps. If the open-source license used explicitly states you have to give the same license and rights to open source the code to other people and you commercially closed-source it, it would be an issue.

Since this removes the whole "show license before giving code", well... I could see a reason for a lawsuit being problematic to some. Who knows, most people would rather just take the code and use it, rather than deal with respecting copyrights/copyleft licenses.

Malbranch t1_j6npeyt wrote on January 31, 2023 at 5:01 PM

To my understanding though, you generally only have to do that when incorporating an application or complete piece of code like a module, function, etc? Am I off base?

divenorth t1_j6j9qq6 wrote on January 30, 2023 at 6:57 PM

There are multiple different open source licenses. What you said applies to MIT and similar licenses but not GPL. If you use any GPL 3 code all your code needs to be licensed as GPL 3. This allows devs to open source the code and still force money hungry corporations to purchase a different license.

DDoubleIntLong t1_j6g60nu wrote on January 30, 2023 at 2:23 AM

People who develop AI and actually understand the technology want the cases dropped because they're not logical, just based on public backlash due to fear of being replaced.

skychasezone t1_j6g9qwi wrote on January 30, 2023 at 2:48 AM

what's not logical

AadamAtomic t1_j6gvo7w wrote on January 30, 2023 at 5:48 AM

*Tech illiterate dummies didn't like that

[deleted] t1_j6hmteg wrote on January 30, 2023 at 11:46 AM

[removed]

MacDegger t1_j6g3jq4 wrote on January 30, 2023 at 2:06 AM

GitHub is owned by MS. MS just bought into OpenAI for 10 billion.

So ... basically Microsoft is asking this.

gurenkagurenda t1_j6i158y wrote on January 30, 2023 at 2:04 PM

They’re the defendants in the lawsuit. They’re the only ones who can do this.

MacDegger t1_j6ngq3g wrote on January 31, 2023 at 4:07 PM

I'm just saying 'they' aren't multiple entities: 'they' are Microsoft.

AadamAtomic t1_j6gwd14 wrote on January 30, 2023 at 5:56 AM

Let's talk about META's A.I and how you don't own your Instagram photos, shall we?

What if META sells all the art Instagram photos to OpenAI for training, fair and legally?

Everyone is mad at Open A.I for nothing.

Dummies are so blinded by fear and anger they are hurting themselves in confusion and attacking progress instead of how data is ethically farmed and sold by other companies.

Attacking openA.I won't help ANYONE. It won't stop Google and Facebook from selling your data.

-bickd- t1_j6i9s1v wrote on January 30, 2023 at 3:08 PM

Then you should be able to sue Meta/ Google for a fair share of when your art is used for profit. Why not?

Is it like the 'what if your Party congressman is involved in sexual assault' kinda thing? Am I supposed to vehemently defend my favourite tech company? Fuck no. Arrest them all. Enforce for all companies not paying their fair share.

[deleted] t1_j6ic0fo wrote on January 30, 2023 at 3:23 PM

[removed]

Malbranch t1_j6io7cs wrote on January 30, 2023 at 4:43 PM

Except that in this case, you're trying to claim ownership of something like a stock photo. Anyone can use the stock photo. Open source is "open" for anyone to use the "source" code.

MacDegger t1_j6ngx6b wrote on January 31, 2023 at 4:08 PM

> Anyone can use the stock photo.

No, they can't. They pay Shutterstock or Getty Images and those pay the original creators (or have paid them).

OfCourse4726 t1_j6g09lb wrote on January 30, 2023 at 1:43 AM

i feel like works produced by ai probably can not be copyrighted. this is simply because if ai can copyright, companies could simply produce endless works by ai. then eventually they'd cover so much that no one else can produce anything new.

SeaweedSorcerer t1_j6g7zhr wrote on January 30, 2023 at 2:36 AM

This case asks the opposite question: can you freely use other people’s copyrighted content to train your AI?

ostrichpickle t1_j6gbyj0 wrote on January 30, 2023 at 3:03 AM

If A.I. can't use others copyrighted work to learn and train, why can people?

People do the same thing, learn off others and emulate other artists to learn. So does that make their art invalid too?

josefx t1_j6h1xav wrote on January 30, 2023 at 7:00 AM

At least Microsoft copilot has been caught reproducing large sections of code verbatim. Try selling a book that contains copies of Disney products and see how that turns out.

cabose7 t1_j6gme0g wrote on January 30, 2023 at 4:20 AM

Commercial software is not a person

[deleted] t1_j6gmig5 wrote on January 30, 2023 at 4:21 AM

[deleted]

cabose7 t1_j6gmwxv wrote on January 30, 2023 at 4:25 AM

OK, but this lawsuit is not targeted at random users

Valiantheart t1_j6hpfqm wrote on January 30, 2023 at 12:16 PM

Careful there. You might set off Mitt Romney's radar

Ronny_Jotten t1_j6hx2hb wrote on January 30, 2023 at 1:30 PM

> If A.I. can't use others copyrighted work to learn and train, why can people?

But it is allowed to use copyrighted works to train an AI - as long as it constitutes fair use. What's probably not fair use though, is to sell or flood the market with cheap works produced by a machine, if it negatively impacts the market for the original works it's trained on. Copyright laws make a distinction between humans and machines, because they're not the same thing. For example, works created solely by non-humans, whether a machine or a monkey, can't be copyrighted. According to the US copyright office, it requires "the nexus between the human mind and creative expression".

SeaweedSorcerer t1_j6gf8j5 wrote on January 30, 2023 at 3:24 AM

One reason is that AI training is done by copying the training data to hundreds or even thousands of training nodes. It’s near to creating a book of every painting and giving that book to every person learning art without compensating or even crediting the artists who have art in that book.

Another reason is trained AIs have inhuman memories and their models spit out the original art, in some cases near verbatim. You can look at it as compressing the data. Usually highly lossy compression but not always. And courts have shown it is clearly piracy to copy differently compressed movies/music/etc.

CallFromMargin t1_j6gmmbk wrote on January 30, 2023 at 4:22 AM

Well, that's a whole load of bullshit.

IAmDrNoLife t1_j6gxfuq wrote on January 30, 2023 at 6:08 AM

Exactly, because it's not true.

Machine Learning (or rather, Deep Learning and Neural Networks) do not "compress the data". They analyse data. They don't store any original art used in the training (otherwise, the size of these models would be in the thousands of terabytes. Instead we see them being a few gigabytes).

Furthermore, these models do not replicate the art it has been trained on. Every single piece of art generated by AI, is something entirely new. Something that has never been seen before. You can debate if it takes skill, but you can't debate that it's something new.

This video is an excellent source of information regarding this topic. It's created by a professional artist who has embraced AI generated art as a source of inspiration and a way to speed up their own work.

Even furthermore, courts have indeed shown previously that Google IS allowed to data mine a bunch of data, and use this. Google has their "Google Books", which is a record of an enormous amount of books, which has been done via data mining - of course, there's a difference between the Google Books project and AI art models, due to the end result (one is a collection of existing stuff, and the other is one that can create new stuff). But the focus here was on the data mining.

One thing that a lot of people don't seem to know: You do not own a style. You cannot copyright a style. There have been a lot of artists that complain because "it's possible for people to just mimic my work". But yes, that is true, but it has always been true - simply because you do not own "your" style. People have always been able to go to another person and say "please make some art, in the style of this person". You have copyright for individual piece of art, but not the general style that you use to create said art.

Here comes my own personal opinion:

Tools using AI are the future. People are not going to lose their jobs because an AI makes them obsolete - people are going to lose their jobs if they refuse to use AI to improve their workload.

Take software development. These models can generate code from the bottom to an insane degree of detail. You no longer have to spend time on all the boring stuff, actually writing the code, you can focus on the problemsolving. The same goes for art: with AI tools, you get to skip the boring monotonous part of your workload, and you can focus on the parts that actually mean something.

CallFromMargin t1_j6gxzgp wrote on January 30, 2023 at 6:14 AM

The "they re-create art" argument comes from a paper that is widely shared on Reddit. Thing is, that paper itself mentions that the researchers trained their own models on small datasets, ranging from 300 pictures to a few thousand, and they started seeing novel results at 1000 images.

Also current bots can't generate good code, not yet, but they have their own usage. As an example, a client I recently had asked me to design a patching system (small shop, with 100 or so servers, they had no use for automated patching up to now), and some simple automation. You know, the type of weekend jobs you do to earn some extra cash. Well, since they are using Azure, I went with Azure Automation, but I had no idea how it works. Well, ChatGPT told me how it works, in detail, gave me some code that might work, etc. But the most important thing by far was the high level overview, it saved me hours of reading documentation. This shit is the future, but not how you might expect it to be.

Ronny_Jotten t1_j6i3uog wrote on January 30, 2023 at 2:25 PM

I don't know what paper you're referring to, but there's this one:

Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

It clearly shows, at the top of the first page, the full Stable Diffusion model, trained on billions of LAION images, replicating images that are clearly "substantially similar" copyright violations of its training data. The paper cites several other papers regarding the ability of large models to memorize their inputs.

It may be possible to tweak the generation algorithm to no longer output such similar images, but it's clear that they are still present in the trained model network.

Mr_ToDo t1_j6j481z wrote on January 30, 2023 at 6:23 PM

Well, they did both in that paper. But it would be interesting to know what the ones at the top were from. I know that there's one I saw further down in high hit percents further down but with as nice as they are I don't know why the rest don't if they belong to that model.

Ronny_Jotten t1_j6kjrlv wrote on January 30, 2023 at 11:50 PM

The paper explains what the ones at the top were from. It's using Stable Diffusion 1.4. See page 7: Case Study: Stable Diffusion, page 14: C. Stable Diffusion settings, and page 15 for the prompts and match captions. Sorry, the rest of your comment is incomprehensible to me...

Mr_ToDo t1_j6mwtay wrote on January 31, 2023 at 1:50 PM

OK that's on me. I hit the references and somehow thought I was done with the paper, I didn't think they would have the captions they used underneath that. I admit that was on my bad due diligence. Apologies

Ronny_Jotten t1_j6hpnnj wrote on January 30, 2023 at 12:19 PM

> They don't store any original art used in the training [...] these models do not replicate the art it has been trained on. Every single piece of art generated by AI, is something entirely new. Something that has never been seen before. You can debate if it takes skill, but you can't debate that it's something new

They can very easily reproduce images and text that are substantially similar to the training input, to the extent that it is clearly a copyright violation.

Image-generating AI can copy and paste from training data, raising IP concerns | TechCrunch

> courts have indeed shown previously that Google IS allowed to data mine a bunch of data [...] there's a difference [...] But the focus here was on the data mining.

In the case of the Google Books search product, the scanning of copyrighted works ("data mining") was found to be fair use. That absolutely does not mean that all data mining is fair use. Importantly, it was found that it had no economic impact on the market for the actual books, it did not replace the books. In order for the code/text/image AI generators' "data mining" of copyrighted works to be fair use, it will also have to meet that test. Otherwise, the mining is a copyright violation.

[deleted] t1_j6gku3s wrote on January 30, 2023 at 4:07 AM

[deleted]

BastardStoleMyName t1_j6iy5c3 wrote on January 30, 2023 at 5:45 PM

This is the debate of the human vs computational divide at the very beginning. There are few ways to have this debate without it being philosophical.

There is not a human that is able to analyze and retain data the same way a computer can. Human memory is flawed and made efficient. When we view something, we don’t download it or literally transfer data to ourselves. Every part of the experience is an interpretation from external to internal.

As of this point a copy of an image, that would fall under copyright, has to be transferred to a system, to then be interpreted with a process that dictates how many samples to take of an object.

These systems can’t accept usage terms themselves to view a file or an artwork, and they aren’t being brought to a gallery with the approval of the owners to view and scan the images themselves. If people were paid to create images in the style of someone else, they would be pulling from their brain’s interpretation and, by nature, flawed memory storage to interpret that.

This copyright case is honestly one of the first major stepping stones and will be a reference for how we classify AI in the future and a precedent for how we legally allow its use. Which is something we will have to face one day, just like every SciFi novel has warned us. But how and when we make that determination, and at what stage, is going to be important. At this stage I would say if the system cannot legally accept the usage terms to an image, then it isn’t allowed to use those images in any manner.

From a current legal standpoint, we have currently decided that AI does not have any right to claim copyright on what it creates, and the AI creator has no right to claim the output. Then following that thought, it is not in a position to be able to use copyright covered material as the owner cannot accept the terms on the AI’s behalf and the AI cannot accept them on its own. This has been decided in reverse already.

Further it’s my opinion that AI should be restricted to single tasks and segmented. If an AI creates writing prompts, then that’s all it can do and all it can be fed with; if an AI writes code, then that’s all it should return and all it can be trained on.

For a point of future reference. It’s not about what determination gets made for AI in the long run, but how we are prepared to use and understand it now. AI created now is purely a tool for operator and consumer use.

lethal_moustache t1_j6glgvi wrote on January 30, 2023 at 4:12 AM

The art isn't invalid. It may, however, infringe copyright and make the artist subject to damages.

ostrichpickle t1_j6gm2fr wrote on January 30, 2023 at 4:17 AM

Every artist ever... learnt off other artists.. so.....

techimp t1_j6gogaa wrote on January 30, 2023 at 4:38 AM

While it may be true that new artists learn from the old, there is something intrinsically different in an homage, a cover and a new original work. 2 of those are allowed for artists without restrictions, the last (cover) has specific rules on how the copyright is handled (recording the work is one of those items the band can't do, but in theory a fan could). AI does not distinguish this. Its rough approximation of an answer often has either not enough originality or something in uncanny valley territory or weirdness.

That's what is being debated. It IS a conversation worth having since laws will always be on the back foot in regards to tech, privacy and rights.

lethal_moustache t1_j6jh0s2 wrote on January 30, 2023 at 7:42 PM

An AI is not a real person nor an artist.

AuthorNathanHGreen t1_j6gg5hf wrote on January 30, 2023 at 3:31 AM

When I posted a story online for free I did so because I thought real humans could read it, and perhaps decide they wanted to buy my longer works if they liked it. I understood that someone might read it and not like it, like it but be too cheap to buy paid work, or perhaps read it and use it to study writing techniques I used. I did not however post it thinking an AI might be training itself (with no hope of me getting compensation out of the deal) so that it could further dilute the market for writing.

Don't I have a right that my content not be used in a manner I couldn't anticipate or prevent?

CallFromMargin t1_j6glixm wrote on January 30, 2023 at 4:13 AM

In that specific case, no. Fair use laws cover that, and Google vs author guild had solved that specific case in court. Using your work falls under fair use, just like human reading your work and incorporating ideas in his/her own work.

That said, if you wrote shit on the internet, let me assure you, it is almost useless for training a writing AI. Believe me, I tried to do it on a dataset of /r/writingprompts, the thing is that most writing there just sucks, which is not bad, as the only way of learning to write is by writing, thus putting bad work on the internet. It doesn't change the fact that it objectively sucks.

If I wanted to write an actual writing AI I would use a collection of classical works, works that stood the test of time, and frankly, the difference between those and what is put on the internet is often in how scenes and characters are fleshed out.

Ronny_Jotten t1_j6hrjni wrote on January 30, 2023 at 12:38 PM

> In that specific case, no. Fair use laws cover that, and Google vs author guild had solved that specific case in court. Using your work falls under fair use, just like human reading your work and incorporating ideas in his/her own work.

That's completely false. The Google case was found to be fair use, precisely because it did not "dilute the market for writing". That's one of the four legal tests for fair use. The judge said that it did not produce anything that competed economically in the market for the books that it scanned; on the contrary it might increase their sales. Whether such scanning is fair use, is determined on a case-by-case basis. If AIs are being used to produce "new" works that are sold commercially and undercut the authors of the originals that it's based on, it will be much more difficult to prove fair use.

Furthermore, the Copilot product creates a loophole where businesses can incorporate code released under e.g. a GPL license that requires said business to release its derivative works under the same open-source license, and make it closed-source instead. That can also create an unfair economic advantage in the market. These questions are far from "solved".

Doingitwronf t1_j6gpxd7 wrote on January 30, 2023 at 4:51 AM

I wonder what happens now that AIs can be instructed to produce works in the specific style of any author/artist whose works were supplied to the training set?

CallFromMargin t1_j6gwqup wrote on January 30, 2023 at 6:00 AM

What used to happen when you asked for a painting in style of X? The same thing is happening with AI art. It's literally the same thing.

Ronny_Jotten t1_j6hspu6 wrote on January 30, 2023 at 12:50 PM

It's literally not the same thing though, at least legally speaking. It's already accepted that a human looking at an artwork is not "making a copy", as defined in the copyright laws. As long as they don't produce a "substantially similar" work, there's no copyright violation. The same can't be said for scanning or digitally copying a work into a computer; that is "making a copy" that's covered by the copyright laws. In some cases, that can come under the "fair use" exemption. But not in all cases. It's evaluated on a case-by-case basis; in the US according to the four-part fair use test. For example, if it's found that the generated works have a negative economic impact on the value of the original works, there's a substantial chance that it won't be found to be fair use.

CallFromMargin t1_j6hvui0 wrote on January 30, 2023 at 1:20 PM

The computer is not storing a copy of original work in trained model. It looks at picture, it learns stuff from it and it stores only what it learns.

Your argument is based either on fundamental misconception on your part, or a flat out lie from you. Neither one casts you in good light

Ronny_Jotten t1_j6hzcpu wrote on January 30, 2023 at 1:50 PM

> The computer is not storing a copy of original work in trained model. It looks at picture, it learns stuff from it and it stores only what it learns.

Just because you anthropomorphize the computer as "looking at" and "learning stuff", doesn't mean it's not digitally copying and storing enough of the original work in a highly compressed form within the neural network to violate copyright by producing something "substantially similar": Image-generating AI can copy and paste from training data, raising IP concerns | TechCrunch

But regardless of whether it produces a "substantially similar" work as output, making a copy of the original copyrighted work into the computer in the first place is a required step in training the AI network. Doing so is only legally allowed if it's fair use. That was the question in the Google books case - it was found that the scanning of books was fair use, because Google didn't use it to create new books or otherwise economically damage the authors or the market for the original books. But that's not necessarily the case with all instances of making digital copies of copyrighted works.

> Your argument is based either on fundamental misconception on your part, or a flat out lie from you. Neither one casts you in good light

Well, you can fuck off with that, dude. There's no call for that kind of personal attack.

CallFromMargin t1_j6i4o2a wrote on January 30, 2023 at 2:31 PM

No, the fact that it's mathematically impossible to store that many images, and if done, this compression algorithm would violate laws of physics, means that it is not storing images.

It is impossible to compress 380tb of data to 0.04tb of data.
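For scale, the arithmetic behind those two figures can be sketched as below. The 380 TB and 0.04 TB values are taken from the comment as stated; the image count is a hypothetical placeholder (the thread only says "billions"), so the per-image number is illustrative only.

```python
# Rough arithmetic on the figures quoted above (decimal units: 1 TB = 1e12 bytes).
TRAINING_DATA_TB = 380.0  # figure quoted in the comment
MODEL_SIZE_TB = 0.04      # figure quoted in the comment

# ASSUMPTION: hypothetical image count, not stated anywhere in the thread.
ASSUMED_NUM_IMAGES = 2_000_000_000


def compression_ratio(data_tb: float, model_tb: float) -> float:
    """How many times larger the training data is than the model."""
    return data_tb / model_tb


def bytes_per_image(model_tb: float, num_images: int) -> float:
    """Average model bytes available per training image."""
    return model_tb * 1e12 / num_images


ratio = compression_ratio(TRAINING_DATA_TB, MODEL_SIZE_TB)
per_image = bytes_per_image(MODEL_SIZE_TB, ASSUMED_NUM_IMAGES)
print(f"ratio is roughly {ratio:,.0f} to 1")
print(f"roughly {per_image:.0f} bytes of model per image (under the assumed count)")
```

That works out to a ratio of about 9,500 to 1, and about 20 bytes per image under the assumed count. These are averages only; they do not by themselves show what happens for any individual image.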

Ronny_Jotten t1_j6i68gn wrote on January 30, 2023 at 2:43 PM

And yet, the citation I gave shows Stable Diffusion obviously replicating copyrighted images from the LAION training set, despite your musings about thermodynamics. It may not store reproducible representations of all images, I don't know - but it unquestionably does store some.

In any case, it doesn't change the fact that copying images into the computer in the first place, in order to train the model, would need to come under a fair use exemption. For example, research generally does - but not in every case, especially if it causes economic damage to the original authors. In many countries, authors also have moral rights, to attribution, to preservation of the integrity of their work against alteration that damages their reputation, etc., which may come into play.

[deleted] t1_j6ibove wrote on January 30, 2023 at 3:21 PM

[deleted]

SerenumUS t1_j6gwfjj wrote on January 30, 2023 at 5:56 AM

It doesn't matter. The work "produced" by AI is LITERALLY the result of using other people's content without consent. Therefore, all produced work is quite literally stolen portions of other people's work.

So yes, it should be considered copyright infringement because it's literally taking people's work and using it without their consent. I highly suggest you look up how machine learning works.

Therefore the "produced" content of the AI is not original or genuine, and is very limited by context. Not to mention, it is not able to produce a single, genuine piece of software that is hundreds of lines of code without it being a direct copy and paste of someone else's work.

poo2thegeek t1_j6h8w9u wrote on January 30, 2023 at 8:33 AM

I mean, yes an AI model learnt from other people’s examples, but is that also not what humans do?

Hmm_would_bang t1_j6i10j8 wrote on January 30, 2023 at 2:03 PM

Humans get inspired by their own perception and imperfect memories of other artists and experiences in their life, AI models literally take the art and add it to their model.

Regardless, you seem to be proposing we treat AI models as if they are human beings and not products. We aren’t going to do that. It’s a nice philosophical game maybe, but if you just look at the facts of the matter you’re dealing with a case of a company taking unlicensed artwork and adding it into their product.

poo2thegeek t1_j6i5xxj wrote on January 30, 2023 at 2:41 PM

AI models take the art, and add it to their training inputs.

It doesn't have perfect memory of the inputs - this can be demonstrated by the fact that model sizes are significantly smaller than the size of the data used to train them. Similarly, 'own perception' is an interesting idea. What does it actually mean? I'd argue that in an ML model, it could mean utilising some random input when training, to allow for different outputs for the same input (e.g., how ChatGPT can reply differently even if you ask it the exact same thing on two different occasions).

I'm not saying we should treat AI models as if they're human beings - I don't think an AI model should be able to hold a copyright for example, but the company that's trained that model should be able to.

Similarly, if the AI model were to output something VERY similar to some existing work, then I think that the company that owns said AI model should be taken to court.
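The ChatGPT example in this comment (different replies to the same question) is commonly explained by random sampling when the output is generated. A minimal sketch of that idea follows; it is not ChatGPT's actual implementation, and the scores, token names and `sample_token` helper are all made up for illustration.

```python
import math
import random
from typing import Dict


def sample_token(scores: Dict[str, float], temperature: float, rng: random.Random) -> str:
    """Draw one token from softmax(scores / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {tok: math.exp((s - top) / temperature) for tok, s in scores.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


# ASSUMPTION: made-up scores standing in for a model's output on one fixed prompt.
SCORES = {"yes": 2.0, "maybe": 1.5, "no": 1.0}

# Same "prompt" (same scores), different random seeds: the answers can differ.
draws = [sample_token(SCORES, 1.0, random.Random(seed)) for seed in range(200)]
print(sorted(set(draws)))
```

With a fixed seed the draw is repeatable; varying the seed is what produces different answers for an identical input.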

oscarhocklee t1_j6i3ohl wrote on January 30, 2023 at 2:24 PM

See, that's the thing. When humans copy work, we have laws that step in and allow the owner of the work to say "No, you can't do that". Humans could copy anything they see, but there are legal consequences if they copy the wrong thing - especially if they gain financially by doing so. This is very much an argument about whether what these tools are doing is sufficiently like what a human could do for the laws that apply to humans to apply.

If Copilot, for instance, generates code that (were a human to write it) would be legally considered (likely after a long and damaging lawsuit) to be a derived work of something licensed under the GPL, then that derived work must also legally be licensed under the GPL.

What's more, there is no clear authorial provenance. Say you find a github repo that contains what looks like a near-perfect copy of some code you own and which you released under a license of your choice. If a human wrote it, that's a legal issue.

Fundamentally, we're arguing here if it's okay in a situation like this to say "Oh, no, it's legal because software did it for me". And remember, there's no way to prove how much of a text file was written by a human and how much by software once it's saved.

poo2thegeek t1_j6i59po wrote on January 30, 2023 at 2:36 PM

So, while this is certainly true, for something to come under copyright it has to be pretty similar to whatever it's copying.

For example, if I want to write a book about wizards in the UK fighting some big bad guy, that doesn't mean I'm infringing on the copyright of Harry Potter.

Similarly, I can write a pop song that discusses, idk, how much I like girls with big asses, and that doesn't infringe on the copyright of the (hundreds) of songs on the same topic.

Now, I do think that if an AI model output something that was too similar to some of its training material, and the company that owned said AI went ahead and published it, then yeah the company should be sued for copyright infringement.

But, it is certainly possible for AI to output completely new things. Just look at the AI art that has been generated in recent months - it's certainly making new images based off what it's learnt a good image should look like.

Also, on top of all this, it's perfectly possible to ensure (or at least, massively decrease the probability of) not outputting something similar to its inputs, by 'punishing' the model if it ever outputs something too similar to training inputs.
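
One way to picture the 'punishing' idea is a penalty added to the training loss whenever an output is too close to a known training sample. The sketch below is only an assumption about how that could look: the function names, the 0.8 threshold and the weight are invented for illustration, and a real training setup would need a differentiable similarity measure or a filter applied at generation time instead.

```python
from difflib import SequenceMatcher


def max_similarity(output: str, training_samples: list[str]) -> float:
    """Highest character-level similarity between the output and any sample."""
    return max(
        (SequenceMatcher(None, output, sample).ratio() for sample in training_samples),
        default=0.0,
    )


def penalised_loss(
    base_loss: float,
    output: str,
    training_samples: list[str],
    threshold: float = 0.8,  # hypothetical cut-off for "too similar"
    weight: float = 10.0,    # hypothetical strength of the punishment
) -> float:
    """Add a penalty that grows once similarity exceeds the threshold."""
    excess = max(0.0, max_similarity(output, training_samples) - threshold)
    return base_loss + weight * excess


if __name__ == "__main__":
    samples = ["the quick brown fox jumps over the lazy dog"]
    print(penalised_loss(1.0, samples[0], samples))          # copied: penalised
    print(penalised_loss(1.0, "a new sentence", samples))    # original: no penalty
```

Outputs that stay below the threshold keep their original loss, so the model is only discouraged from near-copies rather than from anything that merely shares a topic.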

All this means that I don't think this issue is anywhere near as clear cut as a lot of the internet makes it out to be.

SerenumUS t1_j6ipa88 wrote on January 30, 2023 at 4:50 PM

The AI model, presumably machine learning, is not even remotely close to being "like a human". It's called "artificial intelligence" for a reason. The "training" data heavily influences the output.

If you made a machine learning model on a very small scale, such as putting 10 images from artists as its training data, the produced work would very obviously be just portions of the images you fed it. This is no different than what we are seeing now, just on a bigger scale. The output for the source code generation, or art generation, is quite literally using stolen portions of code/images.

I feel people are looking at the final outcome rather than how it got there.

This is the equivalent of hiring one guy to just copy and paste code from the internet for every feature, etc. for a piece of software to be developed (with a lot of imperfections, mind you) and giving the guy a raise because the outcome works.

poo2thegeek t1_j6iq0d4 wrote on January 30, 2023 at 4:55 PM

Yes, but suppose you took a 4 year old child who had never seen a painting before, showed them 10 paintings, and then asked them to make their own painting. Either they’ll just scribble on the canvas randomly because they’re not competent enough to do anything, or they’ll end up making something very similar, nearly identical, to those examples you’ve shown them.

You use the example of the programmer taking code off the internet… I’m not sure if you’re a programmer yourself, but you know that’s a meme right? The joke is that a big part of programming is finding the right stack overflow/blog/tutorial that has the code similar enough to what you need, and you change bits of it and incorporate it into your work.

SerenumUS t1_j6k9yvq wrote on January 30, 2023 at 10:43 PM

Comparing a child painting stuff to an AI model stealing artwork without permission for others to use to generate art is apples and oranges. You still aren't addressing the blatantly obvious point - artwork on the internet being used without permission. People are selling or using these generated AI works (by themselves or as part of a book, etc.). This causes issues.

And I am a Software Engineer - yes I know it's a meme but I'm not referring to that. Good programmers don't copy and paste from the internet constantly. If it's an algorithm, sure that is fine. But a good programmer can generally develop features on the frontend/backend for software without needing heavy assistance.

poo2thegeek t1_j6lr85c wrote on January 31, 2023 at 5:34 AM

Again, you keep bringing up the same point - “art work being used without permission” - and I keep arguing that this is no different to a person looking at a piece of art as inspiration.

It’s perhaps more of a philosophical issue, and it also relates to my personal belief that DL models are more closely analogous to the brain than a lot of people imagine - but this is purely conjecture.

Ronny_Jotten t1_j6hujf1 wrote on January 30, 2023 at 1:08 PM

> i feel like works produced by ai probably can not be copyrighted.

The US has already said it won't grant copyright to machine-produced works, because they lack the required creativity: The US Copyright Office says an AI can’t copyright its art - The Verge

dwild t1_j6i5e9i wrote on January 30, 2023 at 2:37 PM

This is not about whether the result can be copyrighted but whether the result keeps the copyright of what it learned.

In the case of Stable Diffusion it will be harder to fight, but GitHub Copilot has made verbatim copies of code multiple times, so that’s a much clearer case.

MeatisOmalley t1_j6hggse wrote on January 30, 2023 at 10:22 AM

An entirely new paradigm for copyright is long overdue

DDoubleIntLong t1_j6g6899 wrote on January 30, 2023 at 2:25 AM

As a computer programmer, my work developing AI to create art for myself is a creative process, the AI algorithm I programmed is my creation, and the art it produces is my creation. I am the one creating it, therefore it is my creation, thus, should be entitled to its own copyright.

OfCourse4726 t1_j6g9xj6 wrote on January 30, 2023 at 2:49 AM

yes except the copyright system was created without ai in mind. ai would break the system. like i said, companies could with a super computer generate so much art that it would just flood the system with copyrights. a human artist could end up accidentally infringing all the time and thereby freezing their capacity to create original works. then there are also issues with likeness. if ai creates a human face, do real humans with that face get a royalty? how much alike to that face would it need to be? how come celebrity lookalikes don't get royalties? some of them are 99% alike.

Solid_Rice t1_j6gangv wrote on January 30, 2023 at 2:54 AM

is the art that it produces based on the art that you created?

dpsoma t1_j6gcefe wrote on January 30, 2023 at 3:05 AM

That entirely misses the point here though. Say i'm writing a paper and need to back up a statement with math. The equations I derive my equations from were published in someone else's work, and I used them. I did all the math, drew my conclusions, and wrote the paper. Does the person who developed the equations that everything I did is based on deserve credit, since I didn't use their equations explicitly?

Or better yet, in your example, I take your AI code and feed it straight into an AI framework to optimize it. After 24 hours, it has made minor improvements. I market both the AI that optimized the code, as well as the code itself as a "product", without providing you credit, or in this case, profits from the copyright that I place on the work.

Unless you also generate 100% of the training set yourself, credit must be granted to those that you used the work of. It's quite honestly mind-boggling that after decades of DMCA in commercial ventures and citation policies in academia that this isn't the conclusion that everyone comes to. (I do not necessarily endorse the causes above whole-heartedly, especially DMCA. However, trying to pretend them away is silly, and should be treated as such)

MammothPhilosophy192 t1_j6hnow1 wrote on January 30, 2023 at 11:57 AM

Did you create the dataset it was trained on?

BeerInTheRear t1_j6et1hn wrote on January 29, 2023 at 9:07 PM

Maybe AI can represent itself?

If not, Robot Bob Dylan is always available...

https://youtu.be/FjCRpbHmIQ0

Remarkable_Flow_4779 t1_j6gjvct wrote on January 30, 2023 at 3:59 AM

The MS way… IP only for me and not for thee. Trash company.

Due_Cauliflower_9669 t1_j6gs5xa wrote on January 30, 2023 at 5:13 AM

These tools seem to be using others’ content for more than just training. There is growing anecdotal evidence that the content the AI creates contains recognizable segments/samples of other creators’ work. As in, chatbots recite entire sentences and paragraphs of other work in their answers, and image generators replicate parts of known images from other artists in their output. Not sure that qualifies as fair use, especially if OpenAI seeks to profit off the content its technology generates.

gurenkagurenda t1_j6hkkbh wrote on January 30, 2023 at 11:19 AM

They only use it for training. Memorization is just a well known side effect of generative models. It’s not something anyone wants to happen; it’s just hard to prevent in every case.

Head-Mathematician53 t1_j6hg5rt wrote on January 30, 2023 at 10:18 AM

Have ChatGPT be the court and determine if AI copyright lawsuits can be thrown out.

partaloski t1_j6ixlso wrote on January 30, 2023 at 5:42 PM

Microsoft (Microsoft), GitHub (Microsoft), and OpenAI (49% Microsoft) ask court to throw out AI copyright lawsuit

wub2wubz t1_j6gylck wrote on January 30, 2023 at 6:21 AM

I think it’s important to get a precedent set for this sort of thing. It’s images today but AI generation will bleed into many different subjects. Imagine feeding AI a bunch of Taylor Swift albums and having it produce music that’s similar, would that be subject to copyright laws? What about video games and movies? I’m sure Disney wouldn’t like their movies being used for AI learning. Artists and creators need some sort of protection.

gurenkagurenda t1_j6i1clc wrote on January 30, 2023 at 2:06 PM

Won’t someone please think of Disney and the recording industry.

[deleted] t1_j6iu1ai wrote on January 30, 2023 at 5:20 PM

[removed]

jeffreyshran t1_j6j5xus wrote on January 30, 2023 at 6:34 PM

Cue the Pam "they're the same thing" meme from the office.

Nerdenator t1_j6jpa20 wrote on January 30, 2023 at 8:34 PM

Well, if they're deriving revenue from AI output that is the result of training the AI with copyrighted source material, I'd say they owe the rights holder some cash.

Purple_CASH t1_j6jt65n wrote on January 30, 2023 at 8:57 PM

2 good videos explaining things and why this matters for setting precedent for future AI projects.

Shorter video:

Lawyer Explains Stable Diffusion Lawsuit (Major Implications!)

Follow up longer video:

New Lawsuits Threaten A.i. Art (Could be Major!) | Corridor Cast EP#163

[deleted] t1_j6k0zc0 wrote on January 30, 2023 at 9:46 PM

[removed]

AutoModerator t1_j6k264y wrote on January 30, 2023 at 9:53 PM

Thank you for your submission, but due to the high volume of spam coming from Medium.com and similar self-publishing sites, /r/Technology has opted to filter all of those posts pending mod approval. You may [message the moderators](/message/compose?to=/r/technology&subject=Request for post review) to request a review/approval provided you are not the author or are not associated at all with the submission. Thank you for understanding.

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[deleted] t1_j6f4i2f wrote on January 29, 2023 at 10:15 PM

[deleted]

lethal_moustache t1_j6f9qp1 wrote on January 29, 2023 at 10:47 PM

The issue here is whether the output of Copilot is a derivative work which would be subject to preexisting copyrights. On the proprietary side of things, a case can be made for damages, but the damages would be split up into microsized little portions. Any one copyright holder won't have been harmed much, but the harm still exists. What is more, copyright holders who have registered their copyrights may make a case for statutory damages. It won't take too many instances of statutory damages being found to make this very expensive for the defendants. Finally, ownership of Copilot output may accrue to the plaintiffs based on derivative rights.

On the open source side of things, any open source software used as training fodder for Copilot would make all output of the Copilot system open source, if the stickier GPL were used originally that is. This license would also, in many cases, require notice and publication of the Copilot output.

That the training data gets output based on some prompt is a very nice way to prove copyright infringement. Ironically, the same kind of software used to identify piracy on sites like YouTube would be very helpful in finding copyright violations in the output of a system like Copilot.
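
A minimal sketch of the kind of matching described above, assuming a simple scheme that compares overlapping runs of tokens ("shingles") between generated code and a set of known sources. The file name, the window size and the helper names are hypothetical; production content-matching systems are far more elaborate than this.

```python
def shingles(text: str, k: int) -> set[tuple[str, ...]]:
    """All runs of k consecutive whitespace-separated tokens in the text."""
    tokens = text.split()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def build_index(corpus: dict[str, str], k: int) -> dict[tuple[str, ...], set[str]]:
    """Map each shingle to the names of the sources that contain it."""
    index: dict[tuple[str, ...], set[str]] = {}
    for name, text in corpus.items():
        for shingle in shingles(text, k):
            index.setdefault(shingle, set()).add(name)
    return index


def find_overlaps(output: str, index: dict[tuple[str, ...], set[str]], k: int) -> dict[str, int]:
    """Count how many distinct shingles of the output appear in each source."""
    hits: dict[str, int] = {}
    for shingle in shingles(output, k):
        for name in index.get(shingle, ()):
            hits[name] = hits.get(name, 0) + 1
    return hits


if __name__ == "__main__":
    # Hypothetical licensed source and a generated snippet that repeats it.
    corpus = {"a.py": "def add ( a , b ) : return a + b"}
    index = build_index(corpus, k=5)
    generated = "x = 1 def add ( a , b ) : return a + b"
    print(find_overlaps(generated, index, k=5))
```

A high count for one source suggests a verbatim or near-verbatim run worth a closer look; whether that amounts to infringement is still a legal question the code cannot answer.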

vgf89 t1_j6gfcnz wrote on January 30, 2023 at 3:25 AM

The problem, both for this and image generation, is going to come down to Fair Use.

"the purpose and character of your use"

"the nature of the copyrighted work"

"the amount and substantiality of the portion taken"

"the effect of the use upon the potential market."

I'm fairly certain that every one of these 4 points can be argued in favor of generative AI's. At a minimum, these systems are extremely transformative, augment the capabilities of the user and organizations as a whole, have huge possible ways they can be used, and will spawn more quality content in larger projects.

At the same time, it will take and transform jobs beyond recognition, especially for art. You want concept art, and you want to iterate on it to get a feel before committing to larger, hand drawn professional pieces? Don't wait, prompt and iterate in the meeting room itself! Need thousands of textures to make every little thing in your game unique yet similar in style, more content than any number of artists you hire could reasonably create? Generate them. It'll replace work some artists do while massively expanding possibilities as a whole.

Current programming AIs are far less powerful in that regard, but are still good timesavers. If you need to rewrite some functions to make them simpler and fix bugs, or you have your API and relationships figured out and know exactly how to do it but want to save time writing it all out, being able to get the AI to write your for loops, filters, regexes, call the functions you need, etc. all by typing a few comments in plain English saves a lot of time that's better spent on verification, debugging, and architecture. ChatGPT can also be a good way to begin new projects, though here there be dragons: it really likes to hallucinate imaginary APIs.

lethal_moustache t1_j6gl6lg wrote on January 30, 2023 at 4:10 AM

I might find your argument more persuasive if generative AI's were real persons. They are tools created and used by for-profit organizations for profit generating purposes.

[deleted] t1_j6gq9gt wrote on January 30, 2023 at 4:54 AM

[deleted]

Ronny_Jotten t1_j6htzkr wrote on January 30, 2023 at 1:02 PM

> I'm fairly certain that every one of these 4 points can be argued in favor of generative AI's.

Ok, but you haven't actually done that. You only argue that it makes things more convenient and cheap for the users, who no longer have to hire the actual programmers or artists whose work it samples and undercuts. That's exactly the thing that could cause it to fail the fourth rule for fair use.

Ok-Quail-733 t1_j6h1iq8 wrote on January 30, 2023 at 6:55 AM

What are you going to do, but tell me that, show me that photo quickly, that photo, Berat too, the two of us, our head, I'll break my head, make the strongest creature, I want you to put yourself in there, go through everything we have and
