People do tend to hate theft, yeah.
Lemmy is very pro-piracy so that’s kind of a silly statement. It’s also worth noting that AI is clearly transformative. Collage is literally legal, how could AI be stealing?
The problem is that it’s making the field hyper competitive by “stealing” jobs, but Photoshop and photography did this as well in their time.
No one cried about translators losing their niche because of Google since just like generative AI, it benefits society as a whole in the end.
There’s a bit of a difference, I’d say. Piracy hurts massive companies that already have tons of money to spare and (to be frank) don’t need any more. AI hurts individual artists that barely make a living as is. It’s like comparing Robin Hood to whatever the inverse of Robin Hood is (OpenAI, I guess). Point is, I have zero issue with generative AI, I do however have issue with the companies behind it. If all of their data was sourced ethically, and the people creating the training data actually got compensation, I’d be fine with it. Everything can be a tool for high effort and low effort content, it’s just increasingly insulting to creators that their work is being stolen and then twisted into something with considerably less effort that makes more money than they could ever hope to make. In other words, dead internet theory.
I mostly agree with what you are saying but I do think sourcing it ethically is a pipe dream.
It’s impossible to get all that data from individuals, it’s way too complicated. What’s already happening is that the websites are selling the data, and they all have it in their terms of service that they can, even Cara, the supposedly pro-artist website.
The individuals are not getting compensated and all regulations proposed are aimed at making this the only option. If companies have to pay for all that data while Google and Microsoft are paying premiums to have exclusive access, the open source scene dies overnight.
It really seems to me like there’s a media campaign being run to poison the general population’s sentiment so AI companies can turn to the government and say “see, we want regulations, the public wants regulations, it’s a win-win”. It’s regulatory capture.
I’m also pro piracy and use it myself for all my media. I still consider it theft even if moral, but I understand your point about it stealing from artists. I just don’t think any current regulation will help artists. Personally, I advocate for copyleft licenses for anything that uses public data, but I sadly have never seen any proposed law or government document mention it.
I also agree that ethical sourcing is pretty ridiculous given real world constraints, but I’m holding out hope that someone figures it out.
It’s not hard to figure out, it’s just not economically viable to set up a system for it when the alternative is just not worrying about ethics and doing it anyway. We struggle to get companies to pay slightly more for recycled plastic than virgin plastic, this isn’t any different.
By “figure it out” I meant “figure out a way to get big companies on board”
You do that by banning or disincentivising the less ethical option. The moment it’s less economically viable, they’ll pivot, unless it isn’t an option.
The problem being, how do we get it banned?
“it’s too hard to respect copyright of all the little guys so we’ll just not” is an insane take. If you can’t do it ethically don’t do it at all.
You are being manipulated into thinking that giving all the power to big data and big AI companies while squashing open source is in your best interest.
Don’t do it at all isn’t an option. Doing it “ethically” means websites like Getty, DeviantArt, Adobe getting a fat payday while giving our whole economy to Google and Microsoft. There’s potential serious job loss coming our way, and in your perfect world, all of those jobs lost would go straight into OpenAI’s or Google’s pocket as a subscription service, since no one else would be able to afford to build a model.
It is regulatory capture.
Please actually try to understand my points instead of knee-jerk reacting all over the place because of their media campaign. OpenAI wants regulations, Anthropic got caught literally sending a letter to California telling them they approve the new bills.
I’m being pragmatic, I know any regulation is just meant to build a moat and kill open source, I know the artists are never going to get paid either way. I’d rather not have 2-3 subscription services be our only option and kill open source for what amounts to literally no gain for individuals.
Reddit got paid 60 mil for their data, I posted a shitload of content back in the day and still haven’t gotten a dime. I’m sure companies like Getty will do the right thing though, right?
I’m sorry if I’m being harsh but you are being a mouthpiece for the people you hate.
Are you done putting words in my mouth? Where did I say anything from the arguments you’re fighting against? I couldn’t give less of a shit what OpenAI wants, I’m not fighting for OpenAI, I’m fighting for all the artists who’ve been told again and again copyright infringement against big corpos is a no-no but now we have companies doing the same thing to them and it’s treated as an inevitability. For all I care OpenAI should be investigated for profiting from data they acquired through the loophole of being non-profit.
What do any of the concerns over the way data acquisition happens have to do with open source? Open source the software, acquire the data ethically. Prosecute anyone using datasets with unlawfully acquired data to the same extent you’d prosecute copyright infringement because that’s what it is. No middle ground. There’s a shit ton of data in the public domain, use that instead of scouring ArtStation and written books from living writers. Is it not easy to sort or of less quality? Boohoo. If you want better data pay the artists and writers.
This doomerposting, “we’re gonna get the short stick either way, might as well get something fun out of it”, is exactly why we’re having our livelihoods trampled over.
What you want and what OpenAI wants are the same thing. Regulations directly benefit them by giving them and Google an easy-peasy monopoly. Artists are never getting a dime out of any of this, all the data is already owned by websites and data brokers.
This is patently false, there isn’t a loophole. Almost all ML projects use public facing data, it’s accepted and completely legal since it’s highly transformative. What do you think translation software or Shazam uses? You probably already use AI multiple times a week. I’m guessing you didn’t get mad when all the translators lost their jobs a decade ago.
How can a company actually open source anything if the costs are so insanely high? It’s already above a million in compute power for a foundation model; how many open source projects do you expect if Reddit or Getty gets to tack on another 60 million? Even worse, Microsoft and Google will absolutely pay a premium to keep it out of the hands of their competition. And no, there is simply not enough data in the public domain, and most of it is shit tbh.
You are missing the forest for the trees and this is by design. There’s a reason you are bombarded every day by “AI bad” articles, it’s to keep you mad about it so you don’t actually think about what these regulations mean.
Again, regulation doesn’t imply current giants get to still reap the rewards of that training data. Look at how GDPR affected data storage and acquisition retroactively. Assuming only one is possible is a false narrative.
Public facing doesn’t mean open source. We’ve had this discussion before on GitHub accessible source code. Just because it’s available to peruse doesn’t mean one is allowed to process that image and create derivatives based on its data. Weird thing to point out about translation, do you have any idea who I am or are you just regurgitating talking points? How do you know whether I was/am offended by translators being replaced or not?
I’m confused about the open source bit, what costs? I feel like you’re not explaining a key connection in your argument. If the barrier to development overall is acquiring data ethically, saying that is a stance against open source is misleading, as it’s against any kind of such development, not just the open source kind. We have museums and libraries full of public domain works, it most definitely is enough, it’s just not as commercially appealing as modern works, so if given the choice of course companies will choose the path that gives them more rewards, especially when we don’t punish them for copyright infringement when they do.
You make it sound like LLMs are the best thing since sliced bread and should be pursued at all costs no matter how much it steps on the little guys in the process, but my question is why? We live in a world plagued by costs of living, atrocities, and other fixable things, sure this advanced text and image prediction stuff is a fun toy but will it actually improve the quality of life of people? Artists and writers already struggle more than your usual workers to get good pay for their time, this stuff might be sometimes touted as democratising art or something but it’s clearly not the main outcome from putting this kind of tool out in a world where capitalising on your skills is what gives people a roof over their heads. In such a world it’s only worsening people’s quality of life in exchange for a bit of fun and some performance improvements at work.
And please don’t call me “mad”, don’t imply I’m clouded by emotions when I’m most surely providing clear statements. Throughout this I’ve been arguing against your points, but you’ve been arguing against a made up persona that you’ve attributed to me too. Go argue with those people and when you’re ready to engage me then argue against my points.
I agree with JustARaccoon’s reply to your comment, and also this is really turning from a respectful debate into a ridiculous argument for something most everyone thinks is wrong. The artists should get their compensation. I don’t care how “improbable” it is, it needs to happen.
I’ll be the first to praise a bill that is actually aimed at helping artists. I’m just being realistic, everything being proposed is catered towards data brokers and the big AI players. If the choice is between artists getting screwed, and artists and society getting screwed, I will choose the former.
I understand it needs to happen but doing the opposite and playing into OpenAI’s hands doesn’t really help imo.
Like I said, they just hate AI here. It’s pretty amusing
It’s very weird for a community that’s generally tech savvy. I think there’s a lot of manipulation going on. I don’t think it’s a coincidence that almost only anti-AI articles get posted but I’m also against baseless accusations so I mostly shut up about it.