Copyrighted content on the internet isn't a free-for-all for people to train models with. No matter how good the intentions, if Stability AI didn't take reasonable steps to remove copyrighted data from the training set, then IMO they face an uphill battle making their case to a jury (if it gets there).
Anyone can rent compute, and as AI advances it's only going to get easier to fine-tune models.
There's going to be a world where you can just point a program running on your own computer at a website with a bunch of images on it, wait 10 minutes, and have it punching out Stable Diffusion-style images.
What are the courts going to do to stop it? Sneak into your home and record everything you do? How are you going to prove an AI was trained on certain images? How are you going to prove an AI generated an image?
The cat is out of the bag. Even if all the courts decide that these images are copyrighted and can't be used, they're going to continue to be used by people, all over the place, with absolutely nothing able to stop it.
The era of needing an artist to produce a novel image for a purpose is over. It will never come back, even with the full support of the law trying to keep it around.
While that may hypothetically be the future, that time is not now. The case before the court is much more consequential in nature and may remove the need for partnerships between companies like OpenAI and Shutterstock for consensual sharing. It must be adjudicated based on the current capabilities, facts and circumstances of the case. Do IP and copyright go out the window because computers happen to be good at taking in and transforming works like a human? The complaint isn't hiding the implementation; it explains it reasonably well at a high level. US courts are some of the strictest when it comes to IP, in large part because strict laws were passed by the legislature. The court's job is not to legislate from the bench, but to adjudicate violations of the laws as they are written today.
Just because it's easy to speed on a road, and other people are speeding without being charged, doesn't mean you can't be held liable for speeding if you're caught. Likewise, copying proprietary source code from another project into your own commercial software can seem innocuous until you get hit with a lawsuit. You "prove" it the same way you prove anything else before a court. The jury/judge doesn't need to be absolutely certain that something did or did not happen, just satisfied that the plaintiffs have met the applicable standard of proof.
So yes: we may well reach a point where computers are virtually indistinguishable from humans in generating "novel" things, and in that case the existing laws would need to change. In the meantime, I don't view AI models as a trump card around copyright/IP. If everyone else is following the rules of the road but you decide to stick it and do your own thing, don't expect to ram your way through without consequences.
Good, how about training on variations of real images? Variations should not be copyrightable since they contain no human input, and they should be sufficiently different from the originals. So the trained model can't possibly reproduce exactly any original because it hasn't seen one.
How do you make the variations though? Start up a new company called StableVariations and let them deal with the Getty lawsuit? To create the variations you're still using the original images in a commercial way.
Also, variations have to be extreme in order to not violate copyright, and even then they may still violate it. Otherwise YouTube would have no issue with me uploading all of Shrek as long as I mirrored the video and pitched the audio up by 3%.
> you're still using the original images in a commercial way.
That's not illegal.
For example, right now, for whatever job you are doing, you have probably looked at copyrighted works that you don't own.
You have copied those copyrighted works, because in order for you to view the image that you don't own, you had to download it to your computer.
So, you have thus used copyrighted works, for commercial purposes, if you have ever looked at copyrighted works on your computer, for a professional purpose.
Infringement is not using copyrighted works for a professional purpose. Instead, it is distributing copyrighted works to other people.