
No legal precedent has been set as of yet. The "precedent" you describe is the argument AI companies have been using (that training their models on information available on the Internet should be considered "fair use") but whether AI training actually satisfies the four-factor test for fair use remains to be seen.


It's a null question. Training itself is neither publication nor distribution, so copyright can't be relevant at that point. "Fair use" just isn't a concept applicable to training.


Training stores a variation of the source material, which is arguably distribution. And selling the result or selling access to it certainly is. So fair use applies, and the AI companies are hoping a court finds the process transformative enough to count as fair use. Given that original material can be spat out, my money is on a court thinking this is about as transformative as a compression algorithm.


Selling the result is where it's on dodgy ground. I disagree about storage though.


Exactly. Framing reading as fair use is a huge and dangerous expansion of copyright.


Storing copyrighted content can itself sometimes be illegal, like ripping a Blu-ray. What if these frames are now stored on their servers and go into the training dataset?


The illegal bit of ripping a Blu-ray is circumventing the copy protection, not the storage. At least, that's how I've always understood the effect of the DMCA on the situation.



