
Surely there's a meaningful question about copying and distributing content verbatim, which GPT has been shown to do.


Not really. Models are devices capable of producing protected content given some input contortions. So are Xerox machines.


If I Xerox'd a book and sold copies to people, I'd clearly be violating copyright. I'm not sure I follow.


Nobody has hit Xerox with an injunction against researching or building copiers just because you can copy books and sell them.


Right. If a publisher found that a specific Xerox machine was being used to copy and commercially distribute a book, in violation of copyright, they'd seek an injunction against the person doing that. With OpenAI, the NY Times can see its copyrighted material on both the input (training) side and the distributed output (generated) side of a specific LLM implementation. So they cry foul on OpenAI's actions, not on LLMs in general.

There appears to be an open question about whether an LLM can freely ingest copyrighted material and output it verbatim without violating copyright. That seems like an obvious "no" to me, unless we decide that LLMs get special treatment.


There's also the question of whether the content is being used under the terms on which it's provided on the web.

NYT is paywalled: you have to agree to a license to access it. There are exclusions in that agreement that I don't understand, but I think they may be important in this discussion!



