Surely there's a meaningful question about copying and distributing content verb...

CuriouslyC · on Dec 30, 2023

Not really. Models are a device capable of producing protected content given some input contortions. So are Xerox machines.

8organicbits · on Dec 30, 2023

If I Xerox'd a book and sold copies to people, I'd clearly be violating copyright. I'm not sure I follow.

CuriouslyC · on Dec 30, 2023

Nobody has given Xerox an injunction against researching or building copiers because you can copy books and sell them.

8organicbits · on Dec 30, 2023

Right. If a publisher found a specific Xerox machine was being used to copy and commercially distribute a book, in violation of copyright, they'd ask for an injunction against the person doing that. With OpenAI, the NY Times can see their copyrighted material on both the input (training) side and the distributed output (generated) side of a specific LLM implementation. So they cry foul on OpenAI's actions, not LLMs in general.

There appears to be an open question about whether an LLM can freely ingest copyrighted material and output it verbatim without violating copyright. That seems like an obvious "no" to me, unless we decide that LLMs get special treatment.

sgt101 · on Dec 30, 2023

Also the use of the content as per provision on the web.

NYT is paywalled - you have to agree to a license to access it. There are exclusions in that agreement that I don't understand, but I think they may be important in this discussion!