Hacker News

For it to be a clean-room rewrite, the original project must not be in the model's training data.



That only matters if expression of the original project really does end up in the rewrite, doesn't it? This can be checked for (by the team with access to the code), and it's quite unlikely in any case. It's not trivial to get an LLM to reproduce its training data verbatim: even when it is feasible (the Harry Potter case, a work massively overweighted in training due to its popularity), it takes very specific prompting and hinting.

> That only matters if expression of the original project really does end up in the rewrite, doesn't it?

No, I don't think so. I hate comparing LLMs with humans, but for a human, being familiar with the original code might disqualify them from writing a differently-licensed version.

Anyway, LLMs are not human, so, as many courts have confirmed, their output is not copyrightable at all, under any license.


Uh, this is just a curiosity, but do you have a reference for that last argument?

If true, it would mean that most commercial code developed today, being increasingly AI-generated, would actually be copyright-free. I don't think most Western courts would uphold that position.



The headline was misleading. The courts avoided deciding what Thaler could have copyrighted, because he said he was not the author.

> That only matters if expression of the original project really does end up in the rewrite, doesn't it?

If that were the case, nobody would bother with clean-room rewrites.



