Hacker News

1. Equality under the law is important in its own right. Even if a law is wrong, it isn’t right to allow particular corporations to flout it in a way that individuals would go to prison for.

2. GPL does not allow you to take the code, compress it in your latent space, and then sell that to consumers without open sourcing your code.



> GPL does not allow

Sure, that's what the paper says. Most people don't care what it says until there are actual ramifications, e.g. a cease and desist letter. Maybe people should care, but companies have been stealing IP from individuals since long before the GPL, and they still do.


> 2. GPL does not allow you to take the code, compress it in your latent space, and then sell that to consumers without open sourcing your code.

If AI training is found to be fair use, then that fact supersedes any license language.


Whether AI training in general is fair use and whether an AI that spits out a verbatim copy of something from the training data has produced an infringing copy are two different questions.

If there is some copyrighted art in the background in a scene from a movie, maybe that's fair use. If you take a high resolution copy of the movie, extract only the art from the background and want to start distributing that on its own, what do you expect then?


Training seems fine. If I learn how to write something by looking at example code and then write my own program, that's widely accepted to be fair use of the code. Same if I learn multiple things from reading encyclopedias and then write an essay; that's fine.

However if I memorise that code and write it down that's not fair use. If I copy the encyclopedia that's bad.

The problem then becomes "how trivial can a line be before it's copyrightable?"

    def main():
      print("This is copyrighted")
    main()

This is a problem in general, not just in written words. See the recent Ed Sheeran case: https://www.bbc.co.uk/news/articles/cgmw7zlvl4eo


Fair use is a case-by-case factual question that depends on many factors. Trial judges often get creative in how they apply these. The courts are not likely to take a categorical approach like that, despite what some professors have written.


> Even if a law is wrong, it isn’t right to allow particular corporations to flout it in a way that individuals would go to prison for.

No one goes to prison for this. They might get sued, but even that is doubtful.


Aaron Swartz would probably disagree.

https://en.wikipedia.org/wiki/Aaron_Swartz


Hell you don't even have to actually break any copyright law and you'll still find yourself in jail: https://en.wikipedia.org/wiki/United_States_v._Elcom_Ltd.


Just flat out false, and embarrassingly so, but spoken with the unearned authority of an LLM. See: The Pirate Bay.


> 1. Equality under the law is important in its own right. Even if a law is wrong, it isn’t right to allow particular corporations to flout it in a way that individuals would go to prison for.

We're talking about the users getting copyright-laundered code here. That's a pretty equal playing field. It's about the output of the AI, not the AI itself, and there are many models to choose from.


> there are many models to choose from.

There don’t seem to be any usable open-source models.


What does "usable" mean? Today's best open source or open weight model is how many months behind the curve of closed models? Was every LLM unusable for coding at that point in time?


By “usable”, I mean “there is a website where I can sign up and chat with the model”.


https://openrouter.ai/chat https://t3.chat/

Do these not have the options you're looking for?



