I mean, yeah, if you omit any objectionable detail and describe it in the most generic possible terms then of course the comparison sounds tasteless and offensive. Consider that collecting child pornography is also "storing the result of an HTTP GET".
It’s really hard to parse this thread because you and the other gentleman keep telling anyone who engages they aren’t engaging.
You both seem worked up, and you seem to perceive others as disagreeing with you wholesale on the very concept that AI companies could be forced to compensate people for training data, and as morally injuring you.
Your conduct to a point, but especially their conduct, goes far beyond what I’m used to on HN. I humbly suggest you decouple yourself a bit from them. You really did go too far with the slavery bit, and it was boorish to then make the child porn analogy.
If you believe my conduct here is inappropriate, feel free to alert the mods. I think it's pretty obvious why describing someone's objections to AI training data as "storing the result of an HTTP GET" is not a good faith engagement.
Yes, they're objecting to people training on data they don't have the right to, not just the GET request as you suggest.
If you distribute child porn, that is a crime. But if you crawl every image on the web and then train a model that can then synthesize child porn, the current legal model apparently has no concept of this and it is treated completely differently.
Generally, I am more interested in how this affects copyright. These AI companies just have free rein to convert copyrighted works into the public domain through the proxy of over-trained AI models. If you release something as GPL, they can strip the license, but the same is not true of closed-source code, which isn't trained on.