
I don't think that's a fair analogy. One forces 99% of websites to make a change, while the other is something that would need to be done by the big companies doing the scraping.

A legally binding Do Not Track flag would force small websites, e.g. a local restaurant's site, to implement something they are likely unaware of and lack the technical knowledge to implement.

A company that is mass-scraping data for its AI model is much more likely to understand that scraping the data has legal implications, and would be technically capable of implementing a scraping solution that accounts for a robots.txt.
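For what it's worth, honoring robots.txt takes very little code; Python ships a parser in the standard library. A minimal sketch (the bot name and rules here are made up for illustration):

```python
from urllib import robotparser

# Hypothetical robots.txt content a site might serve to block an AI crawler.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The AI crawler is blocked; everyone else is allowed.
print(rp.can_fetch("ExampleAIBot", "https://example.com/page"))   # False
print(rp.can_fetch("SomeOtherBot", "https://example.com/page"))   # True
```

In practice a scraper would fetch the live file with `RobotFileParser.set_url(...)` and `read()` before crawling, but the check itself is the same one-liner.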



I'm gonna guess it often isn't even their content but is user content they are protecting. So, sounds like a big subsidy/protection racket for Twitter or whatever to train on their users' public content but not let others.


If I understand the parent correctly, the restriction flag is opt-in? That turns copyright around completely, expecting every small content producer to implement something they are likely unaware of and lack the technical knowledge to implement.


At the very least, robots.txt dates from 1994; it has been part of the web almost from the start (the web became public in 1991, so within 3 years).

Claiming ignorance here would be just a little bit disingenuous.


The X-Robots-Tag header already exists, with "noai" and "noimageai" as directives. Scraping software like img2dataset respects these by default.



