
So, basically iocaine (https://iocaine.madhouse-project.org/). It has indeed been very useful for getting the AI scraper load on a server I maintain down to a reasonable level, even with its not-so-strict default configuration.


https://blog.cloudflare.com/ai-labyrinth/

A bit like this? (iocaine is newer)


If I think about it, I find it awful. The fact that we need to put junk in our own stuff just for crawlers does not sit well with me.


Yup, it's a clown world.

Any functioning society would have dealt with the offenders directly and stopped this before it became an issue for most sites.


First time seeing that, but yes, it seems similar in concept. Iocaine can be self-hosted and put in as "middleware" in your reverse proxy with a few lines of config, while Cloudflare's seems tied to their services. Cloudflare's also generates garbage with generative models, while iocaine uses much simpler (and surely more "crude") methods of generating its garbage. Using LLMs to feed junk to LLMs just makes me cry. So much wasted compute.

Is iocaine actually newer though? Its first commit dates to 2025-01, while the blog post is from 2025-03. I couldn't find info on when Cloudflare started theirs. There's also Nepenthes, which had its first release in 2025-01 too.


Yes, except with the content being based on the real content rather than completely random. My intuition says that this will be more effective, specifically poisoning the model wrt tokens relating to that content rather than just increasing the overall noise level a bit (the damage there being smoothed out over the wider model).
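To make the "based on the real content" idea concrete, here is a rough sketch of one way it could work: a word-level Markov chain built from a page's own text, so the junk reuses the same vocabulary as the real thing. This is only my assumption about the general approach, not how any of the tools mentioned here actually do it, and the names build_chain and generate are made up for the example.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the words seen right after it.

    Assumes `text` contains more than `order` words.
    """
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def generate(chain, length=30, seed=None):
    """Walk the chain to emit `length` words of plausible-looking junk."""
    rng = random.Random(seed)   # seedable, so output can be reproduced
    keys = sorted(chain)        # stable ordering for reproducibility
    state = rng.choice(keys)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (state only occurs at the end of the text): restart.
            state = rng.choice(keys)
            out.extend(state)
            continue
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return " ".join(out[:length])


if __name__ == "__main__":
    real_page = "the quick brown fox jumps over the lazy dog and the quick dog naps"
    print(generate(build_chain(real_page), length=20, seed=1))
```

Every word in the output comes from the source page, which is the point: the noise lands on the tokens tied to that content instead of being spread thinly everywhere.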



