Hacker News

Really? With invisible characters I wouldn't have thought watermarking text would be a massive achievement.
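For anyone wondering what that could look like, here is a minimal sketch of a zero-width watermark. The scheme is made up for illustration (one hidden bit after each word, and `embed`/`extract` are hypothetical names); nothing here is known to be what any real vendor does.

```python
# Hypothetical scheme: hide one bit after each word using zero-width characters.
ZERO = "\u200b"  # ZERO WIDTH SPACE, stands for bit 0 (an assumption of this sketch)
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, stands for bit 1 (also an assumption)

def embed(text: str, bits: str) -> str:
    """Append one invisible character per bit to the first len(bits) words."""
    words = text.split(" ")
    out = []
    for i, word in enumerate(words):
        if i < len(bits):
            word += ONE if bits[i] == "1" else ZERO
        out.append(word)
    return " ".join(out)

def extract(text: str) -> str:
    """Read the hidden bits back, ignoring every visible character."""
    return "".join("1" if ch == ONE else "0" for ch in text if ch in (ZERO, ONE))
```

The marked text renders the same as the original in most fonts, but it is longer by one code point per embedded bit, which is exactly what makes it easy to spot with code.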


Heh, yeah that will be a fun little game of cat and mouse to watch


I'd say the arms race of AI detection/detection-avoidance is inevitable. But it seems we're a way off given how bad common spam detection tools are today (I mean, it's likely that a lot of spam is written using detection-avoidance tech but that seems to make it even more obviously spam to a human reader...)


I think gmail's spam filter is so close that it feels like a solved problem. But of course it has a lot of other indicators to look at than just the email body text and spam tends to revolve around a fairly predictable range of topics. None of that would apply in a "was this webpage written by a robot?" algorithm


I just logged in to check my gmail account (I don't use it that often) - and literally every single message in the last few months is spam, and none were flagged as such. I didn't even need to open the messages to determine that they were spam; the combination of subject line and sender name/address was enough. So why is AI so bad at this?

(The sender name for virtually every single message is either "Lowe's winner", "Kohl's winner", CVS or some big US chain that doesn't even operate in my country. The actual email addresses are even more obviously dodgy. There is one genuine message from DELL asking about a survey related to a product I ordered from them many many months ago. I basically use gmail as my spam trap...I get far far less spam on my hotmail account that's my regular one, but the ones that do get through don't even pass the most basic tests, which I don't understand).

Edit: actually the bigger problem with hotmail is false positives. I just tried ChatGPT out on one such example, and it definitely did a better job:

'It is unlikely that an email beginning with the text "Hi All, Here are a few things you need to know for the Christmas concert" is spam.'

I tried a few others (both genuine and incorrectly-identified spam) and for each attempt ChatGPT got it right just from the subject/first paragraph.

With a bit more questioning I could also get it to identify messages that were "genuine" marketing promotions (from companies I've bought products from and agreed to receiving such messages - which hotmail still identifies as junk)


Interesting. I don't know, but if I were to guess I would say that it looks at the type of messages/senders that you interact with and uses that as one of its signals. If the only thing you get sent to your gmail is spam or no-reply messages and you never reply to any of it, I suppose it has nothing to go on and just assumes that it is all weighted equally.

But like I say, that's 100% guesswork, based on the 50 emails in my inbox being not-spam and 16/18 emails in my spam folder being spam (and thanks for prompting me to check, btw!)


But it clearly does have something to go on - the gmail messages of millions of other users! And as I noted in an edit, ChatGPT can fairly easily tell them apart, even just from the subject line/first paragraph (and even explains why it believes they are or aren't spam).


But isn't google's whole schtick about creating a personal profile just about you? Maybe they think you are in the market for some offshore v!agra or would like to get in on the ground floor of some crypt0 investment opportunity...


If it's that trivial to find them (with code, not eyeballs), it's also trivial to remove them.
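A rough sketch of both halves, assuming the watermark uses a handful of well-known zero-width code points (the set below is a guess; a real scheme could use others):

```python
import re

# Assumed set: ZWSP, ZWNJ, ZWJ, WORD JOINER, and the BOM / zero-width no-break space.
INVISIBLE = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, code point) for every invisible character found."""
    return [(m.start(), f"U+{ord(m.group()):04X}") for m in INVISIBLE.finditer(text)]

def strip_invisible(text: str) -> str:
    """Remove the same characters that find_invisible reports."""
    return INVISIBLE.sub("", text)

sample = "wa\u200bter\u200cmark"
print(find_invisible(sample))   # [(2, 'U+200B'), (6, 'U+200C')]
print(strip_invisible(sample))  # watermark
```

The same pattern does the finding and the removing, which is the whole point: detection and removal cost the same single line.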


I assume most people would not be that sophisticated, but I understand the point. This seems like it would be an ongoing battle no matter what OpenAI does though.


or paste into notepad.exe, copy back into whatever you were using.

Voila!


That will almost certainly preserve the invisible characters. Most invisible characters are used for some kind of in-line formatting in Unicode, so it's not desirable to remove them.


What inline formatting in notepad.exe? It doesn't even support bolding/italics/underlining.

But I guess there are tabs and line feeds/carriage returns, so there's that.


Right-to-left/left-to-right markers. Language tags. Various invisible spaces. Homoglyphs. (all trivially filterable though)
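To make "trivially filterable" concrete, here is a hedged sketch using only Python's standard `unicodedata` module. The homoglyph table is a tiny hand-picked sample (three Cyrillic letters) and `clean` is a hypothetical name; a real filter would need a far larger mapping.

```python
import unicodedata

# Tiny illustrative homoglyph table (Cyrillic a, e, o -> Latin); not exhaustive.
HOMOGLYPHS = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})

def clean(text: str) -> str:
    # NFKC folds compatibility forms: fullwidth letters, no-break and em spaces, etc.
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf): LRM/RLM marks, zero-width
    # characters, and the language tag block all fall in this category.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Map the look-alike letters listed in the sample table back to Latin.
    return text.translate(HOMOGLYPHS)
```

Note the trade-off: dropping every `Cf` character also removes marks that right-to-left text and some emoji sequences legitimately depend on, so this is only safe when the text is expected to be plain left-to-right prose.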


I've already got a script running every 2500 milliseconds to strip leading and trailing whitespace, HTML, and every non-ASCII character except the ones our local language uses.
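Something along these lines, as a sketch rather than the actual script. The allowed-letter set is a placeholder (German letters, chosen arbitrarily), the tag regex is deliberately crude, and the 2500 ms scheduling is left out.

```python
import re

# Hypothetical stand-in for "the characters of our local language".
ALLOWED_EXTRA = "äöüÄÖÜß"

# Crude tag matcher; good enough for simple markup, not a real HTML parser.
TAG = re.compile(r"<[^>]+>")
# Anything that is neither ASCII nor in the allowed set gets removed.
DISALLOWED = re.compile(r"[^\x00-\x7F" + re.escape(ALLOWED_EXTRA) + "]")

def sanitize(text: str) -> str:
    text = TAG.sub("", text)         # strip HTML tags
    text = DISALLOWED.sub("", text)  # strip non-ASCII outside the allowed set
    return text.strip()              # strip leading/trailing whitespace
```

Because zero-width characters are non-ASCII, an allow-list like this removes them as a side effect without having to enumerate them.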



