> There's not much you can do about it, as sibling comment mentions it's a known...

overfeed · 2026-02-20T21:43:30 1771623810

> I supposed groups like the New York Times would update their procedures to not publish leaked documents verbatim or develop technology to scramble the watermark

Like knuckleheads, The Intercept provided the NSA a copy of a scanned document they received from a whistleblower, which directly led to Reality Winner's identity being discovered.

gosub100 · 2026-02-20T15:12:19 1771600339

You could do really sneaky things like alter the space between words or other formatting tricks.
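A rough sketch of the word-spacing idea, assuming the simplest possible scheme: the first few gaps between words carry a recipient ID, with a single space for a 0 bit and a double space for a 1 bit. The names `embed_id` and `extract_id` are made up for illustration, and a real system would presumably use something far less visible.

```python
import re


def embed_id(text: str, recipient_id: int, bits: int = 8) -> str:
    """Hide recipient_id in the first `bits` gaps between words.

    Single space encodes 0, double space encodes 1 (hypothetical scheme).
    """
    words = text.split()
    if len(words) - 1 < bits:
        raise ValueError("text too short to carry the ID")
    if not 0 <= recipient_id < 2 ** bits:
        raise ValueError("recipient_id does not fit in the given bit width")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        if i < bits:
            # Most significant bit goes into the first gap.
            bit = (recipient_id >> (bits - 1 - i)) & 1
            gap = "  " if bit else " "
        else:
            gap = " "
        out.append(gap + word)
    return "".join(out)


def extract_id(marked: str, bits: int = 8) -> int:
    """Read the ID back from the widths of the first `bits` gaps."""
    gaps = re.findall(r" +", marked)
    value = 0
    for gap in gaps[:bits]:
        value = (value << 1) | (1 if len(gap) == 2 else 0)
    return value


def normalize(marked: str) -> str:
    """Collapse all whitespace runs, which wipes out this kind of mark."""
    return " ".join(marked.split())
```

Anything that re-flows the whitespace (retyping, OCR, paraphrasing) destroys the mark: `extract_id(normalize(marked))` comes back as 0 regardless of who received the copy.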

ceejayoz · 2026-02-20T15:28:07 1771601287

Print it out, scan it back in, and OCR that.

Then have an AI or intern paraphrase it.

palmotea · 2026-02-20T16:25:48 1771604748

I think that's exactly what will happen.

When a competent journalist gets a leaked document, they'll learn to only summarize it, but won't quote it verbatim or duplicate it. That'll circumvent any kind of passive leak-detection system that could reveal their source.

Then the only thing that would reveal the source is if the authority starts telling suspected leakers entirely different things, to see what gets out.

jihadjihad · 2026-02-20T16:42:55 1771605775

> Then the only thing that would reveal the source is if the authority starts telling suspected leakers entirely different things, to see what gets out.

This is called a canary trap [0], a well-trodden technique in the real world and fiction alike.

0: https://en.wikipedia.org/wiki/Canary_trap

kube-system · 2026-02-20T16:40:14 1771605614

Then you fix that loophole by subtly altering the phrasing or formatting that you send everyone.

ceejayoz · 2026-02-20T16:54:27 1771606467

That's why I said you paraphrase, rather than using the exact phrasing and formatting of the original doc.

SoftTalker · 2026-02-20T16:59:39 1771606779

Include slightly different details in each version. Then if the paraphrase mentions one of them, you've identified the source.
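A minimal sketch of that, assuming each recipient gets exactly one unique detail dropped into a shared template, and that matching is a plain case-insensitive substring check. `build_versions` and `suspects` are hypothetical names, and the sample details are invented.

```python
def build_versions(template: str, recipients: list[str], details: list[str]) -> dict:
    """Give every recipient a copy of the document with one unique detail."""
    if len(set(details)) < len(recipients):
        raise ValueError("need at least one distinct detail per recipient")
    return {
        name: (detail, template.format(detail=detail))
        for name, detail in zip(recipients, details)
    }


def suspects(report: str, versions: dict) -> list[str]:
    """Return recipients whose unique detail shows up in the published report."""
    text = report.lower()
    return [
        name
        for name, (detail, _doc) in versions.items()
        if detail.lower() in text
    ]


if __name__ == "__main__":
    versions = build_versions(
        "A budget cut of {detail} is planned for next quarter.",
        ["alice", "bob", "carol"],
        ["4.1 percent", "4.3 percent", "4.6 percent"],
    )
    print(suspects("Sources say a 4.3 percent budget reduction is coming.", versions))
    print(suspects("Sources say a budget reduction is coming.", versions))
```

The second call returns an empty list: if the reporting leaves the planted detail out, the trap has nothing to match on.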

ceejayoz · 2026-02-20T17:46:29 1771609589

Yes, I'm aware of that approach.

It's likely tougher than it seems; the big important bits that the news will care about have to match up when checked, and anyone with high-level access to this stuff likely has a sizable staff who also have access to it. Paraphrasing reduces the chance of some minutely tweaked detail being included in the reporting at all.

You also have to actively expect and plan to do it in advance, which takes a lot of labor and time, and raises the chances of people comparing notes and saying "what the fuck, we're being tested". You can't canary trap after the leak.