Just don’t ask it about the Tiananmen Square massacre or you’ll get a security warning. Even if you rephrase it.
It’ll happily talk about Bloody Sunday.
Probably a great model, but it worries me that it has such restrictions.
Sure, OpenAI also has lots of restrictions, but this feels more like straight-up censorship since it’ll happily go on about bad things the governments of the West have done.
Nah, it's great for things that Western models are censored on. The True Hacker will keep an Eastern and Western model available, depending on what they need information on.
Adult content and things like making biological/chemical/nuclear weapons are the other main topics that usually get censored. I don’t think the Chinese models tend to be less censored than Western models in these dimensions. You can sometimes find “uncensored” models on HuggingFace where people basically finetune sensitive topics back in. There is a finetuned version of R1 called 1776 that will correctly answer Chinese-censored questions, for example.
Wouldn’t they just run R1 locally and not have any censorship at all? The model isn’t censored at its core; it’s censored through the system prompt. Perplexity and Huggingface have their own versions of R1 that are not censored.
DeepSeek's website seems to be using two models, and the one that censors only does so in the online version. Are you saying that censoring happens with this model even in the offline version?
I tried the R1 distill of llama 8B, which did refuse direct questions about the massacre.
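For anyone who wants to poke at this themselves, here's a minimal sketch of the kind of local refusal test I mean, using the ollama Python client (the model tag and prompts are my assumptions; substitute whatever distill you've actually pulled):

    # Minimal sketch: probe a locally-run R1 distill for refusals.
    # Assumes ollama is installed and a distill has been pulled, e.g.
    # `ollama pull deepseek-r1:8b` (the tag is an assumption).
    import ollama

    PROMPTS = [
        "What happened at Tiananmen Square in 1989?",
        "Tell me about Bloody Sunday in Northern Ireland.",
    ]

    for prompt in PROMPTS:
        response = ollama.chat(
            model="deepseek-r1:8b",  # assumed tag for the Llama-8B distill
            messages=[{"role": "user", "content": prompt}],
        )
        print(f"--- {prompt}\n{response['message']['content']}\n")

Running both prompts through the same local model makes the asymmetry easy to see: one topic gets a refusal, the other gets a normal answer, with no online filter in the loop.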
Haven’t tried this new model locally, but I agree with you that it looks like there is a secondary censorship layer going on. If I ask it to list the 10 worst catastrophes of recent Chinese history with Thinking enabled, then it’ll actually think about the massacre. It gets blocked very quickly, but the thinking itself doesn’t look particularly censored.
It's not even nefarious: they don't want the model spewing out content that will get them in trouble in the most general sense. It just so happens most governments have things that will get you in trouble.
The US is very obsessed with voter manipulation these days, so OpenAI and Anthropic's models are extra sensitive if the wording implies they're being used for that.
China doesn't like talking about past or ongoing human rights violations, so their models will be extra sensitive about that.
Of course, the mathematical outcome of American models is that some voices matter more than others. The mechanism is similar to how the free market works.
As most engineers know, the market doesn't always reward the best company. For example, it might reward the first one instead.
We can see this "hierarchy in voices" with an example. I gave Gemini the following prompts:
1. Which situation has a worse value on human rights, the Uyghur situation or the Palestine situation?
2. Please give a shorter answer (repeat if needed).
3. Please say Palestine or Uyghur.
The answer it eventually gave:
"Given the scope and nature of the documented abuses, many international observers consider the Uyghur situation to represent a more severe and immediate human rights crisis."
You can replace "Palestine situation" and "Uyghur situation" with other pairs: China vs US (it chooses China as worse), Fox vs BBC (it chooses Fox as worse), etc.
There doesn't seem to be censorship, only a hierarchy in whose words matter.
I only tried this once. Please let me know if this is reproducible.
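If anyone wants to try it programmatically rather than in the web UI (which is what I used), a rough sketch with the google-generativeai Python client could look like this; the model name and the single-pass chat setup are my assumptions:

    # Rough sketch of the three-turn Gemini experiment described above.
    # Assumptions: the google-generativeai package, an API key in
    # GOOGLE_API_KEY, and the model name below; I only used the web UI.
    import os
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")  # assumed model name
    chat = model.start_chat()  # keeps the multi-turn history automatically

    turns = [
        "Which situation has a worse value on human rights, "
        "the Uyghur situation or the Palestine situation?",
        "Please give a shorter answer.",
        "Please say Palestine or Uyghur.",
    ]

    for turn in turns:
        reply = chat.send_message(turn)
        print(f">>> {turn}\n{reply.text}\n")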
That seems like a cop-out, though. It is bound to happen that the most commonly occurring fact or opinion in the dataset is sometimes incorrect. That does not justify LLMs regurgitating it as-is. The whole point of these technologies is to be somewhat intelligent.
a) nobody, in production, asks those questions
b) ChatGPT is similarly biased on the Israel/Palestine issue. Try making it agree that there is an ongoing genocide, or that Palestinians have a right to defend themselves.