True, but there's still a meaningful difference in friction and scale. With closed APIs, OpenAI can monitor for misuse, throttle abuse, and deploy countermeasures in real time. With open weights, a single prompt jailbreak or exploit spreads instantly. No ML expertise needed, just a Reddit post.
The risk isn’t that bad actors suddenly become smarter. It’s that anyone can now run unmoderated inference and OpenAI loses all visibility into how the model’s being used or misused. I think that’s the control they’re grappling with under the label of safety.
Given that the best jailbreak for an offline model is still simple prompt injection, which is a solved issue for closed-source models… I honestly don't know why they're talking about safety much at all for open source.
I think you're conflating real-time monitoring with data retention. Zero retention means OpenAI doesn't store user data, but they can absolutely still filter content, rate-limit, and block harmful prompts in real time without retaining anything. That's processing requests as they come in, not storing them. The NYT case was about data storage for training/analysis, not about real-time safety measures.
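To make that concrete, here's a minimal sketch of in-path filtering that stores nothing, assuming the standard OpenAI Python SDK; the model names and refusal string are placeholders of my own, not anything OpenAI actually deploys:

```python
# Minimal sketch: moderate each request in the request path, then answer it.
# Nothing here requires persisting the prompt or the response anywhere.
from openai import OpenAI

client = OpenAI()

def answer(prompt: str) -> str:
    # Real-time check; the moderation call returns a verdict on the text.
    verdict = client.moderations.create(
        model="omni-moderation-latest",  # placeholder choice of moderation model
        input=prompt,
    )
    if verdict.results[0].flagged:
        return "Request refused."  # block before any inference happens

    # Prompt passed the filter, so run inference as usual.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content
```

Rate limiting would sit in the same request path; the point is only that filtering is request-scoped processing, not storage.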
Ok, you're off in the land of "what if," and I can just flat-out say: if you have a ZDR account, there is no filtering on inference, no real-time moderation, no blocking.
If you use their training infrastructure, there's moderation on training examples, but SFT (supervised fine-tuning) on non-harmful tasks still leads to a complete breakdown of guardrails very quickly.
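For anyone who hasn't touched that flow, here's roughly what it looks like; a hypothetical sketch using the OpenAI Python SDK, with placeholder file and model names. The moderation of training examples happens on OpenAI's side after upload, and the guardrail breakdown shows up later in the fine-tuned model, not in anything you write here:

```python
# Hypothetical sketch of a benign SFT job; file name and base model are placeholders.
from openai import OpenAI

client = OpenAI()

# Upload a JSONL file of harmless chat-formatted training examples.
training_file = client.files.create(
    file=open("benign_examples.jsonl", "rb"),
    purpose="fine-tune",
)

# Launch supervised fine-tuning on those examples; the uploaded examples
# are screened server-side before the job runs.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # placeholder base model
)
print(job.id, job.status)
```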