Yeah, honestly this is a bad move on anthropic's part. I don't think their moat is as big as they think it is. They are competing against opencode + ACP + every other model out there, and there are quite a few good ones (even open weight ones).
Opus might be currently the best model out there, and CC might be the best tool out of the commercial alternatives, but once someone switches to open code + multiple model providers depending on the task, they are going to have difficulty winning them back considering pricing and their locked down ecosystem.
I went from max 20x and chatgpt pro to Claude pro and chat gpt plus + open router providers, and I have now cancelled Claude pro and gpt plus, keeping only Gemini pro (super cheap) and using open router models + a local ai workstation I built using minimax m2.1 and glam 4.7. I use Gemini as the planner and my local models as the churners. Works great, the local models might not be as good as opus 4.5 or sonnet 4.7, but they are consistent which is something I had been missing with all commercial providers.
> I went from max 20x and chatgpt pro to Claude pro and chat gpt plus + open router providers, and I have now cancelled Claude pro and gpt plus, keeping only Gemini pro (super cheap) and using open router models + a local ai workstation I built using minimax m2.1 and glam 4.7. I use Gemini as the planner and my local models as the churners. Works great, the local models might not be as good as opus 4.5 or sonnet 4.7, but they are consistent which is something I had been missing with all commercial providers.
You went from a 5 minute signup (and 20-200 bucks per month) to probably weeks of research (or prior experience setting up workstations) and probably days of setup. Also probably a few thousand bucks in hardware.
I mean, that's great, but tech companies are a thing because convenience is a thing.
My first switch was to open code + open router. I used it to try mixing models for different tasks and to try open weights models before committing to the hardware.
Even paying API pricing it was significantly cheaper than the nearly $500 I was paying monthly (I was spending about $100 month combined between Claude pro, chat gpt plus, and open router credits).
Only when I knew exactly the setup I wanted locally did I start looking at hardware. That part has been a PITA since I went with AMD for budget reasons and it looks like I'll be writing my own inference engine soon, but I could have gone with Nvidia and had much less issues (for double the cost, dual Blackwell's vs quad Radeon W7900s for 192GB of VRAM).
If you spend twice what I did and go Nvidia you should have nearly no issues running any models. But using open router is super easy, there are always free models (grok famously was free for a while), and there are very cheap and decent models.
All of this doesn't matter if you aren't paying for your AI usage out of pocket. I was so Anthropics and OpenAIs value proposition vs basically free Gemini + open router or local models is just not there for me.
I see a ton of my peers driving around in 80k cars. I drive a 20k used one.
I'm planning a writing a ROCM inference engine anyways, or at least contributing to the rocm vllm or sglang implementations for my cards since I'm interested in the field. Funnily enough, I wouldn't consider myself bullish on AI, I just want to really learn the field so I can evaluate where it's heading.
I spent about 10k on the cards, though the upgrades were piece meal as I found them cheap. I still have to get custom water blocks for them since the original W7900s (which are cheap) are triple slot, so you can't fit 4 of them in any sort of workstation setup (I even looked at rack mount options).
Bought a used thread ripper pro wrx80 motherboard ($600), I bought the cheapest TR Pro CPU for the MB (3945wx, $150), I bought 3 128Gb DDR4-3200 sticks at 230 each before the craze, was planning on populating all 8 channels if prices went down a bit. Each stick is now 900, more than I paid for all 3 combined (730 with S&H and taxes). So the system is staying as is until prices come down a bit.
For AI assisted programming, the best value prop by far is Gemini (free) as the orchestrator + open code using either free models or grok / minimax / glm through their very cheap plans (for minimax or glm) or open router which is very cheap. You can also find some interest providers like Cerebras, who get silly fast token generation, which enables interesting cases.
OP mentioned a local LLM rig plus a rather complex setup from what I understood (I could be wrong).
Also, most of the lower end models aren't that good. At this point you can take an experienced dev and start implementing an app using any mature stack (I'm talking even about stuff like Ada, so not just C/C++, JS, etc) on any mainstream platform (big 3 desktop + big 2 mobile + web) and you can get quite far with Claude Code. By far I mean you'll at least do the 80% really quickly, at which point the "experienced dev" needs to take over. I think you can even get 95% of the way there. And that's with a stack that the dev is unfamiliar with at the start.
Claude code has aggressive limits on $18/month plan. You can get much farther with Minimax 2.1 or Qwen3 on the same amount of money. I have noticed opus is much better in some scenarios but Minimax and Q3 are not as bad or behind. Or my setup is now too used to the quirks of those models, who knows. I am using Claude Haiku as small model anyway, just not using Opus because it burns credits very quickly.
I signed up to Claude Pro when I figured out I could use it on opencode, so I could start things on Sonnet/Opus on plan mode and switch to cheaper models on build mode. Now that I can't do that, I will probably just cancel my subscription and do the dance between different hosted providers during plan phase and ask for a prompt to feed into opencode afterwards.
can you point me to this claim? also last i checked trying to connect to OpenAI seems to prompt for an API key, does openAI's API key make use of the subscription quota?
just wanted to make sure before I sign up for a openAI sub
I like how I can cycle through agents in OpenCode using tab. In CC all my messages get interpreted by the "main" agent; so summoning a specific agent still wastes main agent's tokens. In OpenCode, I can tab and suddenly I'm talking to a different agent; no more "main agent" bs.
I thought of that! but CC is so clunky and slow that it doesn't make sense to do this. with codex cli tho, that's a totally different story cause it's written in Rust.