I really like Clawdbot's safety-gloves-off approach - no handholding or just saying yes to every permission.
I set it up on an old MacBook Pro I had with a broken screen and it works great. Now I just message my server using Telegram and it does research for me, organizes my notes, and builds small apps on the fly to help with learning.
However, security is a real concern. I need to understand how to create a comprehensive set of allowlists before expanding into anything more serious like bill payments, messaging people, etc.
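To make the allowlist idea concrete, here is a minimal deny-by-default sketch. Everything in it is hypothetical (the action names, the `ALLOWLIST` table, the helper functions); it is not Clawdbot's actual configuration or API, just one way such a check could look.

```python
import shlex

# Hypothetical allowlist: action name -> set of permitted targets.
# Anything not listed here is denied.
ALLOWLIST = {
    "run_command": {"ls", "cat", "grep"},
    "send_message": {"me"},
}


def is_allowed(action: str, target: str) -> bool:
    """Deny by default: only explicitly listed (action, target) pairs pass."""
    return target in ALLOWLIST.get(action, set())


def check_command(command_line: str) -> bool:
    """Allow a command only if its executable is on the allowlist.

    This checks the executable name only. A real setup would also need to
    validate arguments and run the command without a shell.
    """
    try:
        parts = shlex.split(command_line)
    except ValueError:
        # Malformed quoting: refuse rather than guess.
        return False
    return bool(parts) and is_allowed("run_command", parts[0])
```

With this shape, something like bill payments stays blocked until an entry for it is added on purpose, which is the point of building the lists out before expanding.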
Some great life lessons here, but also some I don't agree with:
- The lazy person works twice as hard.
Often I've found you can save a lot of time by just trying to do the minimum possible, and gain a lot of insight into why something is minimal or not.
- The opinion of the person who rarely offers it is listened to more closely.
I found the opposite to be true: those who don't offer their thoughts frequently are often dismissed when they do want to share something.
Anyway, many of the points are great. I would also add: keep a journal and write down what was meaningful throughout the day. You will find time passing with more quality, since you know what to take and what to avoid.
Just because it is in C doesn't mean you will get C-like performance. Just look at the benchmarks: it is 8x slower than just using PyTorch. While I get that it's cool to use LLMs to generate code at this level, getting super high-performing optimized code is very much outside the domain of current frontier LLMs.
The PyTorch version is using the GPU (with Metal Performance Shaders); this C version is currently using (in the docs I saw) a single CPU core, with AMX (via Apple Accelerate BLAS) but not yet with OpenMP for parallelism. It’s not slow because LLM code is bad, but because it’s not running on the same hardware. That said, it’s also not as fast as it is because of the LLM—all the critical code is in kernel libraries it calls (the same as for PyTorch).
Absolutely true, but now I'll focus on making it fast, and I believe it will be possible to go much faster. I left the agent working overnight with a specification, and now I'm going to check the progress and restart the work.
No it’s not. I have written CUDA kernels and 8-bit optimizers with this.
They’re actually very good at speed optimization and can iterate very quickly, taking notes on trials, failures, and benchmarks. I’ve had it write 10 different attempts in around an hour, benchmark them all, then merge them and beat very strong baselines in torch.
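For anyone wondering what that loop looks like mechanically, here is a rough sketch of a harness that checks each attempt against a baseline and times the ones that are correct. The function names and the toy workload are made up for illustration; a real run would swap in actual kernels and a torch baseline.

```python
import timeit


def baseline(xs):
    return sum(x * x for x in xs)


def attempt_loop(xs):
    total = 0
    for x in xs:
        total += x * x
    return total


def attempt_wrong(xs):
    # Deliberately incorrect, to show that bad attempts get discarded.
    return sum(xs)


def benchmark(candidates, data, repeats=5, number=20):
    """Return {name: best time in seconds} for candidates matching the baseline."""
    expected = candidates["baseline"](data)
    results = {}
    for name, fn in candidates.items():
        if fn(data) != expected:
            continue  # a fast but wrong attempt is worthless
        timings = timeit.repeat(lambda: fn(data), repeat=repeats, number=number)
        results[name] = min(timings)  # min is the least noisy estimate
    return results


if __name__ == "__main__":
    data = list(range(10_000))
    candidates = {
        "baseline": baseline,
        "attempt_loop": attempt_loop,
        "attempt_wrong": attempt_wrong,
    }
    for name, seconds in sorted(benchmark(candidates, data).items(), key=lambda kv: kv[1]):
        print(f"{name}: {seconds:.4f}s")
```

The correctness check before timing matters more than the timing itself, since an agent iterating quickly will happily produce attempts that are fast because they skip work.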
I really liked the approach of getting new topics to research via embeddings, trails, and Claude Code, but what will this usually give you beyond novelty?
“Decompression” is a metaphor, not a fact claim to be proved; it is a description of an approach to generating a dataset from an LLM where most of the potential utility is still fairly explicitly speculative, a jumping off point for further work.
FWIW I have the €20 Pro plan and exchange maybe 20 messages with Opus (with thinking) every day, including one weeks-long conversation. Plus a few dozen Sonnet tasks and occasionally lightweight CC.
I'm not a programmer, though - engineering manager.
Sure I do, but not as part of any tools, just for one-off conversations where I know it's going to be the best out there. For tasks where reasoning helps little to none, it's often still number one.
People basically want a life coach, someone by their side who can tell them what the best next thing to do is at any given moment. Everything else is just an approximation of that ideal.
The author's .txt file works because its simplicity forces a daily ritual of self-coaching. The tool demands that the user manually review, prioritize, and decide what matters. There are no features to hide behind, only the discipline of the process itself.
The impulse to use complex apps or build custom scripts is the attempt to engineer a better coach. We try to automate the prioritization and reminders, hoping the system can do the coaching for us.
The great trap, of course, is when we fall in love with engineering the system instead of doing the work. This turns productivity into a sophisticated form of procrastination.
Ultimately, the best system is the one that removes the most friction between decision and action. For the author, that meant stripping away everything but the list itself.
I was a really big fan of taskwarrior for the simple reason that it did do an approximation of telling you the best thing by calculating urgency, based on a simple weighting method where "the most urgent tasks" blocked other tasks, were due soon, had extra tags, had dependents, and were the oldest.
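A rough sketch of that kind of weighted urgency score, for anyone who hasn't seen it. The weights, the 14-day and 365-day ramps, and the `Task` fields below are illustrative assumptions, not Taskwarrior's actual coefficients or data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple

# Illustrative weights only; not Taskwarrior's real coefficients.
WEIGHTS = {"blocking": 8.0, "due": 12.0, "tags": 1.0, "age": 2.0}


@dataclass
class Task:
    description: str
    created: date
    due: Optional[date] = None
    tags: Tuple[str, ...] = ()
    blocks: int = 0  # number of other tasks waiting on this one


def urgency(task: Task, today: date) -> float:
    """Weighted sum of simple signals; higher means do it sooner."""
    score = 0.0
    if task.blocks > 0:
        score += WEIGHTS["blocking"]
    if task.due is not None:
        days_left = (task.due - today).days
        # Ramp from 0 (14 or more days away) to 1 (due today or overdue).
        score += WEIGHTS["due"] * min(1.0, max(0.0, 1 - days_left / 14))
    if task.tags:
        score += WEIGHTS["tags"]
    age_days = (today - task.created).days
    # Older tasks creep upward, capped after a year.
    score += WEIGHTS["age"] * min(1.0, age_days / 365)
    return score


def next_task(tasks, today: date) -> Task:
    """The 'best next thing': the task with the highest urgency."""
    return max(tasks, key=lambda t: urgency(t, today))
```

The appeal is that the model is transparent: when the top task feels wrong, you can see which weight caused it and adjust that one number.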
But I do feel very strongly that people only jump into "the great trap" because they feel that they were let down by their system, or that it didn't quite model their life accurately. A lot of todo apps are opinionated, and those opinions, if incompatible with the person using them, will lead to frustration. The quest for a more perfect life model often continues when this incompatibility is found.
Everything you are saying was something I suspected to be true - I think you've captured it brilliantly. Really like:
"Ultimately, the best system is the one that removes the most friction between decision and action."
> Ultimately, the best system is the one that removes the most friction between decision and action. For the author, that meant stripping away everything but the list itself.
While I agree with the author's vision for a more human-centric AI, I think we're closer to that than the article suggests. The core issue is that the default behavior is what's being criticized. The instruction-following capabilities of modern models mean we can already build these Socratic, guiding systems by creating specific system prompts and tools (like MCP servers). The real challenge isn't technical feasibility, but rather shifting the product design philosophy away from 'magic button' solutions toward these more collaborative, and ultimately more effective, workflows.