Is anyone else just completely overwhelmed with the number of things you _need_ ...

levocardia · 2025-12-30T02:57:18 1767063438

I just take a grug brain approach. I do touch CLAUDE.md and then just explain how the code/files/project spec work, like I'm writing a slack message or email to a really smart colleague, and then let it rip, always using biggest model with thinking on. If something consistently goes wrong I add more to CLAUDE.md or even better, have Claude Code just update CLAUDE.md itself with the new issue explained. I'm probably 3 months behind what you could get with absolute SOTA practices but it still works so well that I'm amazed and amused on a daily, if not hourly, basis.

dimitri-vs · 2025-12-30T00:08:22 1767053302

I'm in Claude Code 30+ hr/wk and always have at least three tabs of CC agents open in my terminal.

Agree with the other comments: pretty much running vanilla everything and only the Playwright MCP (IMO way better than the native chrome integration) and ccstatusline (for fun). Subagents can be as simple as saying "do X task(s) with subagent(s)". Skills are just self @-ing markdown files.

Two of the most important things are 1) maintaining a short (<250 lines) CLAUDE.md and 2) having a /scratch directory where the agent can write one-off scripts to do whatever it needs to.
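A minimal sketch of those two habits as a check you could run yourself. The function name `check_project` and the `scratch` folder name inside the repo are assumptions for illustration, not part of Claude Code:

```python
from pathlib import Path

MAX_LINES = 250  # the "short CLAUDE.md" budget mentioned above


def check_project(root: Path, max_lines: int = MAX_LINES) -> dict:
    """Report CLAUDE.md length and make sure a scratch directory exists.

    Hypothetical helper: the file layout is an assumption, adjust to taste.
    """
    claude_md = root / "CLAUDE.md"
    scratch = root / "scratch"
    # Create the scratch directory if it is missing; do nothing otherwise.
    scratch.mkdir(parents=True, exist_ok=True)
    if claude_md.exists():
        line_count = len(claude_md.read_text(encoding="utf-8").splitlines())
    else:
        line_count = 0
    return {
        "lines": line_count,
        "within_budget": line_count < max_lines,
        "scratch": scratch,
    }
```

Calling `check_project(Path("."))` in a repo returns the line count and whether it is under the budget.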

jswny · 2025-12-30T01:11:57 1767057117

I also specifically instruct Claude how to use a globally git ignored scratch folder “tmp” in each repo. Curious what your approach is

austinbaggio · 2025-12-30T02:09:18 1767060558

You store your project context in an ignored tmp folder? Share more plz - what does it look like? What do you store?

jswny · 2025-12-30T02:40:33 1767062433

Not memory, I just instruct it to freely experiment with temporary scripts and artifacts in a specific folder.

This helps it organize temporary things it does like debugging scripts and lets it (or me) reference/build on them later, without filling the context window. Nothing fancy, just a bit of organization that collects in a repo (Git ignored)
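One way to get a folder like that ignored everywhere, sketched under assumptions: the helper name `ensure_ignored` is made up, and the exact setup described above isn't shown, so this is only a guess at the mechanics. It appends `tmp/` to an ignore file once and leaves it alone after that:

```python
from pathlib import Path


def ensure_ignored(ignore_file: Path, pattern: str = "tmp/") -> bool:
    """Add pattern to ignore_file unless it is already listed.

    Returns True if the file was changed. Hypothetical helper for illustration.
    """
    text = ignore_file.read_text(encoding="utf-8") if ignore_file.exists() else ""
    if pattern in (line.strip() for line in text.splitlines()):
        return False  # already ignored, nothing to do
    ignore_file.parent.mkdir(parents=True, exist_ok=True)
    # Keep existing entries intact even if the file lacks a trailing newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    ignore_file.write_text(text + separator + pattern + "\n", encoding="utf-8")
    return True
```

Git reads a global ignore file from the path set with `git config --global core.excludesFile`, and falls back to `~/.config/git/ignore` when that is unset, so either location works as the `ignore_file` argument.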

brigandish · 2025-12-30T00:34:42 1767054882

How can you - or any human - review that much code?

Normal_gaussian · 2025-12-30T01:54:02 1767059642

When I'm coding I have about 6 instances of VSCode on the go at once; each with their own worktree and the terminal is a dangerous cc in docker. most of the time they are sitting waiting for me. Generally a few are doing spec work/reporting for me to understand something - sometimes with issue context; these are used to plan or redirect my attention if I might've missed something. A few will be just hacking on issues with little to no oversight - I just want it to iterate tests+code+screenshots to come up with a way to do a thing / fix a thing, I'll likely not use the code it generates directly. Then one or two are actually doing work that I'll end up PR'ing or if I'm reviewing they'll be helping me do the review - either mechanically (hey claude, give me a script to launch n instances with a configuration that would show X ... ok, launch them ... ok, change to this ... grab X from the db ... etc.) or insight based (hey claude, check issue X against code Y - does the code reflect their comments; look up the docs for A and compare to the usage in B, give me references).

I've TL'd and PM'd as well as IC'd. Now my IC work feels a lot more like a cross between being a TL and being a senior with a handful of exuberant and reasonably competent juniors. Lots of reviewing, but still having to get into the weeds quickly and then get out of their way.

everfrustrated · 2025-12-30T18:54:50 1767120890

>I've TL'd and PM'd as well as IC'd. Now my IC work feels a lot more like a cross between being a TL

Interesting... I've been in management for a few years now and recently doing some AI coding work. I've found my skills as a manager/TL are far more adaptable to getting the best out of AI agents than my skills as a coder.

austinbaggio · 2025-12-30T22:19:49 1767133189

Same, I was a very average dev coming out of CS, and a PM before this. I find that my product training has been more useful, especially with prototypes, but I do leave nearly all of the hard system, infra, and backend work to my much much more competent engineering teammates.

dimitri-vs · 2025-12-30T12:09:41 1767096581

TBH I'm not building "production grade" apps depended on by hundreds of thousands of users - our clients want to get to a live MVP as fast as possible and love the ability to iterate quickly.

That said, it's well known that Anthropic uses CC for production. You just slow things down a bit, spend more time on the spec/planning stage and manually approve each change. IMO the main hurdle to broader Claude Code adoption isn't a code quality one, it's mostly getting over the "that's not how I would have written it" mindset.

bpolly · 2025-12-30T01:53:16 1767059596

From personal experience, most of my time in Claude Code is spent experimenting, iterating, and refining approaches. The amount of code it produces as it relates to time spent working on it tends to be pretty logarithmic in practice.

blks · 2025-12-30T12:39:36 1767098376

They don’t, they just push garbage, someone else quickly looks over it (or asks another llm to review for him), and merges.

asdev · 2025-12-29T23:41:26 1767051686

you really don't need any of this crap. you just need Claude Code and CLAUDE.MD in directories where you need to direct it. complicated AI set ups are mid curve

parpfish · 2025-12-30T00:29:33 1767054573

I refuse to learn all the complicated configuration because none of it will matter when they drop the next model.

Things that need special settings now won’t in the future and vice versa.

It’s not worth investing a bunch of time into learning features and prompting tricks that will be obsoleted soon

AndyNemmity · 2025-12-30T00:45:51 1767055551

I wish that were true. Models don't feel like they've really had massive leaps.

They do get better, but not enough to change any of the configuration I have.

But you are correct, there is a real possibility that the time invested will be obsolete at some point.

For sure the work towards MCPs is basically obsolete via skills. These things happen.

parpfish · 2025-12-30T01:55:38 1767059738

It doesn’t require any major improvement to the underlying model. As long as they tinker with system prompts and builtin tools/settings, the coding agent will evolve in unpredictable ways out of my control.

AndyNemmity · 2025-12-30T01:59:07 1767059947

That's a rational argument. In practice, what we're actually doing for the most part is managing context, and creating programs to run parts of tasks, so really the system prompts and builtin tools and settings have very little relevance.

dnautics · 2025-12-30T01:40:16 1767058816

i don't understand this mcp/skill distinction? one of the mcps i use indexes the runtime dependency of code modules so that claude can refactor without just blindly grepping.

how would that be a "skill"? just wrap the mcp in a cli?

fwiw this may be a skill issue, pun intended, but i can't seem to get claude to trigger skills, whereas it reaches for mcps more... i wonder if im missing something. I'm plenty productive in claude though.

AndyNemmity · 2025-12-30T01:49:16 1767059356

So MCPs are a bunch of, essentially, skill-type objects. But it has to tell you about all of them, and information about all of them up front.

So a Skill is just a smaller granularity level of that concept. It's just one of the individual things an MCP can do.

This is about context management at some level. When you need to do a single thing within that full list of potential things, you don't need the instructions about a ton of other unrelated things in the context.

So it's just not that deep. It would be having a python script or whatever that the skill calls that returns the runtime dependencies and gives them back to the LLM so they can refactor without blindly grepping.
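A rough sketch of what such a script could look like, with loud caveats: it only reads static `import` statements in Python files via the standard `ast` module, so it is a simplification of real runtime dependency tracking, and the function names are made up for illustration.

```python
import ast
from pathlib import Path


def imports_of(source: str) -> list[str]:
    """Return the sorted module names imported by one Python source string."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Bare relative imports ("from . import x") have no module name
            # and are skipped in this sketch.
            found.add(node.module)
    return sorted(found)


def dependency_map(root: Path) -> dict[str, list[str]]:
    """Map each .py file under root (as a relative path) to its imports."""
    return {
        str(path.relative_to(root)): imports_of(path.read_text(encoding="utf-8"))
        for path in sorted(root.rglob("*.py"))
    }
```

The skill's markdown would then just say when to run it, and the agent reads the returned mapping instead of grepping. Anything hidden behind dynamic imports or reflection would be missed by this version.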

Does that make sense?

dnautics · 2025-12-30T10:09:02 1767089342

no that makes no sense. the skill doesn't do anything by itself, the mcp (can be) attached to a deterministic oracle that can return correct information.

AndyNemmity · 2025-12-30T16:51:41 1767113501

But the skill includes the scripts to do things.

So in my nano banana image generation skill, it contains a python script that does all the actual work. The skill just knows how to call the python script.

We're attaching tools to the md files. This is at the granular level of how to hammer a nail, how to use a screw driver, etc. And then the agent, the handyman, has his tool box of skills to call depending on what he needs.

dnautics · 2025-12-31T05:54:07 1767160447

lets say i'm in erlang. you gonna include a script to unpack erlang bytecode across all active modules and look through them for a function call? oorrr... have that code running on localhost:4000 so that its a single invocation away, versus having the llm copypasta the entire script you provided and pray for the best?

AndyNemmity · 2025-12-31T17:52:38 1767203558

The LLM doesn't copy the script, it runs it.

But for sure, there are places it makes sense, and there are places it doesn't. I'm arguing to maximally use it for places that make sense.

People are not doing this. They are leaving everything to the LLM. I am arguing it is better to move everything you possibly can into tools, and have the LLM focus only on the bits that a program doesn't make sense for.

austinbaggio · 2025-12-30T02:04:49 1767060289

In our experience, a lot of it is feel and dev preference. After talking to quite a few developers, we've found the skill was the easiest to get started with, but we also have a CLI tool and an MCP server too. You can check out the docs if you'd prefer to try those - feedback welcome: https://www.ensue-network.ai/docs#cli-tool

dnautics · 2025-12-30T10:13:46 1767089626

yeah but a skill without the mcp server is just going to be super inefficient at certain things.

again going to my example, a skill to do a dependency graph would have to do a complex search. and in some languages the dependency might be hidden by macros/reflection etc which would obscure a result obtained by grep

how would you do this with a skill, which is just a text file nudging the llm whereas the MCP's server goes out and does things.

AndyNemmity · 2025-12-30T16:52:46 1767113566

A skill is not just a text file nudging the llm. You group scripts and programming to the skill, and the skill calls it.

dnautics · 2025-12-31T05:55:32 1767160532

that seems token inefficient. why have the llm do a full round trip. load the skill which contains the potentially hundreds of lines code then copy and paste the code back into the compiler when it could just run it?

not that i care too too much about small amounts of tokens but depleting your context rapidly seems bad. what is the positive tradeoff here?

AndyNemmity · 2025-12-31T17:54:17 1767203657

I don't understand. The Skill runs the tools. In cases where a program can replace the LLM, I think we should maximally do that.

That uses less tokens. The LLM is just calling the script, and getting the response, and then using that to continue to reason.

So I'm not exactly following.

dnautics · 2026-01-03T02:18:27 1767406707

what you are proposing is functionally equivalent to "wrapping an mcp in a cli" which is what I mentioned in my root comment.

wouldbecouldbe · 2025-12-30T00:06:47 1767053207

It seems to mostly ignore Claude.md

songodongo · 2025-12-30T01:26:25 1767057985

You can test how often it is being used by having a line in there saying something like “You must start every non-code response with ‘Woohoo!’”.
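If you save a handful of responses, a tiny tally makes the test less anecdotal. This is only a sketch: `canary_rate` is a made-up helper, and how you collect the responses is left open.

```python
def canary_rate(responses: list[str], marker: str = "Woohoo!") -> float:
    """Fraction of responses that begin with the canary marker."""
    if not responses:
        return 0.0
    # Ignore leading whitespace so an indented reply still counts.
    hits = sum(1 for text in responses if text.lstrip().startswith(marker))
    return hits / len(responses)
```

A rate near 1.0 suggests the file is being followed. A low rate suggests the instruction is getting dropped.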

csar · 2025-12-30T03:30:23 1767065423

It’s told to only use it if relevant because most people write bad ones. Someone should write a tool to assess CLAUDE.md quality.

AndyNemmity · 2025-12-30T00:10:33 1767053433

It does, Claude.md is the least effective way to communicate to it.

It's always interesting reading other people's approaches, because I just find them all so very different than my experience.

I need Agents, and Skills to perform well.

_the_inflator · 2025-12-30T00:15:43 1767053743

I like the finetuning aspect to it quite a lot. It makes sense to me. What I achieved now is a very streamlined process of autonomous work of an agent, which can more and more often be simply managed than controlled on a code review level basis for everything.

I agree that this level of finetuning feels overwhelming and might leave you doubting whether you are using Claude to its optimum, and the beauty is that finetuning and macro usage don't interfere when you stay in your lane.

For example I now don't use the planning agent anymore; instead I incorporated this process into the normal agents, much to the project's advantage. Consistency is key. Anthropic did the right thing.

Codex is quite a different beast and comes from the opposite direction so to say.

I use both, Codex and Claude Opus especially, in my daily work and found them complementary, not mutually exclusive. It is like two different evangelists who are on par, exercising with different tools to achieve a goal that both share.

AndyNemmity · 2025-12-30T01:52:15 1767059535

Yeah, at a certain level, it's just a ton of fun to do. I think that's why so many of us are playing with it.

It's also deeply interesting because it's essentially unsolved space. It's the same excitement as the beginning of the internet.

None of us know what the answers will be.

wouldbecouldbe · 2025-12-30T00:06:08 1767053168

All I use is curse words and it does a damn great job most of the time

lobito25 · 2025-12-30T00:43:09 1767055389

Same here :)))), he's really good at understanding when you're pissed off.

nineteen999 · 2025-12-30T02:26:52 1767061612

I thought I was the only one.

anonzzzies · 2025-12-30T00:30:48 1767054648

Yep, that usually works best.

eterm · 2025-12-29T23:44:24 1767051864

This isn't necessary. Claude will read CLAUDE.md from both:

  1. Current directory ./CLAUDE.md
  2. User directory ~/.claude/CLAUDE.md

I stick general preferences in what it calls "user memory" and stick project specific preferences in the working directory.
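To see which of those two files exist on your machine, here is a small sketch. It only covers the two locations listed above, and the helper names are made up:

```python
from pathlib import Path


def claude_md_candidates(cwd: Path, home: Path) -> list[Path]:
    """The two locations named above: project file first, then user file."""
    return [cwd / "CLAUDE.md", home / ".claude" / "CLAUDE.md"]


def existing_memory_files(cwd: Path, home: Path) -> list[Path]:
    """Return only the candidate files that are actually present."""
    return [path for path in claude_md_candidates(cwd, home) if path.is_file()]
```

Typical use would be `existing_memory_files(Path.cwd(), Path.home())`.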

austinbaggio · 2025-12-29T23:56:46 1767052606

It feels like Claude is taking more of the Android approach of a less opinionated, but more open stack, so people are bending it to the shape they want to match their workflow. I think of the amnesia problem as pretty agent-agnostic, though, knowing what happens while you're delivering product is more of an agent execution layer problem than a tool problem, and it gets bigger when you have swarms coordinating - Jaya wrote a pretty good article about this https://x.com/AustinBaggio/status/2004599657520123933?s=20

AndyNemmity · 2025-12-30T00:09:16 1767053356

I'm the opposite, I find it straight forward to use all these things, and am surprised people aren't getting it.

I've been trying to write blogs explaining it recently, but I don't think I'm very good at making it sound interesting to people.

What can I explain that you would be interested in?

Here was my latest attempt today.

https://vexjoy.com/posts/everything-that-can-be-deterministi...

majormajor · 2025-12-30T00:13:44 1767053624

You say "My Claude Code Setup" but where is the actual setup there? I generally agree with everything about how LLMs should be called you say, but I don't see any concrete steps of changing Claude Code's settings in there? Where are the "35 agents. 68 skills. 234MB of context."? Is the implementation of the "Layer 4" programs intended to be left to the reader? That's hardly approachable.

AndyNemmity · 2025-12-30T00:16:10 1767053770

I got similar feedback with my first blog post on my do router - https://vexjoy.com/posts/the-do-router/

Here is what I don't get. it's trivial to do this. Mine is of course customized to me and what I do.

The idea is to communicate the ideas, so you can use them in your own setup.

It's trivial to put for example, my do router blog post in claude code and generate one customized for you.

So what does it matter to see my exact version?

These are the type of things I don't get. If I give you my details, it's less approachable for sure.

The most approachable thing I could do would be to release individual skills.

Like I have skills for generating images with google nano banana. That would be approachable and easy.

But it doesn't communicate the why. I'm trying to communicate the why.

majormajor · 2025-12-30T00:31:24 1767054684

I just don't have much faith in "if you're doing it right the results will be magically better than what you get otherwise" anymore. Any single person saying "the problems you run into with using LLMs will be solved if you do it my way" has to really wow me if they want me to put in effort on their tips. I generally agree with your why of why you set up like that. I'm skeptical that it will get over the hump of where I still run into issues.

When you've tried 10 ways of doing it but they all end up getting into a "feed the error back into the LLM and see what it suggests next" you aren't that motivated to put that much effort into trying out an 11th.

The current state of things is extremely useful for a lot of things already.

AndyNemmity · 2025-12-30T00:37:29 1767055049

That's completely fair, I also don't have much faith in that anymore. Very often, the people who make those claims have the most basic implementation that barely is one.

I'm not sure if the problems you run into with using LLMs will be solved if you do it my way. My problems are solved doing it my way. If I heard more about your problems, I would have a specific answer to them.

These are the solutions to where I have run into issues.

For sure, but my solutions are not "feed the error back into the LLM". My solutions are varied, but as the blog shows, they come down to moving as much as possible into scripts and deterministic solutions, and keeping the LLM to the smallest possible scope.

The current state of things is extremely useful for a subset of things. That subset of things feels small to me. But it may be every thing a certain person wants to do exists in that subset of things.

It just depends. We're all doing radically different things, and trying very different things.

I certainly understand and appreciate your perspective.

majormajor · 2025-12-30T02:55:04 1767063304

That makes sense.

My basic problem is: "first-run" LLM agent output frequently does one or more of the following: fails to compile/run, fails existing test coverage, or fails manual verification. The first two steps have been pretty well automated by agents: inspect output, try to fix, re-run. IME this works really well for things like Python, less-well for things like certain Rust edge cases around lifetimes and such, or goroutine coordination, which require a different sort of reasoning than "typical" procedural programming.

But let's assume that the agents get even better at figuring out the deal with the more specialized languages/features and are able to iterate w/o interaction to fix things.

If the first-pass output still has issues, I still have concerns. They aren't "I'm not going to use these tools" concerns, because I also sometimes write bugs, and they can write the vast majority of code faster than I can.

But they are "I'm not gonna vibe-code my day job" concerns because the existence of trivially-catchable issues suggests that there's likely harder-to-catch issues that will need manual review to make sure (a) test coverage is sufficient, (b) the mental model being implemented is correct, (c) the outside world is interacted with correctly. And I still find bugs in these areas that I have to fix manually.

This all adds up to "these tools save me 20-30% of my time" (the first-draft coding) vs "these agents save me 90% of my time."
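For scale, time saved converts to a speedup multiplier as 1 / (1 - fraction saved), assuming the same work simply takes less time. A quick sketch of that arithmetic (the function name is just for illustration):

```python
def speedup(fraction_saved: float) -> float:
    """Convert a fraction of time saved into a speedup multiplier."""
    if not 0.0 <= fraction_saved < 1.0:
        raise ValueError("fraction_saved must be in [0, 1)")
    # If a task used to take 1 unit and now takes (1 - fraction_saved),
    # the speedup is the ratio of the old time to the new time.
    return 1.0 / (1.0 - fraction_saved)
```

So 20-30% saved is roughly a 1.25x to 1.43x speedup, 50% saved is 2x, and 90% saved is 10x.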

So I'm kinda at a plateau for a few months where it'll be hard to convince me to try new things to try to close that 20-30% -> 90% number.

AndyNemmity · 2025-12-30T03:53:04 1767066784

I experience the same things. What I’ve found is there is no issue I can’t solve so it doesn’t repeat.

The real issue is I don’t know the issues ahead of time. So each experience is an iteration stopping things I didn’t know would happen.

Thankfully, I’m not trying to sell anyone anything. I don’t even want people to use what I use. I only want people to understand the why of what I do, and how it adds value for me.

I think it’s important to understand this thing we use as best we can.

The personal value you can get, is entirely up to your tolerance for it.

I just enjoy the process

csar · 2025-12-30T03:36:51 1767065811

For new-ish projects it should give you some crazy speed up out of the box.

For large codebases (my own has 500k lines and my company has a few tens of millions) you need something better like RPI.

If nothing else just being able to understand code questions basically instantly should give you a large speed up, even without any fancy stuff.

ok_dad · 2025-12-30T02:25:16 1767061516

Damn, it really is all just vibes eh? Everyone just vibes their way to coding these days, no proof AI is actually doing anything for you. It's basically just how someone feels now: that's reality.

In some sense, computers and digital things have now just become a part of reality, blending in by force.

AndyNemmity · 2025-12-30T02:37:47 1767062267

I mean, it’s not vibes. I make real projects, and the failures of AI doing it force me to make fixes so that it only ever fails doing that thing once. Then it no longer fails to do that thing.

But the things I am doing might not be the things you are doing.

If you want proof, I intend to release a game to the App Store and steam soon. At that point you can judge if it built a thing adequately.

ok_dad · 2025-12-30T02:50:51 1767063051

No offense intended, I don't even know you at all, but I see people claim things like you did so often these days that I begin to question reality. These claims always have some big disclaimer, as yours does. I still don't know a single personal acquaintance who has claimed even a 2x improvement on general coding efficiency, not even 1.5x in general efficiency. Some of my coworkers say AI is good for this or that, but I literally just waste my time and money when I use it, I've never gotten good results or even adequate results to continue trying. I feel like I am taking crazy pills sometimes with all of the hype!

I hope you're just one of the ones who figured it out early and all the hype isn't fake bullshit. I'd much rather be proven wrong than for humanity to have wasted all this time and resources.

AndyNemmity · 2025-12-30T03:56:22 1767066982

I think the correct approach is to be skeptical. You should push back.

I think of this stuff as trivial to understand from my point of view. I am trying to share that.

I have nothing to sell, I don’t want anyone to use my exact setup.

I just want to communicate the value as I see it, and be understood.

The vast majority of it all is complete bullshit, so of course I am not offended that I may sound like 1000 other people trying to get you to download my awesome Claude Code Plugins repo.

Except I’m not actually providing one lol

ok_dad · 2025-12-30T05:07:31 1767071251

Yea sorry if I did a bit of a rant there.

AndyNemmity · 2025-12-30T06:26:17 1767075977

Nah, you’re good. We’re all working through this craziness together

minimaxir · 2025-12-29T23:41:16 1767051676

With Opus 4.5 in Claude Code, I'm doing fine with just a (very detailed) CLAUDE.md.

austinbaggio · 2025-12-29T23:52:01 1767052321

Do you find you want to share the .md with the teams you work with? Or is it more for your solo coding?

Myrmornis · 2025-12-30T03:11:47 1767064307

Not saying you were suggesting it but people committing AGENTS.md in shared repos is pretty annoying IMO. Those things are personal.

lukev · 2025-12-30T00:00:25 1767052825

A claude.md file will give you 90% of what you need.

Consider more when you're 50+ hours in and understand what more you want.

AndyNemmity · 2025-12-30T00:11:13 1767053473

In my experience, I'm at the point where I entirely ignore Claude.md - so it's very interesting how many people have very different experiences.

austinbaggio · 2025-12-30T00:58:39 1767056319

It is overwhelming. We have support for Cursor mcp as well, but you lose a lot of the auto-magic stuff you get with the Claude Code plugin. Unfortunately, skills are pretty sticky to the Claude Code stack. It is kind of the vim of AI coding agents. . . One of the goals for this tool was to address context management in a single place. i.e instead of setting up all of the rules, claude.md, and skill.md you just semantic query a specific namespace in your knowledge base.

the docs if you are curious: https://www.ensue-network.ai/docs

pigpop · 2025-12-30T00:01:08 1767052868

You don't need all that, just have Claude write the same documentation you would (should) write for any project. I find it best to record things chronologically and then have Claude do periodic reviews of the docs and update key design documents and roadmap milestones. The best part is you get a written record of everything that you can review when you need to remember when and why something changed. They also come in handy for plan mode since they act as a guide to the existing code.

The PMs were right all along!

einsteinx2 · 2025-12-30T01:39:22 1767058762

I use both Cursor and Claude Code in VS Code at work (so I get similar control as Cursor). I don’t really use Claude Code any differently than cursor. People way over complicate it.

csar · 2025-12-30T03:28:14 1767065294

Claude Code is better out of the box, so all that other stuff is orthogonal or optional. If you eg want to give your agent access to your company’s Notion docs you need a skill.

metadat · 2025-12-29T23:53:33 1767052413

Don't forget about the co-agents.. yeah.

animitronix · 2025-12-30T00:22:06 1767054126

Nope, I spend time learning my tools.