100%, this has been my main workflow for a few years now when interacting with any LLM. Whenever people claim they struggle to make LLMs part of their workflow, I ask them to show me how they solve a small problem with the AI, and they invariably walk me through the same sub-optimal loop GP describes:
1. Ask a question / present a problem, usually without enough context about the problem and the solution space they want to zero in on.
2. The AI does an honest job given that context, but is misaligned in some specific way the user never clarified up front.
3. Ask the AI to correct for this, along with multiple other requested changes toward the solution they want.
4, 5, 6. Loop. They get a response, like the corrections (sometimes), and keep making changes in a back-and-forth conversational style, copying corrected code blocks out and pasting specific code chunks back in for correction.
7+. The output gets progressively worse, undoing corrections/changes/modifications that were previously discussed.
At this point I try to interrupt the spiraling death loop and ask the user:
- (rhetorical) why are you talking to the AI like a human being?
- What is in your context window at this point in the conversation?
If they can answer the context window question, AND understand how the AI ingests input and produces output, it's usually a lightbulb moment. If they don't quite realize they are polluting their context window, I try to get them to see that everything in the context window is statistically weighted and will affect the output: tainted input lowers the odds of untainted output. You want to provide high-quality context window input, and ideally fully control it. That means you do NOT want to have a conversation with the AI for real work; you need to embrace `zero shotting` everything you ask. This plays directly to what these models are trained on, how they are trained, and how they `understand` things.
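To make the `zero shotting` habit concrete, here is a minimal sketch (the function and field names are my own, purely illustrative): instead of appending a correction as yet another conversational turn, fold every clarified requirement back into one fresh, self-contained prompt and regenerate from scratch.

```python
# Illustrative sketch: maintain the spec OUTSIDE the chat, and rebuild
# one curated prompt per attempt instead of accumulating correction turns.

def build_prompt(task: str, constraints: list[str], code: str) -> str:
    """Assemble a single self-contained prompt with everything the model needs."""
    lines = [f"Task: {task}", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    lines += ["Current code:", code]
    return "\n".join(lines)

# Attempt 1 missed a constraint the user never stated.
constraints = ["must be pure Python", "no global state"]

# Instead of replying "no, also make it thread-safe" in the same thread,
# add the constraint to the spec and regenerate ONE clean prompt.
constraints.append("must be thread-safe")
prompt = build_prompt("write a counter class", constraints, "class Counter: ...")
print(prompt)
```

Every retry starts from a fully controlled context window; nothing from the rejected attempt leaks into the next one.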
This requires a lot more hand-holding and prompt curation, i.e. prompt engineering, than people will honestly realize or admit to. Prompt engineering isn't black magic; it's intelligent contextualization that plays to the strengths of the implicit knowledge the model has. Worst things for an LLM super user?
- copy-paste tedium (doing it by hand)
- RAG auto-compression (letting an algorithm determine critical context decisions)
- opaque context window systems (how is the conversation stored and presented to the LLM each turn?)
- system prompt inaccessibility in certain online providers (the system prompt is still super critical for steering the model)
- general `magic` behavior exhibited when using a plain/simple chat interface (this is usually unraveled ONLY by understanding the full context window)
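On the `opaque context window` point: most chat interfaces re-send the entire message history to the model on every turn, so a bad early turn keeps being weighted in every later completion. A rough sketch of that mechanic (illustrative only, not any specific provider's API):

```python
# Toy model of a chat session: the FULL message list is presented to the
# model each turn, so rejected early output never really leaves the input.

history = [{"role": "system", "content": "You are a coding assistant."}]

def send(user_msg: str, fake_reply: str) -> list[dict]:
    """Append the user turn, 'call the model' on the full history, append reply."""
    history.append({"role": "user", "content": user_msg})
    # A real provider receives ALL of `history` here, every single turn.
    history.append({"role": "assistant", "content": fake_reply})
    return history

send("Write a parser.", "def parse(): ...  # wrong approach")
send("No, use a state machine.", "def parse(): ...  # state machine")

# Turn 3 would still carry the rejected turn-1 code in its input:
tokens_ish = sum(len(m["content"].split()) for m in history)
print(len(history), tokens_ish)  # history only ever grows
```

This is why starting a fresh, curated prompt beats a long correction thread: it is the only reliable way to get the tainted turns out of the input.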
The only LLM that has been SUCCESSFUL at conversing with me and maintaining state through a flowing conversation has been the newest Gemini 2.5 Pro offering, and ONLY up to about 100K of its 1M context window. I have seen (very minor) forgetting after 100K; I dug into the conversation at that point to understand what was going on, and it appears the conversation compression is lossy in some way, dropping bits of the conversation.
Every other LLM has offered the facade of maintaining conversation state, but only Gemini 2.5 Pro Preview has actually held it up (with firm limitations!). I suspect large-context-window optimization/compression is to blame; some providers are aggressive with it.