I got turned off by the first paragraph's misuse of the term "back pressure". "Back pressure" is a term from data engineering used specifically to indicate a feedback signal that a service is overloaded and that clients should adapt their behavior.
Backpressure != feedback (the more general term). And in the agentic world, we use the term 'context' to describe information used to help LLMs make decisions, where the context data is not part of the LLM's training data. Then we have verifiable tasks (what he is really talking about), where RL in post-training uses feedback signals from a harness environment to learn about type systems, programming language syntax/semantics, etc.
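For readers who haven't met the term in its data-engineering sense, here is a minimal sketch (my own illustration, not from the blog post): a bounded queue between a producer and a slow consumer, where a full queue is the signal that tells the producer to adapt.

```python
import queue

# A bounded queue between producer and consumer. When the consumer
# falls behind, the queue fills up, and the producer receives a
# signal (queue.Full) that the downstream is overloaded.
buf = queue.Queue(maxsize=3)

accepted, rejected = 0, 0
for item in range(10):
    try:
        buf.put_nowait(item)  # non-blocking: fails fast when full
        accepted += 1
    except queue.Full:
        # Backpressure: the producer must adapt -- slow down,
        # buffer elsewhere, or drop the item.
        rejected += 1

# With no consumer draining the queue, only maxsize items fit.
print(accepted, rejected)  # prints: 3 7
```

The point is that the signal flows *upstream* and changes the producer's behavior; that is what distinguishes backpressure from feedback in general.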
The term back pressure actually comes from mechanical engineering in the context of steam engines.
It first appeared in a dictionary 160 years ago.
Words are just words. Mathematicians understand very well that words mean nothing; what matters are definitions, and the author provides one.
E.g. the natural numbers may or may not include the number 0, but that's irrelevant, because what mathematicians care about are definitions: they will state that the natural numbers are a given set of whole numbers (including 0 or not) and avoid arguing about labels. You can call them funky numbers or neat numbers; it doesn't matter.
The same applies here. Your comment is pointless, because the author does provide a definition of back pressure in the context of his blog post, and what matters is discussing the concept he so labels in the context of LLMs.
We all live in our own small circles, in which many terms get misused. "Isomorphic" in front-end circles means something completely different from any other use, for example. This is how languages evolve.
I'm not trying to discount any attempt to correct people, especially when it gets confusing (like here; I was also confused, honestly), but we could formulate it more nicely, IMHO.
It is perhaps more generally known in the plumbing sense of pressure resisting the desired direction of flow, but yeah, a poor word choice... at least it isn't AI-written, though.