
If I’m coordinating a large codebase, I expect the people I’m coordinating to be capable of learning and improving over time. Coding agents cannot (currently) do this.

I wonder if a very lightweight RL loop built around the user could work well enough to help the situation. As I understand it, though, current LLMs generally do not learn fast enough for a single bad example and a single (prompted?) better example to produce improvement at anywhere near human speed.




