It has often been claimed, and in some cases demonstrated, that training LLMs on their own outputs degrades quality over time. Still, I find it likely that in well-measurable domains, gains from RLVR (reinforcement learning with verifiable rewards) will outweigh the capability loss from "slop" when training new models.
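
To make the intuition concrete: one common RLVR-style recipe is rejection sampling against a verifier, where a model's own outputs are kept for further training only if they pass an automatic check. Here is a minimal, self-contained sketch of that filtering step; the model and verifier are toy stand-ins (arithmetic with injected errors), and all names are illustrative, not any particular lab's pipeline.

    import random

    # Toy stand-in for an LLM: proposes answers to arithmetic prompts,
    # sometimes wrong -- the "slop" that degrades naive self-training.
    def propose_answer(a: int, b: int) -> int:
        noise = random.choice([0, 0, 0, 1, -1])  # sometimes off by one
        return a + b + noise

    # Verifiable reward: 1 if the answer is exactly correct, else 0.
    # This is what makes the domain "well-measurable".
    def verify(a: int, b: int, answer: int) -> int:
        return int(answer == a + b)

    def collect_training_data(n_prompts: int, samples_per_prompt: int):
        """Rejection-sample the model's own outputs, keeping only
        verifier-approved examples for the next training round."""
        kept = []
        for _ in range(n_prompts):
            a, b = random.randint(0, 99), random.randint(0, 99)
            for _ in range(samples_per_prompt):
                ans = propose_answer(a, b)
                if verify(a, b, ans):
                    kept.append((f"{a} + {b} = ?", str(ans)))
                    break  # one verified example per prompt suffices here
        return kept

    if __name__ == "__main__":
        data = collect_training_data(n_prompts=1000, samples_per_prompt=4)
        print(f"kept {len(data)} verified examples out of 1000 prompts")

Because only outputs that pass the verifier feed back into training, the degradation loop never sees the model's mistakes; that filtering is why verifiable domains plausibly escape the collapse seen with unfiltered self-training.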

