I find it more helpful to keep in mind the autoregressive nature rather than the prediction. 'Text prediction' brings to mind that it is guessing what word would follow another word, but it is doing so much more than that. 'Autoregressive' brings to mind that it is using its previously generated output to create new output every time it comes up with another token. In that case you immediately understand that it would have to make the determination of severity after it has generated the description of the issue.