Hacker News

Grok 4.20-beta1 scores above GPT-5.4-high and just behind Opus 4.6 on LMArena for Text https://arena.ai/leaderboard

I guess for coding, if you’re not first you’re last, but this is damn impressive considering. It looked like they pulled the coding model from the benchmarks, but it scored similarly.



According to https://artificialanalysis.ai, it's around Gemini Flash 3, or some of the Chinese open-weight models, like GLM 5.

For all the money burned, I am not impressed. Why would I use Mecha Hitler for almost double the cost of Gemini Flash 3?



