Grok 4.20-beta1 scores above GPT-5.4-high and just behind Opus 4.6 on LMArena fo... | Hacker News


thebigspacefuck 66 days ago | on: Elon Musk pushes out more xAI founders as AI codin...

Grok 4.20-beta1 scores above GPT-5.4-high and just behind Opus 4.6 on LMArena for Text: https://arena.ai/leaderboard

I guess for coding if you’re not first you’re last, but this is damn impressive considering. It looked like they pulled the coding model from the benchmarks, but it was similar.

grim_io 66 days ago

According to https://artificialanalysis.ai, it's around the level of Gemini Flash 3 or some of the Chinese open-weight models, like GLM 5.

For all the money burned, I am not impressed. Why would I use Mecha Hitler for almost double the cost of Gemini Flash 3?
