
Claude 3.7, Grok 3 DeepThink and QwQ-32B Thinking still get it wrong!

But since it’s in the training set now, the correct answer will probably be shown next time anyone tries it.


