To me, it's weird to call it "PhD-level". To me, that means being able to take in the existing information on a very niche area and "push the boundary". I might be wrong, but to date I've never seen any LLM invent "new science", which is what really makes a PhD a PhD. It also seems very confusing to me that many sources mention "stone age" and "PhD-level" in the same article. Which one is it?
People seem to overcomplicate what LLMs are capable of, but at their core they are just really good word parsers.
Most of the PhDs I know are studying things that I guarantee GPT-5 doesn’t know about… because they’re researching novel stuff.
Also, LLMs aren’t very consistent in how well they apply the knowledge they supposedly have. Hence the “lots of almost correct code” stereotype that’s been going around.
I was using the fancy new Claude model yesterday to debug some fast-check tests (fast-check is a QuickCheck-inspired TypeScript library). Claude absolutely could not wrap its head around the shrinking behavior, which rendered it useless for debugging.
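For anyone unfamiliar with shrinking: when a property-based test fails on some random input, the library tries progressively "smaller" inputs that still fail, so the counterexample it reports is close to minimal. Here is a rough hand-rolled sketch of the idea for integers. To be clear, this is not fast-check's actual algorithm or API; `shrinkCandidates`, `shrink`, and `isBelow42` are made-up names used only for illustration.

```typescript
// Hand-rolled illustration of shrinking. NOT fast-check's real implementation.

// Propose "smaller" values for a failing integer: 0 first,
// then values that step from n/2 back up toward n.
function shrinkCandidates(n: number): number[] {
  const out: number[] = [];
  if (n === 0) return out;
  out.push(0);
  let delta = Math.trunc(n / 2);
  while (delta !== 0) {
    out.push(n - delta);
    delta = Math.trunc(delta / 2);
  }
  return out;
}

// Repeatedly move to the first candidate that still fails the property,
// and stop once no candidate fails any more.
function shrink(failing: number, property: (n: number) => boolean): number {
  let current = failing;
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (const candidate of shrinkCandidates(current)) {
      if (!property(candidate)) {
        current = candidate;
        progressed = true;
        break;
      }
    }
  }
  return current;
}

// Hypothetical property: "every n is below 42". It fails for any n >= 42.
const isBelow42 = (n: number): boolean => n < 42;

// Pretend the random generator happened to find 1000 as a failing input.
const minimal = shrink(1000, isBelow42);
console.log(minimal); // 42, the smallest input that still fails
```

The part that tends to confuse people when debugging is that the value in the final report (42 here) is usually not the value that first triggered the failure (1000 here), and the property gets re-run many times along the way.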