>LLMs using code to answer questions is nothing new. It's why the "how many Rs in strawberry" question doesn't trip them up anymore: they can write a few lines of Python to answer it, run that, and return the answer.
False. It has nothing to do with tool use but just reasoning.
Oh right, you're focused specifically on the strawberry problem. I just gave that as a throwaway example. It's a solution, but not necessarily the solution for something that simple.
My point was much more general: code execution is a key part of these models' ability to perform maths, do analysis, and provide precise answers. It's not the only way, but it's a key one, and it's very efficient compared to spending more inference on chain-of-thought (CoT).
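To make that concrete, here's a rough sketch of the kind of throwaway Python a model with a code-execution tool might write. The function name `count_letter` and the fraction example are my own illustration, not anything a particular model is known to emit. The point is only that counting and exact arithmetic are trivial once the work is handed to an interpreter.

```python
from fractions import Fraction


def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


# The "strawberry" question becomes a one-line lookup.
r_count = count_letter("strawberry", "r")
print(r_count)

# Exact arithmetic: rational numbers avoid floating-point rounding.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
print(exact_sum)

# The same sum in binary floating point is not exactly 0.3.
print(0.1 + 0.2 == 0.3)
```

None of this says a model needs the tool for the strawberry case. It only shows why handing off to code is cheap when an exact answer matters.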
False. It has nothing to do with tool use but just reasoning.