New Evidence AI Thinks Like Humans. So Why Can’t It Reason?
Researchers at Apple recently released a paper saying that the current artificially “intelligent” reasoning models can’t reason. This paper was celebrated by some, criticised by others, and discussed so much that no one noticed another paper, released at almost the same time, which says that these models think, in some sense, very much like humans. The logical conclusion, it seems to me, is that humans can’t reason, which really explains a lot, doesn’t it.
More seriously, are we maybe expecting too much of these AIs? Should we just settle for them being as stupid as we are? Today I’ll have a look at those papers, and then try to reason about them.
Let me start with the paper you probably haven’t heard of. The authors looked at how large language models (LLMs) and their visual counterparts classify images. They gave both humans and the AIs sets of three images and asked them to identify the odd one out. By repeating this many times, the authors could construct a measure for the “similarity” of images.
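To make that concrete, here is a minimal sketch (my illustration, not the authors’ actual code) of how repeated odd-one-out judgments can be aggregated into a pairwise similarity score: two images count as more similar the more often they are kept together, that is, the more often neither of them is picked as the odd one out.

```python
from itertools import combinations
from collections import defaultdict

def similarity_from_triplets(triplet_judgments):
    """Estimate pairwise similarity from odd-one-out judgments.

    triplet_judgments: list of (image_a, image_b, image_c, odd_one_out),
    where odd_one_out is whichever of the three was picked as least similar.
    This data format is hypothetical, for illustration only.
    """
    together = defaultdict(int)   # how often a pair was kept together
    appeared = defaultdict(int)   # how often a pair appeared in a triplet

    for a, b, c, odd in triplet_judgments:
        for x, y in combinations((a, b, c), 2):
            pair = tuple(sorted((x, y)))
            appeared[pair] += 1
            if odd not in (x, y):  # neither image was the odd one out
                together[pair] += 1

    # similarity = fraction of shared triplets in which the pair was kept together
    return {pair: together[pair] / appeared[pair] for pair in appeared}

# Example: 'cat' and 'dog' end up as similar because 'car' is always the odd one out.
judgments = [
    ("cat", "dog", "car", "car"),
    ("cat", "car", "dog", "car"),
    ("dog", "car", "cat", "car"),
]
print(similarity_from_triplets(judgments))
# {('cat', 'dog'): 1.0, ('car', 'cat'): 0.0, ('car', 'dog'): 0.0}
```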
They found that the models “develop human-like conceptual representations of objects.” Even more amazingly, they also compared the activity patterns in the model network with those in the human brain, and found a “strong alignment between model embeddings and neural activity patterns in brain regions… This provides compelling evidence that the object representations in LLMs, although not identical to human ones, share fundamental similarities that reflect key aspects of human conceptual knowledge.”
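The reported “alignment” is, as I understand it, a comparison of similarity patterns: you compute how similar each pair of objects is in the model’s embedding space, do the same for the brain recordings, and check whether the two patterns correlate. A toy sketch with made-up numbers (nothing here comes from the paper’s actual pipeline):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Made-up stand-ins: 20 objects, 64-dim model embeddings, 100 "voxels" of brain data
model_embeddings = rng.normal(size=(20, 64))
brain_responses = rng.normal(size=(20, 100))

# Pairwise dissimilarity of the same 20 objects, once per representation
model_rdm = pdist(model_embeddings, metric="correlation")
brain_rdm = pdist(brain_responses, metric="correlation")

# If the representations are aligned, the two dissimilarity patterns correlate
rho, p = spearmanr(model_rdm, brain_rdm)
print(f"alignment (Spearman rho): {rho:.2f}")
```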
So this paper says that the current AIs indeed learn to think like humans, at least in some sense. Now let’s talk about the headline-making paper from Apple. Its authors looked at what goes on in large reasoning models as the problems they are given become more complex.
A large reasoning model is an LLM with chain-of-thought reasoning. This means that the model has extra instructions and extra training to break down a prompt into smaller steps, work through those steps separately, and then analyse and recombine their results. Reasoning models aren’t so much new models as beefed-up versions of the original LLMs. Chains of thought can indeed greatly improve the accuracy of responses, but does this mean that the models actually do something conceptually new?
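As a rough illustration, here is what the difference looks like in hypothetical code; `call_llm` is a stand-in for whatever completion function one actually uses, not any vendor’s real API:

```python
def call_llm(prompt: str) -> str:
    # Stand-in for an actual LLM client; plug in your own here.
    raise NotImplementedError("connect this to a real LLM API")

def plain_answer(question: str) -> str:
    # An ordinary LLM call: one prompt in, one answer out.
    return call_llm(question)

def reasoning_answer(question: str) -> str:
    # A chain-of-thought-style pipeline: break the problem into sub-steps,
    # work through each sub-step separately, then combine the results.
    plan = call_llm(f"Break this problem into numbered sub-steps:\n{question}")
    step_results = [
        call_llm(f"Problem: {question}\nSolve this sub-step:\n{step}")
        for step in plan.splitlines() if step.strip()
    ]
    return call_llm(
        "Combine these intermediate results into a final answer:\n"
        + "\n".join(step_results)
    )
```

Note that the second function doesn’t add any new machinery; it just spends more of the same kind of computation on intermediate steps.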