Analogical reasoning, the ability to solve unfamiliar problems by drawing parallels with known ones, has long been regarded as a distinctively human cognitive function. However, a study conducted by UCLA psychologists presents findings that might push us to rethink this.
GPT-3: Matching Up to Human Intellect?
The UCLA research found that GPT-3, an AI language model developed by OpenAI, demonstrates reasoning capabilities almost on par with college undergraduates, especially when tasked with solving problems akin to those seen in intelligence tests and standardized exams like the SAT. This revelation, published in the journal Nature Human Behaviour, raises an intriguing question: Does GPT-3 emulate human reasoning due to its extensive language training dataset, or is it tapping into an entirely novel cognitive process?
The exact workings of GPT-3 remain concealed by OpenAI, leaving the researchers at UCLA inquisitive about the mechanism behind its analogical reasoning skills. Despite GPT-3's laudable performance on certain reasoning tasks, the tool isn’t without its flaws. Taylor Webb, the study's primary author and a postdoctoral researcher at UCLA, noted, “While our findings are impressive, it's essential to stress that this system has significant constraints. GPT-3 can perform analogical reasoning, but it struggles with tasks trivial for humans, such as utilizing tools for a physical task.”
GPT-3 was tested on problems inspired by Raven’s Progressive Matrices, a test involving intricate shape sequences. Because GPT-3 processes only text, Webb converted the images to a text format it could read, which also ensured the problems were entirely new to the AI. Compared with 40 UCLA undergraduates, GPT-3 not only held its own against the human participants but also mirrored the mistakes they made. The model solved 80% of the problems correctly, above the average human score yet within the range of the top human performers.
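To make the idea of turning a visual matrix puzzle into text more concrete, here is a minimal sketch in Python. It is an illustration only, not the researchers' actual encoding, which this article does not describe: the bracketed layout, the `[?]` placeholder for the missing cell, and the function name `encode_matrix_problem` are all assumptions.

```python
def encode_matrix_problem(rows):
    """Render a grid puzzle as plain text, one line per row.

    Each cell is a tuple of integers standing in for visual attributes
    (a hypothetical encoding); None marks the cell the solver must fill in.
    """
    lines = []
    for row in rows:
        cells = []
        for cell in row:
            if cell is None:
                cells.append("[?]")  # the missing cell to be predicted
            else:
                cells.append("[" + " ".join(str(v) for v in cell) + "]")
        lines.append(" ".join(cells))
    return "\n".join(lines)


# A toy 3x3 problem: the value is constant within each row.
problem = [
    [(1,), (1,), (1,)],
    [(2,), (2,), (2,)],
    [(3,), (3,), None],
]

prompt = encode_matrix_problem(problem)
print(prompt)
```

A text prompt of this kind could be handed to a language model, with its completion compared against the intended answer (here, `[3]`).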
The team further probed GPT-3’s prowess using unpublished SAT analogy questions, with the AI outperforming the human average. However, it faltered slightly when attempting to draw analogies from short stories, although the newer GPT-4 model showed improved results.
Bridging the AI-Human Cognition Divide
UCLA's researchers aren't stopping at mere comparisons. They have developed their own computer model inspired by human cognition and have been comparing its abilities with those of commercial AI models. Keith Holyoak, a UCLA psychology professor and co-author, remarked, “Our psychological AI model outshined others in analogy problems until GPT-3's latest upgrade, which displayed superior or equivalent capabilities.”
However, the team identified areas where GPT-3 lagged, especially tasks requiring an understanding of physical space. In challenges involving tool use, GPT-3's proposed solutions were well off the mark.
Hongjing Lu, the study’s senior author, expressed amazement at the leaps in technology over the past two years, particularly in AI's capability to reason. But, whether these models genuinely “think” like humans or simply mimic human thought is still up for debate. The quest for insights into AI's cognitive processes necessitates access to the AI models' backend, a leap that could shape AI's future trajectory.
Echoing the sentiment, Webb concludes, “Access to GPT models' backend would immensely benefit AI and cognitive researchers. Currently, we're limited to inputs and outputs, and it lacks the decisive depth we aspire for.”