AI system performs better than 75 percent of American adults on standard visual intelligence test

Could shrink the gap between computer and human cognition
January 20, 2017

An example question from the Raven’s Progressive Matrices standardized fluid-intelligence test.* (credit: Ken Forbus)

A Northwestern University team has developed a new visual problem-solving computational model that performs in the 75th percentile for American adults on a standard intelligence test.

The research is an important step toward making artificial-intelligence systems that see and understand the world as humans do, says Ken Forbus, Walter P. Murphy Professor of Electrical Engineering and Computer Science at Northwestern’s McCormick School of Engineering. The research was published online in January 2017 in the journal Psychological Review.

The new computational model** is built on CogSketch, an AI platform previously developed in Forbus’ laboratory. The platform can solve visual problems and understand sketches in order to give immediate, interactive feedback. CogSketch also incorporates a computational model of analogy, based on Northwestern psychology professor Dedre Gentner’s structure-mapping engine.

The ability to solve complex visual problems is one of the hallmarks of human intelligence. Developing artificial intelligence systems that have this ability provides new evidence for the importance of symbolic representations and analogy in visual reasoning, and it could shrink the gap between computer and human cognition, the researchers suggest.

A nonverbal fluid-intelligence test

The researchers tested the AI system on Raven’s Progressive Matrices, a nonverbal standardized test that measures abstract reasoning.*** All of the test’s problems consist of a matrix with one image missing. The test taker is given six to eight choices for completing the matrix.

“The problems that are hard for people are also hard for the model, providing additional evidence that its operation is capturing some important properties of human cognition,” said Forbus.

“The Raven’s test is the best existing predictor of what psychologists call ‘fluid intelligence,’ or the general ability to think abstractly, reason, identify patterns, solve problems, and discern relationships,” said co-author Andrew Lovett, now a researcher at the U.S. Naval Research Laboratory. “Our results suggest that the ability to flexibly use relational representations, comparing and reinterpreting them, is important for fluid intelligence.”

“Most artificial intelligence research today concerning vision focuses on recognition, or labeling what is in a scene rather than reasoning about it,” Forbus said. “But recognition is only useful if it supports subsequent reasoning. Our research provides an important step toward understanding visual reasoning more broadly.”

* The test taker should choose answer D because the relationships between it and the other elements in the bottom row are most similar to the relationships between the elements of the top rows.

** “The reader may download (Windows) the computational model and run it on example problems,” the authors note in their Psychological Review paper.

*** Raven’s Progressive Matrices (RPM) is an intelligence test that “requires that participants compare images in a (usually) 3×3 matrix, identify a pattern across the matrix, and solve for the missing image.” … Designed to measure a subject’s fluid intelligence, “it has remained popular for decades because it is highly successful at predicting a subject’s performance on other ability tests — not just visual tests, but verbal and mathematical as well,” the authors suggest in their Psychological Review paper.


Abstract of Modeling visual problem solving as analogical reasoning

We present a computational model of visual problem solving, designed to solve problems from the Raven’s Progressive Matrices intelligence test. The model builds on the claim that analogical reasoning lies at the heart of visual problem solving, and intelligence more broadly. Images are compared via structure mapping, aligning the common relational structure in 2 images to identify commonalities and differences. These commonalities or differences can themselves be reified and used as the input for future comparisons. When images fail to align, the model dynamically rerepresents them to facilitate the comparison. In our analysis, we find that the model matches adult human performance on the Standard Progressive Matrices test, and that problems which are difficult for the model are also difficult for people. Furthermore, we show that model operations involving abstraction and rerepresentation are particularly difficult for people, suggesting that these operations may be critical for performing visual problem solving, and reasoning more generally, at the highest level. (PsycINFO Database Record (c) 2016 APA, all rights reserved)
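
As a rough illustration of the idea in the abstract (compare images, treat the differences as objects in their own right, then compare those differences), here is a toy Python sketch. It is not the authors' model and does not use CogSketch or the structure-mapping engine. It assumes each image has already been hand-coded as a small dictionary of attributes, and every name in it (diff, row_pattern, solve, the "shape" and "count" attributes) is hypothetical and chosen only for this example.

```python
# Toy sketch only: images are hand-coded attribute dicts rather than sketches,
# and comparison is a plain attribute diff rather than structure mapping.

def diff(a, b):
    """Return the set of changes from image a to image b.

    Numeric attributes record their delta; other attributes just record
    that they changed.
    """
    changes = set()
    for key in a:
        if a[key] != b[key]:
            numeric = isinstance(a[key], int) and isinstance(b[key], int)
            changes.add((key, b[key] - a[key] if numeric else "changed"))
    return frozenset(changes)

def row_pattern(row):
    """Describe a row as the list of changes between adjacent images."""
    return [diff(x, y) for x, y in zip(row, row[1:])]

def pattern_similarity(p, q):
    """Score two row patterns: shared changes count for, unshared against."""
    return sum(len(s & t) - len(s ^ t) for s, t in zip(p, q))

def solve(complete_rows, partial_row, choices):
    """Return the index of the choice whose completed row best matches
    the change patterns of the complete rows."""
    references = [row_pattern(r) for r in complete_rows]

    def score(choice):
        candidate = row_pattern(partial_row + [choice])
        return sum(pattern_similarity(ref, candidate) for ref in references)

    return max(range(len(choices)), key=lambda i: score(choices[i]))

def img(shape, count):
    return {"shape": shape, "count": count}

# A made-up 3x3 problem: each row keeps its shape and adds one object per step.
complete_rows = [
    [img("circle", 1), img("circle", 2), img("circle", 3)],
    [img("square", 1), img("square", 2), img("square", 3)],
]
partial_row = [img("triangle", 1), img("triangle", 2)]
choices = [img("triangle", 2), img("triangle", 1),
           img("circle", 3), img("triangle", 3)]

answer = solve(complete_rows, partial_row, choices)
print(answer)  # index 3, the image with three triangles
```

The second-order step is in pattern_similarity: it never compares images directly, only the change descriptions produced by diff. That is a much cruder stand-in for the reified commonalities and differences the abstract describes, and the sketch has no counterpart to the model's rerepresentation of images that fail to align.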