When machines learn like humans

Probabilistic programs pass the “visual Turing test”
December 10, 2015

Humans and machines were given an image of a novel character (top) and asked to produce new versions. A machine generated the nine-character grid on the left (credit: Jose-Luis Olivares/MIT — figures courtesy of the researchers)

A team of scientists has developed an algorithm that captures human learning abilities, enabling computers to recognize and draw simple visual concepts that are mostly indistinguishable from those created by humans.

The work by researchers at MIT, New York University, and the University of Toronto, which appears in the latest issue of the journal Science, marks a significant advance in the field — one that dramatically shortens the time it takes computers to “learn” new concepts and broadens their application to more creative tasks, according to the researchers.

“Our results show that by reverse-engineering how people think about a problem, we can develop better algorithms,” explains Brenden Lake, a Moore-Sloan Data Science Fellow at New York University and the paper’s lead author. “Moreover, this work points to promising methods to narrow the gap for other machine-learning tasks.”

The paper’s other authors are Ruslan Salakhutdinov, an assistant professor of Computer Science at the University of Toronto, and Joshua Tenenbaum, a professor at MIT in the Department of Brain and Cognitive Sciences and the Center for Brains, Minds and Machines.

When humans are exposed to a new concept (such as a new piece of kitchen equipment, a new dance move, or a new letter in an unfamiliar alphabet), they often need only a few examples to understand its make-up and recognize new instances. But machines typically need to be given hundreds or thousands of examples to perform with similar accuracy.

“It has been very difficult to build machines that require as little data as humans when learning a new concept,” observes Salakhutdinov. “Replicating these abilities is an exciting area of research connecting machine learning, statistics, computer vision, and cognitive science.”

Salakhutdinov helped to launch recent interest in learning with “deep neural networks,” in a paper published in Science almost 10 years ago with his doctoral advisor Geoffrey Hinton. Their algorithm learned the structure of 10 handwritten character concepts — the digits 0-9 — from 6,000 examples each, or a total of 60,000 training examples.

Bayesian Program Learning

Simple visual concepts for comparing human and machine learning. 525 (out of 1623) character concepts, shown with one example each. (credit: Brenden M. Lake et al./Science)

In the work appearing in Science this week, the researchers sought to shorten the learning process and make it more akin to the way humans acquire and apply new knowledge: learning from a small number of examples and performing a range of tasks, such as generating new examples of a concept or generating whole new concepts.

To do so, they developed a “Bayesian Program Learning” (BPL) framework, where concepts are represented as simple computer programs. For instance, the form of the letter “A” is represented by computer code that generates examples of that letter when the code is run. Yet no programmer is required during the learning process. Also, these probabilistic programs produce different outputs at each execution. This allows them to capture the way instances of a concept vary, such as the differences between how different people draw the letter “A.”
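To make the idea of a concept-as-program concrete, here is a minimal Python sketch of a probabilistic program for an “A”-like character. It is an illustration only and not the researchers’ code: the stroke template, the `jitter` parameter, and the function name `draw_character` are all assumptions invented for this example. Each run perturbs the same underlying strokes with random noise, so every execution yields a slightly different instance of the same letter.

```python
import random

# Hypothetical template for an "A"-like character: three strokes, each a
# pair of (x, y) endpoints on a unit canvas. The coordinates are invented
# for illustration and are not taken from the paper.
TEMPLATE_A = [
    ((0.1, 0.0), (0.5, 1.0)),  # left diagonal
    ((0.5, 1.0), (0.9, 0.0)),  # right diagonal
    ((0.3, 0.5), (0.7, 0.5)),  # crossbar
]


def draw_character(template, jitter=0.03, rng=None):
    """Run the 'program' once: return a noisy copy of every stroke."""
    rng = rng or random.Random()
    instance = []
    for start, end in template:
        # Perturb both endpoints with independent Gaussian noise.
        noisy = tuple(
            (x + rng.gauss(0.0, jitter), y + rng.gauss(0.0, jitter))
            for x, y in (start, end)
        )
        instance.append(noisy)
    return instance


rng = random.Random(0)  # fixed seed so the example is repeatable
first = draw_character(TEMPLATE_A, rng=rng)
second = draw_character(TEMPLATE_A, rng=rng)
```

Here `first` and `second` share the same three-stroke structure but differ in their exact coordinates, loosely mirroring how two people draw the same letter differently.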

This differs from standard pattern-recognition algorithms, which represent concepts as configurations of pixels or collections of features. The BPL approach learns “generative models” of processes in the world, making learning a matter of “model building” or “explaining” the data provided to the algorithm.
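One way to picture “explaining” the data with a generative model is Bayes’ rule: each candidate concept is scored by how probable the observed drawing would be if that concept had generated it. The Python sketch below is a toy under strong assumptions and is not the BPL inference procedure. It treats a concept as a fixed list of points with Gaussian noise, and the templates, the `sigma` value, and the function names are all hypothetical.

```python
import math


def log_likelihood(points, template, sigma=0.05):
    """Log-probability of observed points under isotropic 2-D Gaussian
    noise centered on the template's points (an assumed noise model)."""
    total = 0.0
    for (x, y), (tx, ty) in zip(points, template):
        sq = (x - tx) ** 2 + (y - ty) ** 2
        total += -sq / (2 * sigma**2) - math.log(2 * math.pi * sigma**2)
    return total


def posterior(points, templates, priors=None):
    """Posterior over concept names via Bayes' rule.

    Scores are normalized in log space (log-sum-exp) for stability.
    """
    names = list(templates)
    priors = priors or {n: 1 / len(names) for n in names}
    scores = [
        math.log(priors[n]) + log_likelihood(points, templates[n])
        for n in names
    ]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return {n: w / z for n, w in zip(names, weights)}


# Two invented concepts and one noisy observation.
templates = {
    "corner": [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "line": [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)],
}
observed = [(0.02, -0.01), (0.03, 0.97), (0.98, 1.02)]
beliefs = posterior(observed, templates)
```

In this toy setup the observation lies close to the “corner” template, so nearly all of the posterior mass lands on that concept. The system described in the article works with far richer programs than fixed point lists.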

The researchers “explained” to the system that characters in human writing systems consist of strokes (lines demarcated by the lifting of the pen) and substrokes (segments of a stroke demarcated by points at which the pen’s velocity is zero). With that simple information, the system then analyzed hundreds of motion-capture recordings of humans drawing characters in several different writing systems, learning statistics on the relationships between consecutive strokes and substrokes, as well as on the variation tolerated in the execution of a single stroke.
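The stroke and substroke definitions above translate naturally into code. The Python sketch below assumes a hypothetical recording format, a list of `(x, y, pen_down)` samples taken at a fixed rate, which the article does not specify. It splits a recording into strokes at pen lifts, then splits each stroke into substrokes wherever consecutive samples barely move, used here as a stand-in for zero pen velocity.

```python
import math


def split_strokes(samples):
    """Group (x, y, pen_down) samples into strokes separated by pen lifts."""
    strokes, current = [], []
    for x, y, pen_down in samples:
        if pen_down:
            current.append((x, y))
        elif current:
            # Pen lifted: the current stroke is finished.
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes


def split_substrokes(stroke, eps=1e-6):
    """Split one stroke at pauses, where displacement per sample is ~0."""
    substrokes, current = [], [stroke[0]]
    for prev, point in zip(stroke, stroke[1:]):
        speed = math.dist(prev, point)  # displacement per sample
        if speed <= eps:
            # Pen paused: close the current substroke and restart here.
            if len(current) > 1:
                substrokes.append(current)
            current = [point]
        else:
            current.append(point)
    if len(current) > 1:
        substrokes.append(current)
    return substrokes


# Invented recording: an "L" drawn with a pause at the corner, a pen lift,
# then a short diagonal mark.
samples = [
    (0, 2, True), (0, 1, True), (0, 0, True), (0, 0, True),
    (1, 0, True), (2, 0, True), (2, 0, False),
    (3, 3, True), (4, 4, True),
]
strokes = split_strokes(samples)
parts = split_substrokes(strokes[0])
```

For this recording, `strokes` holds two strokes and `parts` holds the two substrokes of the “L”, the vertical segment and the horizontal one. Real pen data would need a speed threshold suited to the sampling rate rather than the near-zero `eps` used here.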

That means that the system learned the concept of a character and what to ignore (minor variations) in any specific instance.

The BPL model also “learns to learn” by using knowledge from previous concepts to speed learning on new concepts — for example, using knowledge of the Latin alphabet to learn letters in the Greek alphabet.

Cipher for Futurama Alien Language 1 (credit: The Infosphere, the Futurama Wiki)

The authors applied their model to more than 1,600 types of handwritten characters in 50 of the world’s writing systems, including Sanskrit and Tibetan — and even some invented characters such as those from the television series “Futurama.”

Visual Turing tests

In addition to testing the algorithm’s ability to recognize new instances of a concept, the authors asked both humans and computers to reproduce a series of handwritten characters after being shown a single example of each character, or in some cases, to create new characters in the style of those they had been shown. The scientists then compared the outputs from both humans and machines through “visual Turing tests.” Here, human judges were given paired examples of both the human and machine output, along with the original prompt, and asked to identify which of the symbols were produced by the computer.

While judges’ correct responses varied across characters, for each visual Turing test, fewer than 25 percent of judges performed significantly better than chance in assessing whether a machine or a human produced a given set of symbols.
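What “significantly better than chance” means for a single judge can be sketched with a one-sided binomial test. The article does not say which statistical test the researchers used, so the test below, the 0.05 threshold, and the function names are assumptions for illustration. A judge who is only guessing picks the machine correctly with probability 0.5 on each trial.

```python
from math import comb


def p_value_above_chance(correct, trials, chance=0.5):
    """One-sided binomial p-value: P(X >= correct) if the judge is guessing."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def beats_chance(correct, trials, alpha=0.05):
    """True if the judge's score is unlikely under pure guessing."""
    return p_value_above_chance(correct, trials) < alpha


# Hypothetical judges, each scored on 10 trials.
strong_judge = beats_chance(9, 10)  # p = 11/1024, below 0.05
weak_judge = beats_chance(6, 10)    # p = 386/1024, well above 0.05
```

Under these assumptions, a judge with 9 of 10 correct would count as beating chance, while a judge with 6 of 10 would not.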

“Before they get to kindergarten, children learn to recognize new concepts from just a single example, and can even imagine new examples they haven’t seen,” notes Tenenbaum. “I’ve wanted to build models of these remarkable abilities since my own doctoral work in the late nineties.

“We are still far from building machines as smart as a human child, but this is the first time we have had a machine able to learn and use a large class of real-world concepts — even simple visual concepts such as handwritten characters — in ways that are hard to tell apart from humans.”

Beyond deep-learning methods

The researchers argue that their system captures something of the elasticity of human concepts, which often have fuzzy boundaries but still seem to delimit coherent categories. It also mimics the human ability to learn new concepts from few examples.

It thus offers hope, they say, that the type of computational structure it’s built on, called a probabilistic program, could help model human acquisition of more sophisticated concepts as well.

“I feel that this is a major contribution to science, of general interest to artificial intelligence, cognitive science, and machine learning,” says Zoubin Ghahramani, a professor of information engineering at the University of Cambridge. “Given the major successes of deep learning, the paper also provides a very sobering view of the limitations of such deep-learning methods — which are very data-hungry and perform poorly on the tasks in this paper — and an important alternative avenue for achieving human-level machine learning.”

The work was supported by grants from the National Science Foundation to MIT’s Center for Brains, Minds and Machines, the Army Research Office, the Office of Naval Research, and the Moore-Sloan Data Science Environment at New York University.

Abstract of Human-level concept learning through probabilistic program induction

People learning new concepts can often generalize successfully from just a single example, yet machine learning algorithms typically require tens or hundreds of examples to perform with similar accuracy. People can also use learned concepts in richer ways than conventional algorithms—for action, imagination, and explanation. We present a computational model that captures these human learning abilities for a large class of simple visual concepts: handwritten characters from the world’s alphabets. The model represents concepts as simple programs that best explain observed examples under a Bayesian criterion. On a challenging one-shot classification task, the model achieves human-level performance while outperforming recent deep learning approaches. We also present several “visual Turing tests” probing the model’s creative generalization abilities, which in many cases are indistinguishable from human behavior.