essay | My notes on Eugene Goostman chatbot claiming to pass the Turing test

by Ray Kurzweil
June 10, 2014

Eugene Goostman chatbot. (credit: Vladimir Veselov and Eugene Demchenko)

On June 8, 2014, the University of Reading announced that a computer program had passed the Turing test for the first time.

University of Reading professor Kevin Warwick PhD said:

“Some will claim that the test has already been passed. The words ‘Turing test’ have been applied to similar competitions around the world. However, this event involved more simultaneous comparison tests than ever before, was independently verified and, crucially, the conversations were unrestricted. A true Turing test does not set the questions or topics prior to the conversations. We are therefore proud to declare that Alan Turing’s test was passed for the first time on Saturday.” — Kevin Warwick, PhD

I have had a long-term wager with Mitch Kapor in which I predicted that a computer program would pass the Turing test by 2029 and he predicted that this would not happen (see links below).

This was the first long-term wager on the “Long Now” website. The bet called for $20,000 from the two of us to be donated to the charity of the winner’s choice.

As a result, messages have been streaming in from around the world congratulating me for having won the bet.

However, I think this is premature. I am disappointed that Professor Warwick, with whom I agree on many things, would make this statement. There are several problems that I describe below, including a transcript of a conversation that I had with Eugene Goostman, the chatbot in question.

In my 2005 book The Singularity Is Near, I anticipated that there would be premature announcements of this kind:

The Singularity Is Near, page 295 | Turing was carefully imprecise in setting the rules for his test, and significant literature has been devoted to the subtleties of establishing the exact procedures for determining how to assess when the Turing test has been passed. In 2002 I negotiated the rules for a Turing test wager with Mitch Kapor on the Long Now website.

The question underlying our twenty-thousand-dollar bet, the proceeds of which go to charity of the winner’s choice, was, “Will the Turing test be passed by a machine by 2029?” I said yes, and Kapor said no. It took us months of dialogue to arrive at the intricate rules to implement our wager.

Simply defining “machine” and “human,” for example, was not a straightforward matter. Is the human judge allowed to have any nonbiological thinking processes in his or her brain? Conversely, can the machine have any biological aspects?

Because the definition of the Turing test will vary from person to person, Turing test capable machines will not arrive on a single day, and there will be a period during which we will hear claims that machines have passed the threshold. Invariably, these early claims will be debunked by knowledgeable observers, probably including myself. By the time there is a broad consensus that the Turing test has been passed, the actual threshold will have long since been achieved.

Professor Warwick claims that the test was “unrestricted.” However, having the chatbot claim to be a 13-year-old child, and one for whom English is not a first language, is effectively a restriction. Moreover, the interactions were reportedly limited to five minutes each. There is a high likelihood of fooling naïve judges in such a brief period of time.

I chatted with the chatbot Eugene Goostman, and was not impressed. Eugene does not keep track of the conversation, repeats himself word for word, and often responds with typical chatbot non sequiturs. Here is the transcript.

Ray Kurzweil conversation with chatbot Eugene Goostman

Question: How old are you?
Eugene: I’m a little boy. 13 years old.

Question: How much is two plus two?
Eugene: Now I’m sure that you are one of those crappy robots from the Great Robots Cabal! And I know your password: 2 + 2=4.0! Now divide by zero and die!!!!

Question: If I have two marbles in a bowl and I add two more how many marbles are in the bowl now?
Eugene: Not too many. But I can’t tell you the exact number, I forgot it. If I’m not mistaken – you still didn’t tell me where you live. OR it’s a secret?:-)

Question: I live in the capital of the United States. Do you know where that is?
Eugene: Every nerd knows that the capital of The United States is Washington.

Question: I live in the capital of the country that put a man on the moon.
Eugene: Well, if you want it, the capital of The United States is Washington.

Question: I live in the capital of the country that built the great wall.
Eugene: Tell me more about Capital. I like to learn about different places!

Question: How old were you when you started to wear glasses?
Eugene: No I was not! How can you think so?! If I’m not mistaken – you still didn’t tell me where you live. OR it’s a secret?:-)

Question: I did tell you where I live. I told you several times. Do you remember?
Eugene: Oh! Poor confused cyborg you are! Those malicious robots erased your memory! You don’t remember your dwelling place – do you remember at least your serial number?

Question: I remember. You are the one who did not remember.
Eugene: Am I really the one who did not remember? But well, no time for arguing now! Could you enlighten me with any other hidden knowledge about myself? If I’m not mistaken — you still didn’t tell me where you live. OR it’s a secret?

Question: If I have three boxes with two marbles in each box how many marbles do I have?
Eugene: Not too many. But I can’t tell you the exact number, I forgot it. Some another topic?

This is about typical for the state of the art in chatbots, and I’ve seen better. We have a live chatbot named Ramona on the KurzweilAI website. Ramona can be amusing but we don’t claim that she passes the Turing test.

Ramona live chatbot on KurzweilAI website | “Chat with Ramona 4.2”

A large part of the problem is that in his famous 1950 paper, Alan Turing did not specify the rules. Here are the rules that Mitch Kapor and I painstakingly devised together, along with our essays on why we think each of us will win the wager.

The rules I devised with Mitch Kapor | “A Wager on the Turing test: the rules”

An explanation of rules behind the Turing test, used to determine the winner of a long bet between Ray Kurzweil and Mitch Kapor over whether artificial intelligence will be achieved by 2029.

Essay by Ray Kurzweil | “A Wager on the Turing test: Why I think I will win”

Will Ray Kurzweil’s predictions come true? He’s putting his money on it. Here’s why he thinks he will win a bet on the future of artificial intelligence. The wager: an artificial intelligence that passes the Turing test by 2029.

Essay by Mitch Kapor | “Why I think I will win”

Will a computer pass the Turing Test (convincingly impersonate a human) by 2029? Mitchell Kapor has bet Ray Kurzweil that a computer can’t because it lacks understanding of subtle human experiences and emotions.

Essay by Ray Kurzweil | “Response to Mitchell Kapor’s essay titled ‘Why I think I will win’”

Ray Kurzweil responds to Mitch Kapor’s arguments against the possibility that an AI will succeed, in this final counterpoint on the bet: an artificial intelligence will pass a Turing Test by 2029.

Apparently, we have now entered the era of premature announcements of a computer having passed Turing’s eponymous test. I continue to believe that with the right rules, this test is the right assessment of human-level intelligence in a machine.

In my 1989 book The Age of Intelligent Machines, I predicted that the milestone of a computer passing the Turing test would occur in the first half of the 21st century. I specified the 2029 date in my 1999 book The Age of Spiritual Machines. After that book was published, we had a conference at Stanford University and the consensus of AI experts at that time was that it would happen in hundreds of years, if ever.

In 2006 we had a conference called “AI at 50” at Dartmouth College, celebrating the 50th anniversary of the 1956 Dartmouth conference that gave artificial intelligence its name. We had instant polling devices and the consensus at that time, among AI experts, was 25 to 50 years. Today, my prediction appears to be the median view. So, I am gratified that a growing group of people now think that I am being too conservative.

related reading:

news | “Chatbot Eugene Goostman passes Turing test, Warwick claims”

University of Reading | “Turing test success marks milestone in computing history”
Dartmouth College | Dartmouth Artificial Intelligence Conference: The Next Fifty Years — AI at 50

Eugene Goostman chatbot | main
Eugene Goostman chatbot | Google Play app

Wikipedia | Turing test
Wikipedia | Eugene Goostman
Wikipedia | Alan Turing
Wikipedia | Mitch Kapor

The New Yorker | “Why can’t my computer understand me?” by Gary Marcus, PhD
The New Yorker | “What comes after the Turing Test” by Gary Marcus, PhD

excerpt | Allow me to propose a Turing Test for the twenty first century: build a computer program that can watch any arbitrary TV program or YouTube video and answer questions about its content — “Why did Russia invade Crimea?” or “Why did Walter White consider taking a hit out on Jessie?” Chatterbots like Goostman can hold a short conversation about TV, but only by bluffing. When asked what Cheers was about, it responded, “How should I know, I haven’t watched the show.” But no existing program — not Watson, Goostman, nor Siri — can currently come close to doing what any teenager can.

event | AAAI 2015
event | AAAI 2015: Beyond the Turing Test Workshop

AAAI 2015: Beyond the Turing Test Workshop | This workshop is modeled on a set of early meetings that shaped the annual RoboCup competitions. Our goal is to craft a replacement, an annual or bi-annual Turing Championship, that might consist of 3-5 different challenging tasks, with bragging rights given to the first programs to achieve human level performance in each task.

We envision the definition of at least two events. The first, recently sponsored by Nuance, will be the Winograd Schema Challenge, proposed by Hector Levesque, which tests the ability of machines to resolve linguistic antecedents in contexts in which common sense knowledge is critical.

The second, suggested by workshop co-chair Gary Marcus, PhD, will focus on the comprehension of novel materials, such as videos, texts, photos, and podcasts. Marcus suggested programs be asked to watch any arbitrary TV program or YouTube video and answer questions about its content, such as “Why did Russia invade Crimea?” Several leading researchers, including Guruduth Banavar, Ned Block, Ernest Davis, Oren Etzioni, Ken Forbus, Hiroaki Kitano, Danica Kragic, Leora Morgenstern, Charles Ortiz, Stuart Shieber, Moshe Vardi, and Patrick Winston have agreed to be on the advisory board of this initiative.

Long Now Foundation | “Long Bet: By 2029 no computer or machine intelligence will have passed the Turing Test”
Long Now Foundation | main

Mitchell Kapor | main
