All geekdom, and much that is outside our realm, is abuzz with news and discussion of the contest between IBM’s Jeopardy-playing computer, Watson, and Ken Jennings and Brad Rutter. In last night’s game (the second half of a game started on Monday), Watson won decisively, answering many more questions than Jennings and Rutter combined. At the end of the first half, Rutter and Watson were tied. Lots of famous (and formerly famous) artificial intelligence researchers have weighed in on what this means for AI. I want to discuss one small point that highlights the difference between human question answering and computerized question answering.
In Monday’s show, one of the Jeopardy “answers” was, “It was the anatomical oddity of U.S. gymnast George Eyser, who won a gold medal on the parallel bars in 1904.” The correct answer was something like, “What is a missing leg?” Apparently, the play sequence went like this: Ken Jennings buzzed in with “What is a missing hand?” Watson then buzzed in with “What is a leg?” This was first deemed correct by Alex Trebek, the host, but this judgment was reversed by the judges on review. I didn’t see the show (a Valentine’s Day dinner took precedence), but apparently the broadcast edited out Trebek’s original decision. Because of the reversal, Watson was unable to give a “more specific answer,” and Rutter was unable to buzz in on Watson’s error.
It seems that Trebek was treating Watson’s answer as if a human had given it: If a human had said “What is a leg?” as a follow-up to the wrong question, “What is a missing hand?” it would make sense to treat this as having the same intention as “What is a missing leg?” But Watson doesn’t know what the other contestants are saying, and so it actually had no such context in which to give its answer. Given the “anatomical oddity” wording of the clue, I think it is plausible that Trebek would have awarded Watson a correct answer (or, perhaps, would have asked Watson to give a more specific answer) if Watson had given its answer without the context of Jennings’s question.
People laughed when Watson gave a similar wrong answer to another of Jennings’s errors. Jennings answered “What are the ’20s?,” and then Watson said, “What is 1920s?” Interestingly, the press reports have pretty much all said that Watson gave the same answer as Jennings. But Watson’s answer, though it would have the same intent if given by a person, is different, both in its raw surface form and in its grammatical incorrectness. I don’t think the Jeopardy rules require grammatical correctness in the questions, for human players don’t have this kind of problem. The audience and the press either didn’t know, or didn’t remember during game play, that Watson didn’t receive the other contestants’ answers.
Watson was penalized for getting the 1920s question wrong, and penalized for getting the leg question right, but in the wrong way. I find it fascinating that people–sometimes in real time, and sometimes only on deliberation–can navigate what it means to have correct and incorrect intentions with respect to human-made artifacts such as Watson, especially within the strong expectations set up by IBM and the Jeopardy game to treat the Watson system as an intentional agent. Most of Watson’s human-seeming qualities come from the expectations that get set up. For example, Watson uses templates such as “Let’s finish up <CATEGORY>, Alex” when there is only one answer left in the category. The stronger expectations, though, are set by giving it a human name and a synthesized voice, putting it between two humans, using the pronoun “he” when referring to the system, etc. Yet even given these expectations, people can notice, and react to, the breakdowns (and, in the case of the missing “missing,” a seeming non-breakdown).
Search engines, such as Google and Bing, have a feature, often internally called “Answers,” that gives an answer to an implied question right on the search results page. For example, enter “molecular weight of hydrogen” or “capital of Michigan” in Bing or Google, and you’ll get the answer without having to go to another page. No one confuses this with human intelligence, yet, at some level of analysis, this is what Watson is doing. Granted, search engines are not optimized for searches phrased as questions (if you ask Google today, “What was George Eyser’s problem?,” the caption will say “he ran out of the paper dishes on which to serve the ice…”; in this case, Bing does a better job, but it could have gone either way). But the extraction of facts and references from a very large number of documents and data sources, based on vague and ambiguous search queries, is the essential job of search engines. For the most part, Watson is a search engine, specialized for Jeopardy, that has been given a human moniker and a few human mannerisms.