Yesterday I discussed, briefly, how we judge the effectiveness of evidence towards proving or disproving a hypothesis. Let us now apply those standards to a specific piece of evidence, the Turing test. Many people think that passing the Turing test would be good evidence that a program is conscious. Instead of arguing the merits of behavior as a measure of consciousness, which is how the Turing test is usually discussed, we can use our knowledge of probability to get a more precise idea of how reliable it is.
Before we get started, however, there are a few preliminaries that I need to get out of the way. For starters, let me note that an effective test for consciousness needs to perform two functions well. First, given a program that passes the test, we must be able to have a high degree of confidence that it is conscious. Second, and of equal importance, given a program that fails the test, we must have a high degree of confidence that it is not conscious. If our test doesn’t meet both requirements, its standards for consciousness are either too high (many conscious programs fail the test) or too low (many programs that are not conscious pass the test). I mention this because, in general, a good test for anything divides the possible candidates into two groups, those with the quality and those without, which is exactly what performing the two functions above well would guarantee. A second point I need to mention is that I hold consciousness to be largely independent of intelligence (see here), a fact I will make use of extensively below.
Let us first examine how likely a program is to be conscious given that it passes the test ( Pr( C | P-T ) ). From the definition of conditional probability we know that Pr( C | P-T ) is equal to Pr( P-T & C ) / Pr( P-T ). These numbers are awfully small, too small for us to intuit the resulting probability directly, but there is a way to guess roughly what it must be. Specifically, the closer the number of programs that both pass the test and are conscious comes to the number of programs that pass the test, the closer Pr( C | P-T ) will be to one (because as the first number approaches the second, Pr( P-T & C ) approaches Pr( P-T )). I would argue that the number of possible programs that can pass the test but aren’t conscious is fairly high (for example, programs designed with enough information about human psychology to produce the sentences most likely to deceive us, etc.), leading me to conclude that Pr( C | P-T ) is significantly less than one, probably between 0.6 and 0.9, which means the Turing test is decent as evidence that a program is conscious, but far from perfect.
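To make the ratio argument concrete, here is a minimal Python sketch. The function name and every count in it are invented for illustration; nobody knows the real numbers. It only shows how Pr( C | P-T ) moves toward one as the number of non-conscious programs that pass the test shrinks.

```python
def pr_conscious_given_pass(pass_and_conscious: int, pass_not_conscious: int) -> float:
    """Pr( C | P-T ) computed from counts of programs that pass the test."""
    total_pass = pass_and_conscious + pass_not_conscious
    if total_pass == 0:
        raise ValueError("no program passes the test, so Pr( C | P-T ) is undefined")
    # Dividing counts is the same as Pr( P-T & C ) / Pr( P-T ),
    # because the total number of programs cancels out.
    return pass_and_conscious / total_pass


# Hypothetical counts: hold the conscious passers fixed at 75 and shrink
# the number of non-conscious passers to watch the probability climb.
for non_conscious_passers in (100, 50, 25, 0):
    p = pr_conscious_given_pass(75, non_conscious_passers)
    print(f"{non_conscious_passers:>3} non-conscious passers -> Pr( C | P-T ) = {p:.2f}")
```

With these made-up counts, 25 non-conscious passers against 75 conscious ones gives 0.75, which sits inside the range guessed at above. The probability only reaches one when no non-conscious program can pass.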
The more interesting case is when we examine how well failing the Turing test disproves the hypothesis that a program is conscious ( Pr( ~C | ~P-T) ). Just working from the assumption that many conscious programs couldn’t pass the test (specifically the ones with low intelligence or ones that are unusually honest) we can see that this probability will be rather low. This means that even if a program fails the Turing test there is little additional reason to believe that it lacks consciousness, and thus that we should rely on other methods to determine if it is or isn’t conscious.
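The same kind of sketch works for the failing case. Again, the function name and the counts are purely hypothetical, and the real value depends on how many programs of each kind exist, which is unknown. The point is only to show how Pr( ~C | ~P-T ) drops as more conscious programs end up among the failures.

```python
def pr_not_conscious_given_fail(fail_not_conscious: int, fail_conscious: int) -> float:
    """Pr( ~C | ~P-T ) computed from counts of programs that fail the test."""
    total_fail = fail_not_conscious + fail_conscious
    if total_fail == 0:
        raise ValueError("no program fails the test, so Pr( ~C | ~P-T ) is undefined")
    return fail_not_conscious / total_fail


# Hypothetical counts: hold the non-conscious failures fixed at 40 and add
# conscious programs that fail (low intelligence, unusually honest, and so on).
for conscious_failures in (0, 20, 60, 160):
    p = pr_not_conscious_given_fail(40, conscious_failures)
    print(f"{conscious_failures:>3} conscious failures -> Pr( ~C | ~P-T ) = {p:.2f}")
```

Under these assumed counts, the probability falls below one half once the conscious failures outnumber the non-conscious ones, which is the situation the argument above relies on.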
So what do these findings tell us about the Turing test? Well, given that it meets only half the requirements (sort of), I would say that it isn’t a very good test of consciousness. It is, however, a good test of human-like intelligence; I will give it that. This of course leaves us in need of a better test of consciousness, one in which no conscious system would be marked as non-conscious, and no non-conscious system would be mistakenly thought to be conscious. Feel free to leave a suggestion.