Friday, March 15, 2024

A tweak to the Turing test

The Turing test for machine thought has an interrogator communicate (by typing) with a human and a machine, both of which try to convince the interrogator that they are human. The interrogator then guesses which is human. We have good evidence of machine thought, Turing claims, if the machine wins this “imitation game” about as often as the human does. (The original formulation has some gender complexity: the human is a woman, and the machine is trying to convince the interrogator that it, too, is a woman. I will ignore this complication.)

Turing thought this test would provide a posteriori evidence that a machine can think. But we have a good a priori argument that a machine can pass the test. Suppose Alice is a typical human, so that in competition with other humans she wins the game about half the time. Suppose that for any finite sequence S_n of n questions and n − 1 answers of reasonable length (i.e., of a length not exceeding what we allow for the game, say a couple of hours) that ends on a question and could be a transcript of the initial part of an interrogation of Alice, there is a fact of the matter as to what answer Alice would make to the last question. Then there is a possible very large, but finite, machine that has a list of all such possible finite sequences together with the answers Alice would make, and that at any point in the interrogation answers just as Alice would. That machine would do as well as Alice at the imitation game, so it would pass the Turing test.
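To make vivid what such a machine does, here is a minimal sketch in Python (the database contents and the transcript encoding are my own hypothetical illustrations, not anything in Turing): its entire operation is a single table lookup.

```python
# Minimal sketch of the lookup-table machine. The database here is a
# hypothetical stand-in; in the thought experiment it would contain an
# entry for EVERY possible dialogue prefix of reasonable length.

# Maps each transcript so far (alternating questions and answers, ending
# on a question) to the answer Alice would give to the last question.
ALICE_ANSWERS: dict[tuple[str, ...], str] = {
    ("What is the most important thing in life?",):
        "It is living in such a way that you have no regrets.",
    # ... one entry per remaining possible finite transcript prefix
}

def imitation_machine(transcript: tuple[str, ...]) -> str:
    """Answer exactly as Alice would: nothing beyond one dictionary lookup."""
    return ALICE_ANSWERS[transcript]
```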

Note that we do not need to know what Alice would say in response to the last question of S_n. The point isn't that we could build the machine (we obviously couldn't, if only because the memory capacity required would be larger than the size of the universe) but that such a machine is possible. We could suppose the database in the machine was constructed at random and just happened, with amazing luck, to match Alice's dispositions.
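A back-of-envelope count shows just how unbuildable it is. The constants below are illustrative assumptions of mine, not figures from Turing or from the argument above:

```python
# Rough upper bound on the number of database entries needed. Both
# constants are assumed for illustration only.
from math import log10

ALPHABET_SIZE = 27   # assumed: 26 letters plus a space
MAX_CHARS = 10_000   # assumed: a couple of hours of typed dialogue

# Crude bound: every string of at most MAX_CHARS symbols could be a
# transcript prefix requiring its own entry.
digits = MAX_CHARS * log10(ALPHABET_SIZE)
print(f"up to ~10^{digits:.0f} entries")   # ≈ 10^14314
print("versus ~10^80 atoms in the observable universe")
```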

The machine would not be thinking. Matching the current stage of the interrogation against the database and simply emitting the stored answer for that entry is not thinking. The point is obvious. Suppose that S_1 consists of the question “What is the most important thing in life?” and the database gives the rote answer “It is living in such a way that you have no regrets.” It’s obvious that the machine doesn’t know what it’s saying.

Compare this to a giant chess-playing machine that encodes, for each of the roughly 10^40 legal chess positions, the optimal next move. That machine doesn’t think about playing chess.

If the Turing test is supposed to be an a posteriori test for the possibility of machine intelligence, I propose a simple tweak: we limit the machine’s memory capacity to within an order of magnitude of human memory capacity. This rules out cases where the Turing test is passed by rote recitation of stored responses.

Turing himself imagined that doing well in the imitation game would require less memory capacity than the human brain has, because he thought that only “a very small fraction” of that capacity was used for “higher types of thinking”. Specifically, Turing surmised that 10^9 bits of memory would suffice to do well in the game against “a blind man” (presumably because that would spare the computer from needing a lot of data about what the world looks like). So in practice my modification would not have decreased Turing’s own confidence in the passability of his test.

Current estimates of the memory capacity of the brain are of the order of 10^15 bits, at the high end of the estimates available in Turing’s time (Turing himself inclined toward the low end, around 10^10). The model size of GPT-4 has not been released, but it appears to be close to, though a little below, human brain capacity. So if something with the model size of GPT-4 were to pass the Turing test, it would also pass the modified Turing test.
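For concreteness, here is the arithmetic under an assumed parameter count (the true figure is unreleased, so the number below is a placeholder of mine, not a reported value):

```python
# Illustrative arithmetic only: GPT-4's actual size is unreleased, so
# the parameter count here is an assumption made for the comparison.
assumed_params = 1.0e12   # assumed parameter count
bits_per_param = 16       # typical half-precision storage

model_bits = assumed_params * bits_per_param   # 1.6e13 bits
brain_bits = 1.0e15                            # the estimate cited above

print(model_bits < brain_bits)                    # True: below the brain estimate
print(f"~{brain_bits / model_bits:.0f}x below")   # ~62x, under these assumptions
```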

Technical comment: The above account assumed there was a fact of the matter about what answer Alice would make in a dialogue starting with S_n. There are various technical issues here. Given Molinism or determinism, these issues can presumably be overcome (we may need to fix the exact conditions under which Alice is supposed to be undergoing the interrogation). If (as I think) neither Molinism nor determinism is true, things become more complicated. But there will presumably be statistical regularities as to what Alice is likely to answer to S_n, and the machine’s database could simply encode, for each entry, an answer chosen by the machine’s builders at random in accordance with Alice’s statistical propensities.
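A minimal sketch of that randomized construction, assuming a hypothetical alice_propensities function that returns her answer distribution for a given transcript:

```python
# Sketch of fixing one database entry by sampling, once and for all,
# from Alice's statistical propensities. `alice_propensities` is a
# hypothetical function assumed to return (answer, probability) pairs.
import random

def build_entry(transcript: tuple[str, ...], alice_propensities) -> str:
    """Fix the entry for `transcript` by a single weighted random draw."""
    answers, weights = zip(*alice_propensities(transcript))
    return random.choices(answers, weights=weights, k=1)[0]
```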
