Showing posts with label artificial intelligence. Show all posts

Monday, October 6, 2025

Octopuses, aliens, squirrels and AI

I’ve been toying with an argument for dualism along these lines:

  1. Octopuses are conscious.

  2. Technologically advanced aliens are or would be conscious.

  3. Squirrels are conscious.

  4. Current LLMs are not conscious.

Claims 1–3 require a pretty strong multiple realizability. On materialism, our best account of such multiple realizability is functionalism. But it is likely that current LLMs have more sophisticated general intelligence than squirrels. Thus, a functionalism that makes 1–3 true also violates 4.

Dualism, on the other hand, can allow for all of 1–4 by adopting the hypothesis that all and only intellectually sophisticated living things have souls.

Could a physicalist do the same? I think the difficulty is that life is very fuzzy on physicalism, in a way in which consciousness should not be. On dualism, however, we can suppose that God or the laws of nature have a seemingly arbitrary threshold of what life is.

Tuesday, August 26, 2025

My AI policy

I’ve been wondering what to allow and what to disallow in terms of AI. I decided to treat AI as basically persons and I put this in my Metaphysics syllabus:

Even though (I believe) AI is not a person and its products are not “thoughts”, treat AI much like you would a person in writing your papers. I encourage you to have conversations with AIs about the topics of the class. If you get ideas from these conversations, put in a footnote saying you got the idea from an AI, and specifically cite which AI. If you use the AI’s words, put them in quotation marks. (If your whole paper is in quotation marks, it’s not cheating, but you haven’t done the writing yourself and so it’s like a paper not turned in, a zero.) Just as you can ask a friend to help you understand the reading, you can ask an AI to help you understand the reading, and in both cases you should have a footnote acknowledging the help you got. Just as you can ask a friend, or the Writing Center or Microsoft Word to find mistakes in your grammar and spelling, you can ask an AI to do that, and as long as the contribution of the AI is to fix errors in grammar and spelling, you don’t need to cite. But don’t ask an AI to rewrite your paper for you—now you’re cheating as the wording and/or organization is no longer yours, and one of the things I want you to learn in this class is how to write. Besides all this, last time I checked, current AI isn’t good at producing the kind of sharply focused numbered valid arguments I want you to make in the papers—AI produces things that look like valid arguments, but may not be. And they have a distinctive sound to them, so there is a decent chance of getting caught. When in doubt, put in a footnote at the end what help you got, whether from humans or AI, and if the help might be so much that the paper isn’t really yours, pre-clear it with me.

Monday, July 1, 2024

Duplicating electronic consciousnesses

Assume naturalism and suppose that digital electronic systems can be significantly conscious. Suppose Alice is a deterministic significantly conscious digital electronic system. Imagine we duplicated Alice to make another such system, Bob, and fed them both the same inputs. Then there are two conscious beings with qualitatively the same stream of consciousness.

But now let’s add a twist. Suppose that we create a monitoring system that continually checks all of Alice and Bob’s components, and as soon as any corresponding components disagree—are in a different state—then the system pulls the plug on both, thereby resetting all components to state zero. In fact, however, everything works well, and the inputs are always the same, so there is never any deviation between Alice and Bob, and the monitoring system never does anything.

What happens to the consciousnesses? Intuitively, neither Alice nor Bob should be affected by a monitoring system that never actually does anything. But it is not clear that this is the conclusion that specific naturalist theories will yield.

First, consider functionalism. Once the monitoring system is in place, both Alice and Bob change with respect to their dispositional features. All the subsystems of Alice are now incapable of producing any result other than one synchronized to Bob’s subsystems, and vice versa. I think a strong case can be made that on functionalism, Alice and Bob’s subsystems lose their defining functions when the monitoring system is in place, and hence lose consciousness. Therefore, on functionalism, consciousness has an implausible extrinsicness to it. The duplication-plus-monitoring case is some evidence against functionalism.

Second, consider Integrated Information Theory. It is easy to see that the whole system, consisting of Alice, Bob and the monitoring system, has a very low Φ value. Its components can be thought of as just those of Alice and Bob, but with a transition function that sets everything to zero if there is a deviation. We can now split the system into two subsystems: Alice and Bob. Each subsystem’s behavior can be fully predicted from that subsystem’s state plus one additional bit of information that represents whether the other system agrees with it. Because of this, the Φ value of the system is at most 2 bits, and hence the system as a whole has very, very little consciousness.

Moreover, Alice remains significantly conscious: we can think of Alice as having just as much integrated information after the monitoring system is attached as before, but now having one new bit of environmental dependency, so the Φ measure does not change significantly from the monitoring being added. Moreover, because the joint system is not significantly conscious, Integrated Information Theory’s proviso that a system loses consciousness when it comes to be in a part-to-whole relationship with a more conscious system is irrelevant.

Likewise, Bob remains conscious. So far everything seems perfectly intuitive. Adding a monitoring system doesn’t create a new significantly conscious system, and doesn’t destroy the two existing conscious systems. However, here is the kicker. Let X be any subset of Alice’s components. Let S_X be the system consisting of the components in X together with all of Bob’s components that don’t correspond to the components in X. In other words, S_X is a mix of Alice’s and Bob’s components. It is easy to see that the information-theoretic behavior of S_X is exactly the same as the information-theoretic behavior of Alice (or of Bob, for that matter). Thus, the Φ value of S_X will be the same for all X.

Hence, on Integrated Information Theory, each of the S_X systems will be equally conscious. The number of these systems is 2^n, where n is the number of components in Alice. Of course, one of these 2^n systems is Alice herself (that’s S_A, where A is the set of all of Alice’s components) and another is Bob himself (that’s S_∅). Conclusion: by adding a monitoring system to our Alice and Bob pair, we have created a vast number of new, equally conscious systems: 2^n − 2 of them!
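The combinatorial point can be illustrated with a toy simulation. The three-component update rule below is made up purely for illustration; the only thing that matters is that, since Alice and Bob are component-for-component identical deterministic systems fed the same inputs, every one of the 2^n mixed systems S_X traces exactly the same state trajectory as Alice herself.

```python
# Toy illustration (not a real Phi computation): Alice and Bob are
# identical deterministic automata receiving the same inputs, so any
# "mixed" system S_X -- components in X read from Alice, the rest from
# Bob -- has exactly the same state trajectory as Alice.
from itertools import chain, combinations

def step(state, inp):
    # Hypothetical update rule: each bit becomes the XOR of itself,
    # its left neighbor, and the shared input bit.
    n = len(state)
    return tuple((state[i] ^ state[i - 1] ^ inp) & 1 for i in range(n))

def trajectory(initial, inputs):
    states = [initial]
    for inp in inputs:
        states.append(step(states[-1], inp))
    return states

n = 3
inputs = [1, 0, 1, 1, 0]
alice = trajectory((0, 0, 0), inputs)
bob = trajectory((0, 0, 0), inputs)  # exact duplicate, same inputs

# Every subset X of component indices defines a mixed system S_X.
subsets = list(chain.from_iterable(
    combinations(range(n), r) for r in range(n + 1)))
for X in subsets:
    mixed = [tuple(a[i] if i in X else b[i] for i in range(n))
             for a, b in zip(alice, bob)]
    assert mixed == alice  # indistinguishable from Alice's trajectory

print(f"{len(subsets)} mixed systems (2^{n}), all with identical dynamics")
```

With n = 3 there are 8 such systems; with 10^11 components the count is astronomically larger, which is the point of the ethical worry below.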

The ethical consequences are very weird. Suppose that Alice has some large number of components, say 10^11 (that’s how many neurons we have). We duplicate Alice to create Bob. We’ve doubled the number of beings with whatever interests Alice had. And then we add a dumb monitoring system that pulls the plug given a deviation between them. Suddenly we have created 2^(10^11) − 2 systems with the same level of consciousness. Suddenly, the moral consideration owed to the Alice/Bob line of consciousness vastly outweighs everything else.

So both functionalism and Integrated Information Theory have trouble with our duplication story.

Friday, March 15, 2024

A tweak to the Turing test

The Turing test for machine thought has an interrogator communicate (by typing) with a human and a machine both of which try to convince the interrogator that they are human. The interrogator then guesses which is human. We have good evidence of machine thought, Turing claims, if the machine wins this “imitation game” about as often as the human. (The original formulation has some gender complexity: the human is a woman, and the machine is trying to convince the interrogator that it, too, is a woman. I will ignore this complication.)

Turing thought this test would provide a posteriori evidence that a machine can think. But we have a good a priori argument that a machine can pass the test. Suppose Alice is a typical human, so that in competition with other humans she wins the game about half the time. Suppose that for any finite sequence S_n of n questions and n − 1 answers of reasonable length (i.e., of a length not exceeding how long we allow for the game—say, a couple of hours), ending on a question, that could be a transcript of the initial part of an interrogation of Alice, there is a fact of the matter as to what answer Alice would make to the last question. Then there is a possible very large, but finite, machine that has a list of all such possible finite sequences and the answers Alice would make, and that at any point in the interrogation answers just as Alice would. That machine would do as well as Alice at the imitation game, so it would pass the Turing test.

Note that we do not need to know what Alice would say in response to the last question of S_n. The point isn’t that we could build the machine—we obviously couldn’t, just because the memory capacity required would be larger than the size of the universe—but that such a machine is possible. We could suppose that the database in the machine was constructed at random and just got amazingly lucky, matching Alice’s dispositions.

The machine would not be thinking. Matching the current stage of the interrogation against the database and just outputting the stored answer for that entry is not thinking. The point is obvious. Suppose that S_1 consists of the question “What is the most important thing in life?” and the database gives the rote answer “It is living in such a way that you have no regrets.” It’s obvious that the machine doesn’t know what it’s saying.
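A minimal sketch of such a lookup-table machine, with transcripts and answers invented purely for illustration (a real table would, of course, be astronomically large):

```python
# The "machine" is pure retrieval: every possible transcript-so-far is
# a key, and the stored value is whatever Alice would have said next.
# These entries are made up for illustration.
database = {
    ("What is the most important thing in life?",):
        "It is living in such a way that you have no regrets.",
    ("What is the most important thing in life?",
     "It is living in such a way that you have no regrets.",
     "Do you ever have regrets?"):
        "Of course. Doesn't everyone?",
}

def answer(transcript):
    # No parsing, no inference: just match the conversation so far
    # against the precomputed list and emit the stored line.
    return database[tuple(transcript)]

print(answer(["What is the most important thing in life?"]))
```

Nothing in this process resembles understanding the question; the machine would give the same rote answer no matter what the words meant.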

Compare this to a giant chess-playing machine which encodes, for each of the 10^40 legal chess positions, the optimal next move. That machine doesn’t think about playing chess.

If the Turing test is supposed to be an a posteriori test for the possibility of machine intelligence, I propose a simple tweak: We limit the memory capacity of the machine to be within an order of magnitude of human memory capacity. This avoids cases where the Turing test is passed by rote recitation of responses.

Turing himself imagined that doing well in the imitation game would require less memory capacity than the human brain has, because he thought that only “a very small fraction” of that capacity was used for “higher types of thinking”. Specifically, Turing surmised that 10^9 bits of memory would suffice to do well in the game against “a blind man” (presumably because it would save the computer from having to store a lot of data about what the world looks like). So in practice my modification is one that would not decrease Turing’s own confidence in the passability of his test.

Current estimates of the memory capacity of the brain are of the order of 10^15 bits, at the high end of the estimates in Turing’s time (and Turing himself inclined to the low end of those estimates, around 10^10). The model size of GPT-4 has not been released, but it appears to be near but a little below the human brain capacity level. So if something with the model size of GPT-4 were to pass the Turing test, it would also pass the modified Turing test.

Technical comment: The above account assumed there was a fact about what answer Alice would make in a dialogue that started with S_n. There are various technical issues with regard to this. Given Molinism or determinism, these technical issues can presumably be overcome (we may need to fix the exact conditions under which Alice is supposed to be undergoing the interrogation). If (as I think) neither Molinism nor determinism is true, things become more complicated. But there are presumably statistical regularities as to what Alice is likely to answer to S_n, and the machine’s database could simply encode an answer chosen by the machine’s builders at random in accordance with Alice’s statistical propensities.

Thursday, October 12, 2023

Consciousness and AI

Here are three interesting intuitions (or maybe even data points) worth chewing on:

  1. Large language models are smarter than squirrels.

  2. Large language models are not conscious.

  3. Squirrels are conscious.

I feel—without having formulated a rigorous argument—that these three data points rather nicely support Ben Page’s thesis that computation-without-consciousness is what we would expect non-theistic evolution to yield.

Monday, June 20, 2022

Life, simulations and AI

  1. An amoeba is alive but an accurate simulation of an amoeba wouldn’t be alive.

  2. If (1), then an accurate simulation of a human wouldn’t be alive.

  3. So, an accurate simulation of a human wouldn’t be alive.

  4. Something that isn’t alive wouldn’t think.

  5. So, an accurate simulation of a human wouldn’t think.

  6. If an accurate simulation of a human wouldn’t think, Strong AI is false.

  7. Strong AI is false.

Behind (2) is the idea that the best explanation of (1) is that computer simulations of living things aren’t alive. I think (4) is perhaps the most controversial of the premises.

Wednesday, May 11, 2022

Chinese Room thought experiments

Thought experiments like Searle’s Chinese Room are supposed to show that understanding and consciousness are not reducible to computation. For if they are, then a bored monolingual English-speaking clerk who moves around pieces of paper with Chinese characters—or photographic memories of them in his head—according to a fixed set of rules counts as understanding Chinese and having the consciousness that goes with that.

I used to find this an extremely convincing argument. But I am finding it less so over time. Anybody who thinks that computers could have understanding and consciousness will think that a computer can run two different simultaneous processes of understanding and consciousness sandboxed apart from one another. Neither process will have the understanding and consciousness of what is going on in the other process. And that’s very much what the functionalist should say about the Chinese Room. We have two processes running in the clerk’s head. One process is English-based and the other is a Chinese-based process running in an emulation layer. There is limited communication between the two, and hence understanding and consciousness do not leak between them.

If we accept the possibility of strong Artificial Intelligence, we have two choices of what to say about sandboxed intelligent processes running on the same hardware. We can say that there is one person with two centers of consciousness/understanding or that there are two persons each with one center. On the one person with two mental centers view, we can say that the clerk does understand Chinese and does have the corresponding consciousness, but that understanding is sandboxed away from the English-based processing, and in particular the clerk will not talk about it (much as in the computer case, we could imagine the two processes communicating with a user through different on-screen windows). On the two person view, we would say that the clerk does not understand Chinese, but that a new person comes into existence who does understand Chinese.

I am not saying that the proponent of strong AI is home free. I think both the one-person-two-centers and two-person views have problems. But these are problems that arise purely in the computer case, without any Chinese room kind of stuff going on.

The one-person-two-centers view of multiple intelligent processes running on one piece of hardware gives rise to insoluble questions about the unity of a piece of hardware. (If each process runs on a different processor core, do we count as having one piece of hardware or not? If not, what if the processes are constantly switching between cores? If yes, what if we separate the cores onto different pieces of silicon that are glued along an edge?) The two-persons view, on the other hand, is incompatible with animalism in our own case. Moreover, it ends up identifying persons with software processes, which leads to the unfortunate conclusion that when the processes are put to sleep, the persons temporarily cease to exist—and hence that we do not exist when sufficiently deeply asleep.

These are real problems, but no additional difficulty comes from the Chinese room case that I can see.

Tuesday, November 17, 2020

Nomic functionalism

Functionalism says that of metaphysical necessity, whenever x has the same functional state as a system y with internal mental state M, then x has M as well.

What exactly counts as an internal mental state is not clear, but it excludes states like thinking about water for which plausibly semantic externalism is true and it includes conscious states like having a pain or seeing blue. I will assume that functional states are so understood that if a system x has functional state S, then a sufficiently good computer simulation of x has S as well.

A weaker view is nomic functionalism according to which for every internal mental state M (at least of a sort that humans have), there is a law of nature that says that everything that has functional state S has internal mental state M.

A typical nomic functionalist admits that it is metaphysically possible to have S without M, but thinks that the laws of nature necessitate M given S.

I am a dualist. As a result, I think functionalism is false. But I still wonder about nomic functionalism, often in connection with this intuition:

  1. Computers can be conscious if and only if functionalism or nomic functionalism is true.

Here’s the quick argument: If functionalism or nomic functionalism is true, then a computer simulation of a conscious thing would be conscious, so computers can be conscious. Conversely, if both computers and humans can be conscious, then the best explanation of this possibility would be given by functionalism or nomic functionalism.

I now think that nomic functionalism is not all that plausible. The reason for this is the intuition that a computer simulation of a cause normally only produces a computer simulation of the effect rather than the effect itself. Let me try to be more rigorous, though.

First, let’s continue from (1):

  2. Dualism is true.

  3. If dualism is true, functionalism is false.

  4. Nomic functionalism is false.

  5. Therefore, neither functionalism nor nomic functionalism is true. (2–4)

  6. So, computers cannot be conscious. (1, 5)

And that’s really nice: the ethical worries about whether AI research will hurt or enslave inorganic persons disappear.

The premise I am least confident about in the above argument is (4). Nomic functionalism seems like a serious dualist option. However, I now think there is good inductive reason to doubt nomic functionalism.

  7. No known law of nature makes functional states imply non-functional states.

  8. So, no law of nature makes functional states imply non-functional states. (Inductively from 7)

  9. If functionalism is false, mental states are not functional states.

  10. So, mental states are not functional states. (2, 3, 9)

  11. So, no law of nature makes functional states imply mental states. (8 and 10)

  12. So, nomic functionalism is false. (11 and definition)

Regarding (7), if a law of nature made functional states imply non-functional states, that would mean that we have multiple realizability on the left side of the law but lacked multiple realizability on the right side. It would mean that any accurate computer simulation of a system with the given functional state would exhibit the particular non-functional state. This would be like a case where a computer simulation of water being heated were to have to result in actual water boiling.

I think the most promising potential counterexamples to (7) are thermodynamic laws that can be multiply realized. However, I think that in those cases the implied states are typically also multiply realizable.

A variant of the above argument replaces “law” with “fundamental law”, and uses the intuition that if dualism is true, then nomic functionalism would have to have fundamental laws that relate functional states to mental states.

Thursday, October 15, 2020

Synchronization and the unity of consciousness

The problem of the unity of consciousness for materialists is what makes activity in different areas of the physical mind come together into a single phenomenally unified state rather than multiple disconnected phenomenal states. If my auditory center is active in the perception of a middle C and my visual center is active in the perception of red, what makes it the case that there is a single entity that both hears a middle C and sees red?

We can imagine a solution to this problem in a computer. Let’s say that the computer has a representation of red (of the right sort for consciousness) in one part and a representation of middle C in another part. We could unify the two by means of a periodic synchronizing clock signal sent to all the parts of the computer. And we could then say that what it is for the computer to perceive red and middle C at the same time is for an electrical signal originating in the same tick of the clock to reach a part that is representing red (in the way needed for consciousness) and to reach a part that is representing middle C.

On this view, there is no separate consciousness of red (say), because the conscious state is constituted not just by the representation of red (say) in the computer’s “visual system”, but by everything that is reached by the signals emanating from the clock tick. And that includes the representation of middle C in the “auditory system”.

The unification of consciousness, then, would be the product of the synchronization system, which of course could be more complex than just a clock signal.

This line of thought shows that in principle the problem of the unity of consciousness is soluble for materialists if the problem of consciousness is (which I doubt). This will, of course, only be a Pyrrhic victory if it turns out that no similar pervasive synchronization system is found in the brain. The neuroscience literature talks of synchronization in the brain. Whether that synchronization is sufficient for solving the unity problem may be an empirical question.

The above line of thought also strongly suggests that if materialism is true, then our internal phenomenal timeline is not the same as objective physical time, but rather is constructed out of the synchronization processes. It need not be the case for this that the representation of red and the representation of middle C happen at the same physical time. A part further from the clock will receive the synchronizing signal later than a part closer to the clock, and so the synchronization process may make two events that are not simultaneous in physical time be simultaneous in computer time. I suspect that a similar divide between mental time and physical time is true even if dualism is (as I think) true, but for other reasons.

Thursday, July 30, 2020

Reproduction and the holiness of God

  1. Necessarily, every finite person is in the image and likeness of God.

  2. We should not make something in the image and likeness of God except when we have good positive reason to think God gave us permission to do so.

  3. The only case in which we have good positive reason to think God gave us permission to make something in the image and likeness of God is through marital intercourse.

  4. So, we should not engage in either in-vitro fertilization or the production of strong Artificial Intelligence.

The philosophically difficult task here would be to analyze the concept of “image and likeness of God”. The main controversial premise in the argument, however, is (2). I think it somehow follows from the holiness of God.

Wednesday, July 22, 2020

In-vitro fertilization and artificial intelligence

Catholics believe that:

  1. The only permissible method of human reproduction is marital intercourse.

Supposing we accept (1), we are led to this interesting question:

  2. Is it permissible for humans to produce non-human persons by means other than marital intercourse?

It seems to me that a positive answer to (2) would fit poorly with (1). First of all, it would be very strange if we could, say, clone Homo neanderthalensis, or produce them by IVF, but not so for Homo sapiens. But perhaps “human” in (1) and (2) is understood broadly enough to include Neanderthals. It still seems that a positive answer to (2) would be implausible given (1). Imagine that there were a separate evolutionary development starting with some ape and leading to an intelligent hominid definitely different from humans, but rather humanlike in behavior. It would be odd to say that we may clone them but can’t clone us.

This suggests to me that if we accept (1), we should probably answer (2) in the negative. Moreover, the best explanation of (1) leads to a negative answer to (2). For the best explanation of (1) is that human beings are something sacred, and sacred things should not be produced without fairly specific divine permission. It is plausible that we have such permission in the case of human marital coital reproduction, but we have no evidence of such permission elsewhere. But all persons are sacred (that’s one of the great lessons of personalism). So, absent evidence of specific divine permission, we should assume that it is wrong for us to produce non-human persons by means other than marital intercourse. Moreover, it is dubious that we have been given permission to produce non-human persons by means of marital intercourse. So, we should just assume that:

  3. It is wrong for us to produce non-human persons.

Moreover, if this is wrong, it’s probably pretty seriously wrong. So we also shouldn’t take significant risks of producing non-human persons. This means that unless we are pretty confident that a computer whose behavior was person-like still wouldn’t be a person, we ought to draw a line in our AI research and stop short of the production of computers with person-like behavior.

Do we have grounds for such confidence? I don’t know that we do. Even if dualism is true and even if the souls of persons are directly created by God, maybe God has a general policy of creating a soul whenever matter is arranged in a way that makes it capable of supporting person-like behavior.

But perhaps it is reasonable to think that such a divine policy would extend only to living things?

Monday, May 4, 2020

Digital and analog states, consciousness and clock skew

In a computer, we have multiple layers of abstraction. There is an underlying analog hardware level (which itself may be an approximation to a discrete quantum world, for all we know)—all our electronic hardware is, technically, analog hardware. Then there is a digital hardware level which abstracts from the analog hardware level, by counting voltages above a certain threshold as a one, below another—lower—threshold as a zero. And then there are higher layers defined by the software. But it is interesting that there is already semantics present at the digital level: three volts (say) means a one while half a volt (say) means a zero.

At the (single-threaded) software level, we think of the computer as being in a sequence of well-defined discrete states. This sequence unfolds in time. However, it is interesting to note that the time with respect to which this sequence unfolds is not actually real physical time. One reason is this. At the analog hardware level, during state transitions there will be times when the voltage levels are in an area that does not define a digital state. For instance, in 3.3V TTL logic, a voltage below 0.8V is considered a zero, a voltage above 2.0V is considered a one, but in between what we have is “undefined and results in an invalid state”. Since physical changes at the analog hardware level are continuous, whenever there is a change between a zero and a one, there will be a period of physical time at which the voltage is in the “undefined” range.
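The threshold scheme just described can be sketched in a few lines, using the 0.8V and 2.0V input thresholds mentioned above (the linear voltage ramp is just an illustrative stand-in for a real analog transition):

```python
# 3.3V TTL input thresholds: below 0.8V reads as a digital 0, above
# 2.0V as a digital 1, and anything in between is undefined. A
# continuous analog transition must pass through the undefined band.
V_IL, V_IH = 0.8, 2.0  # TTL input-low / input-high thresholds

def digital_state(voltage):
    if voltage < V_IL:
        return 0
    if voltage > V_IH:
        return 1
    return None  # undefined: neither a valid 0 nor a valid 1

# Sample a linear 0V -> 3.3V ramp: several samples land in the
# undefined band between the two thresholds.
samples = [round(3.3 * t / 10, 2) for t in range(11)]
print([(v, digital_state(v)) for v in samples])
```

The `None` readings in the middle of the ramp are exactly the physical times at which no digital state, and hence no software state, is defined.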

It seems, then, that the well-defined software state can only occur at a proper subset of the physical times. Between these physical times are physical times at which the digital states, and hence the software states that are abstractions from them, are undefined. This is interesting to think about in connection with the hypothesis of a conscious computer. Would a conscious computer be conscious “all the time” or only during the times when software states are well defined?

But things are more complicated than that. The technical means by which undefined states are dealt with is the system clock, which sends a periodic signal to the various parts of the processor. The system is normally so designed that when the clock signal reaches a component of the processor (say, a flip-flop), that component’s electrical states have a well-defined digital value (i.e., are not in the undefined range). There is thus an official time at which a given component’s digital values are defined. But at the analog hardware level, that official time is slightly different for different components, because of “clock skew”, the physical phenomenon that clock signals reach different components at different times. Thus, when we say that component A is in state 1 and component B is in state 0 at the same time, the “at the same time” is not technically defined by a single physical time, but rather by the (normally) different times at which the same clock signal reaches A and B.

In other words, it may not be technically correct to say that the well-defined software state occurs at a proper subset of the physical times. For the software state is defined by the digital states of multiple components, and the physical times at which these digital states “count” are going to be different for different components because of clock skew. In fact, I assume that the following can and does sometimes happen: component B is designed so that the clock signal reaches it after it has reached component A, and by the time component B is reached by the clock signal, component A has started processing new data and no longer has a well-defined digital state. Thus at least in principle (and I don’t know enough about the engineering to know if this happens in practice) it could be that there is no single physical time at which all the digital states that correspond to a software state are defined.

If this is right, then when we go back to our thought experiment of a conscious computer, we should say this: The times of the flow of consciousness in that computer are not even a subset of the physical times. They are, rather, an abstraction, what we might call “software time”. If this is right, the question of whether the computer is presently conscious will be literally nonsense. The computer’s software time, which its consciousness is strung out along, has a rather complex relationship to real time.

So what?

I don’t know exactly. But I think there are a few directions one could take this line of thought:

  1. Consciousness has to be strung out in a well-defined way along real time, and so computers cannot be conscious.

  2. It is likely that similar phenomena occur in our brains, and so either our consciousness is not based on our brains or else it is not strung out along real time. The latter makes the A-theory of time less plausible, because the main motive for the A-theory is to do justice to our experience of temporality. But if our experience of temporality is tied to an abstracted software time rather than real time, then doing justice to our experience of temporality is unlikely to reach the truth about real time. This in turn suggests to me the conditional: If the A-theory of time is true, then some sort of dualism is true.

  3. The problem that transitions between meaningful states (say, the ones and zeros of the digital hardware level) involve non-meaningful states between them is likely to afflict any plausible theory on which our mental functioning supervenes on a physical system. In digital computers, the way a sequence of meaningful states is reconstructed is by means of a clock signal. This leads to an empirical prediction: If the mental supervenes on the physical, then our brains have something analogous to a clock signal. Otherwise, the well-defined unity of our consciousness cannot be saved.

Wednesday, April 1, 2020

If we're not brains, computers can't think

The following argument has occurred to me:

  1. We are not brains.

  2. If we are not brains, our brains do not think.

  3. If our brains do not think, then computers cannot think.

  4. So, computers cannot think.

I don’t have anything new to say about (1) right now: I weigh a lot more than three pounds; my arms are parts of me; I have seen people whose brains I haven’t seen.

Regarding (2), if our brains think and yet we are not brains then we have the too many thinkers problem. Moreover, if brains and humans think, then that epistemically undercuts (1), because then I can’t tell if I’m a brain or a human being.

I want to focus on (3). The best story about how computers could think is a functionalist story on which thinking is the operation of a complex system of functional relationships involving inputs, outputs, and interconnections. But brains are such complex systems. So, on the best story about how computers could think, brains think, too.

Is there some non-arbitrary way to extend the functionalist story to avoid the conclusion that brains think? Here are some options:

  5. Organismic philosophy of mind: Thought is the operation of an organism with the right functional characteristics.

  6. Restrictive ontology: Only existing functional systems think; brains do not exist but organisms do.

  7. Maximalism: Thought is to be attributed to the largest entity containing the relevant functional system.

  8. Inputs and outputs: The functional system that thinks must contain its input and output facilities.

Unfortunately, none of these are a good way to save the idea that computers could think.

Computers aren’t organisms, so (5) does not help.

The only restrictive ontology on the table where organisms exist but brains do not is one on which the only complex objects are organisms, so (6) in practice goes back to (5).

Now consider maximalism. For maximalism to work and not reduce down to the restrictive ontology solution, these two things have to be the case:

  a. Brains exist.

  b. Humans are not a part of a greater whole.

Option (b) requires a restrictive ontology which denies the existence of nations, ecosystems, etc. Our best restrictive ontologies either deny the existence of brains or relegate them to a subsidiary status, as non-substantial parts of substances. The latter kind of ontology is going to be very restrictive about substances. On such a restrictive ontology, I doubt computers will count as substances. But they also aren’t going to be non-substantial parts of substances, so they aren’t going to exist at all.

Finally, consider the inputs and outputs option. But brains have inputs and outputs. It seems mere prejudice to insist that for thought the inputs and outputs have to “reach further into the world” than those of a brain, which reach only the rest of the body. But if we do accept that inputs and outputs must reach further, then we have two problems. The first is that, even though we are not brains, we could certainly continue to think after the loss of all our senses and muscles, when our inputs and outputs would reach no further than a brain’s. The second is that if our inputs and outputs must reach further into the world, then a hearing aid is a part of a person, which appears false (though recently Hilary Yancey has done a great job defending the possibility of prostheses being body parts in her dissertation here at Baylor).

Thursday, December 13, 2018

Group "belief"

Even though nobody thinks Strong AI has been achieved, we attribute beliefs to computer systems and software:

  • Microsoft Word thinks that I mistyped that word.

  • Google knows where I’ve been shopping.

The attribution is communicatively useful and natural, but is not literal.

It seems to me, however, that the difference in kind between the beliefs of computers and the beliefs of persons is no greater than the difference in kind between the beliefs of groups and the beliefs of persons.

Given this, the attribution of beliefs to groups should also not be taken to be literal.

Friday, November 2, 2018

Two kinds of functionalism

There are two kinds of functionalism about the mind.

One kind upholds the thesis that if two systems exhibit the same overall function, i.e., the same overall functional mapping between sequences of system inputs and sequences of system outputs, then they have the same mental states if any. Call this systemic functionalism.

The other kind says that mental properties depend not just on overall system function, but also on the functional properties of the internal states and/or subsystems of the system. Call this subsystemic functionalism. The subsystemic functionalist allows that two systems may have the same overall function, but because the internal architectures (whether software or hardware) that achieve this overall function are different, the mental states of the systems could be different.

Systemic functionalism allows for a greater degree of multiple realizability. If we have subsystemic functionalism, we might meet up with aliens who behave just like we do, but who nonetheless have no mental states or mental states very different from ours, because the algorithms that are used to implement the input-to-output mappings in them are sufficiently different.

If subsystemic functionalism is true, then it seems impossible for us to figure out what functional properties constitute mental states, except via self-experimentation.

For instance, we would want to know whether the functional properties that constitute mental states are neuronal-or-above or subneuronal. If they are neuronal-or-above, then replacing neurons with prostheses that have the same input-to-output mappings will preserve mental states. If they are subneuronal, such replacement will only preserve mental states if the prostheses not only have the same input-to-output mappings, but also are functionally isomorphic at the relevant (and unknown to us) subneuronal level.

But how could we figure out which is the case? Here is the obvious thing to try: Replace neurons with prostheses whose internal architecture does not have much functional resemblance to neurons but which have the same input-to-output mappings. But assuming standard physicalist claims about there not being “swervy” top-down causation (top-down causation that is unpredictable from the microphysical laws), we know ahead of the experiment that the subject will behave exactly as before. Yet if we have rejected systemic functionalism, sameness of behavior does not guarantee sameness of mental states, or any mental states at all. So doing the experiment seems pointless: we already know what we will find (assuming we know there is no swervy top-down causation), and it doesn’t answer our question.

Well, not quite. If I have the experiment done on me, then if I continue to have conscious states after complete neuronal prosthetic replacement, I will know (in a Cartesian way) that I have mental states, and get significant evidence that the relevant system level is neuronal-or-above. But I won’t be able to inform anybody of this. If I tell people: “I am still conscious”, if they have rejected systemic functionalism, they will just say: “Yeah, he/it would say that even if he/it weren’t, because we have preserved the systemic input-to-output mappings.” And there will be significant limits to what even I can know. While I could surely know that I am conscious, I doubt that I would be able to trust my memory to know that my conscious states haven’t changed their qualia.

So with self-experimentation, I could know that the relevant system level is neuronal-or-above. Could I know, even with self-experimentation, that the relevant system level is subneuronal? That’s a tough one. At first sight, one might consider this: Replace neurons with prostheses gradually and have me observe whether my conscious experiences start to change. Maybe at some point I stop having smell qualia, because the neurons involved in smell have been replaced with subsystemically functionally non-isomorphic systems. Oddly, though, given the lack of swervy top-down causation, I would still report having smell qualia, and act as if I had them, and maybe even think, albeit mistakenly, that I have them. I am not sure what to make of this possibility. It’s weird indeed.

Moreover, a version of the above argument shows that there is no experiment that we could do that would let persons other than, at most, the subject know whether systemic or subsystemic functionalism is true, assuming there is no swervy top-down causation.

Things become simpler in a way if we adopt systemic functionalism. It becomes easier to know when we have strong AI, when aliens are conscious, whether neural prostheses work or destroy thought, etc. The downside is that systemic functionalism is just behaviorism.

On the other hand, if there is swervy top-down causation, and this causation meshes in the right way with mental functioning, then we are once again in the experimental philosophy of mind business. For then neurons might function differently when in a living brain than what the microphysical laws predict. And we could put in prostheses that function outside the body just like neurons, and see if those also function in vivo just like neurons. If so, then the relevant functional level is probably neuronal-or-above; if not, it's probably subneuronal.

Monday, June 18, 2018

Might machine learning hurt the machine?

Machine learning has the computer generate parameters for a neural network on the basis of a lot of data. Suppose that we think that computers can be conscious. I wonder if we are in a position, then, to know that any particular training session won’t be unpleasant for the computer. For we don’t really know what biological neural configurations, or transitions between them, constitute pain and other forms of unpleasantness. Maybe in the course of learning, among the vast number of changing network parameters or the updates between them there will be some that will hurt the computer. Perhaps it hurts, for instance, when the value of the loss function is high.
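For concreteness, here is a minimal gradient-descent training loop, entirely made up for illustration. The worry in the post would then be that, on functionalism, something about the high-loss states early in training might constitute unpleasantness.

```python
# Toy training loop: minimize a one-parameter quadratic loss by gradient
# descent. Purely illustrative; no claim that real training works like this
# beyond the broad shape (parameters updated to drive a loss down).

def loss(w):
    return (w - 3.0) ** 2  # minimized at w = 3

w, lr = 0.0, 0.1
losses = []
for _ in range(50):
    grad = 2.0 * (w - 3.0)  # derivative of the loss at the current w
    w -= lr * grad
    losses.append(loss(w))

# The loss is high early in training and low at the end.
assert losses[0] > 1.0 and losses[-1] < 1e-3
```

On the worried hypothesis, the early iterations of such a loop, where the loss is largest, would be the candidate analogues of pain.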

This means that if we think computers can be conscious, we may have ethical reasons to be cautious about artificial intelligence research, not because of the impact on people and other organisms in our ecosystem, but because of the possible impact on the computers themselves. We may need to first solve the problem of what neural states in animals constitute pain, so that we don’t accidentally produce functional isomorphs of these states in computers.

If this line of thought seems absurd, it may be that the intuition of absurdity here is some evidence against the thesis that computers can be conscious (and hence against functionalism).

Tuesday, April 17, 2018

In vitro fertilization and Artificial Intelligence

The Catholic Church teaches that it is wrong for us to intentionally reproduce by any means other than marital intercourse (though things can be done to make marital intercourse more fertile than it otherwise would be). In particular, human in vitro fertilization is wrong.

But there is clearly nothing wrong with our engaging in in vitro fertilization of plants. And I have never heard a Catholic moralist object to the in vitro fertilization of farm animals.

Suppose we met intelligent aliens. Would it be permissible for us to reproduce them in vitro? I think the question hinges on whether what is wrong with in vitro fertilization has to do with the fact that the creature that is reproduced is one of us or has to do with the fact that it is a person. I suspect it has to do with the fact that it is a person, and hence our reproducing non-human persons in vitro would be wrong, too. Otherwise, we would have the absurd situation where we might permissibly reproduce an alien in vitro, and they would permissibly reproduce a human in vitro, and then we would swap babies.

But if what is problematic is our reproducing persons in vitro, then we need to look for a relevant moral principle. I think it may have something to do with the sacredness of persons. When something is sacred, we are not surprised that there are restrictions. Sacred acts are often restricted by agent, location and time. They are something whose significance goes beyond humanity, and hence we do not have the authority to engage in them willy-nilly. It may be that the production of persons is sacred in this way, and hence we need the authority to produce persons. Our nature testifies to us that we have this authority in the context of marital intercourse. We have no data telling us that we are authorized to produce persons in any other way, and without such data we should not do it.

This would have a serious repercussion for artificial intelligence research. If we think there is a significant chance that strong AI might be possible, we should stay away from research that might well produce a software person.

Monday, April 16, 2018

The Repugnant Conclusion and Strong AI

Derek Parfit’s Repugnant Conclusion says that, on standard utilitarian assumptions, if n is sufficiently large, then n lives of some minimal level of flourishing will be better than any fixed-size society of individuals that greatly flourish.

I’ve been thinking about the interesting things that you can get if you combine the Repugnant Conclusion argument with strong Artificial Intelligence.

Assume utilitarianism first.

Given strong Artificial Intelligence, it should be possible to make a computer system that achieves some minimal level of human-like flourishing. Once that is achieved, economies of scale become possible, and I expect it should be possible to replicate that system a vast number of times, and to do so much more cheaply per copy than the cost of supporting a single human being. Note that the replication can be done both synchronically and diachronically: we should optimize the hardware and software so as both to make many instances of the hardware and to run as many flourishing lives per day as possible. Once the program is written, since an exact copy is being run for each instance with the same inputs, we can ensure equal happiness for all.

If strong AI is possible, generating such minimally flourishing AI and making a vast number of replicates seems a more promising way to increase utility than fighting disease and poverty among humans. Indeed, it would likely be more efficient to decrease the number of humans to the minimum needed to serve the great number of duplicates. At that point, the morally best thing for humans to do will be to optimize the hardware to allow us to build more computers running the happy-ish software and to run each life in as short an amount of external time as possible, and to work to increase the amount of flourishing in the software.

Now note an interesting difference from the traditional Repugnant Conclusion. It seems not unlikely that if strong AI is achieved, we will be able to repeatably, safely and cheaply achieve in software not just the minimal levels of human-like flourishing, but high levels of human-like flourishing, even of forms of flourishing other than the pleasure or desire fulfillment that classical utilitarian theories talk about. We could make a piece of software that quickly and cheaply enjoys the life of a classical music aficionado, enjoying the best examples of human classical music culture, and that has no hankering for anything more. And if compatibilism is true (and it is likely that it is true if strong AI is true), then we could make a piece of software that reliably engages in acts of great moral heroism in its simulated world. We lose a bit of value from the fact that these acts only affect a simulated world, but we gain by being able to ensure that no immoral activity mars the value. If we are not certain of the correct axiology, we could hedge our bets by making a software life that is quite flourishing on any plausible axiology: say one that combines pleasure, desire satisfaction, enjoyment of the arts and virtuous activity. And then just run vast numbers of copies of that life per day.

It is plausible that, unless there is some deep spiritual component to human flourishing (of a sort that is unlikely to be there given the materialism that seems needed for strong AI to be possible), we will not only be able to more efficiently increase the sum good by running lots of copies of a happy life than by improving human life, but we will be able to more efficiently improve on the average good.

But one thing is unchanged. The conclusion is still repugnant. A picture of our highest moral imperative being the servicing of a single computer program, run on as many machines as possible, repeatedly and as quickly as possible, is repugnant.

A tempting objection is to say that multiple copies of the same life count as just one. That’s easily fixed: a well-controlled amount of algorithmic variation can be introduced into lives.

Observe, too, that the above line of thought is much more practical than the original Repugnant Conclusion. The original Repugnant Conclusion is highly theoretical, in that it is difficult to imagine putting into place the kind of society that is described in it without a significant risk of utility-destroying revolution. But right now rich philanthropists could switch their resources from benefiting the human race to working to develop a happy AI (I hesitate to write this sentence, with a slight fear that someone might actually make that switch—but the likelihood of my blog having such an effect seems small). One might respond to the Repugnant Conclusion that all ethical theories give implausible answers in some hypothetical cases. But the case here is not hypothetical.

We can take the above, just as the original Repugnant Conclusion, to be a reductio ad absurdum against utilitarianism. But it seems to be more than that. Any plausible ethics has to have a consequentialist component, even if pursuit of the consequences is restricted by deontic considerations. So on many competing ethical theories, there will still be a pull to the conclusion, given the vast amount of total value, and the respectable amount of average (and median) value achieved in the repugnant proposal. And one won’t be able to resist the pull by denying the picture of value that underwrites utilitarianism, because as noted above, “deeper” values can be achieved in software, given strong AI.

I can think of three plausible ways out of the strong AI version of the Repugnant Conclusion:

  1. The correct axiology lays great stress on the value of deep differences between lives, deeper than can be reliably and safely achieved through algorithmic variation (if there is too much variation, we risk producing misery).

  2. There is a deontic restriction prohibiting the production of software-based persons, perhaps because it is wrong for us to have such a total influence over the life of another person or because it is wrong for us to produce persons by any process other than natural reproduction.

  3. Strong AI is impossible.

I am inclined to think all three are true. :-)

Tuesday, March 13, 2018

Conscious computers and reliability

Suppose the ACME AI company manufactures an intelligent, conscious and perfectly reliable computer, C0. (I assume that the computers in this post are mere computers, rather than objects endowed with soul.) But then a clone company manufactures a clone C1 of C0 out of slightly less reliable components. And another clone company makes a slightly less reliable clone C2 of C1. And so on. At some point in the cloning sequence, say at C10000, we reach a point where the components produce completely random outputs.

Now, imagine that all the devices from C0 through C10000 happen to get the same inputs over a certain day, and that all their components do the same things. In the case of C10000, this is astronomically unlikely, as the super-unreliable components of C10000 produce completely random outputs.
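How unlikely? A back-of-the-envelope sketch, with invented figures: if each of n state bits produced by the random components independently matches the “correct” computation with probability 1/2, the chance of a full match is 2 to the power of minus n.

```python
# Rough illustration of "astronomically unlikely": the probability that n
# independent random bits all match a fixed target sequence. The choice of
# n is arbitrary; a real device has vastly many more state bits than this.

def p_match(n_bits):
    return 0.5 ** n_bits

assert p_match(10) == 2 ** -10
assert p_match(1000) < 1e-300  # already beyond astronomical at n = 1000
```

Even a thousand bits of state makes the coincidence rarer than one in 10^300, and real hardware has billions of bits.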

Now, C10000 is not computing. Its outputs are no more the results of intelligence than the copy of Hamlet typed by the monkeys is the result of intelligent authorship. By the same token, C10000 is not conscious on computational theories of consciousness.

On the other hand, C0’s outputs are the results of intelligence and C0 is conscious. The same is true for C1, since if intelligence or consciousness required complete reliability, we wouldn’t be intelligent and conscious. So somewhere in the sequence from C0 to C10000 there must be a transition from intelligence to lack thereof and somewhere (perhaps somewhere else) a transition from consciousness to lack thereof.

Now, intelligence could plausibly be a vague property. But it is not plausible that consciousness is a vague property. So, there must be some precise transition point in reliability needed for computation to yield consciousness, so that a slight decrease in reliability—even when the actual functioning is unchanged (remember that the Ci are all functioning in the same way)—will remove consciousness.

More generally, this means that given functionalism about mind, there must be a dividing line in measures of reliability between cases of consciousness and ones of unconsciousness.

I wonder if this is a problem. I suppose if the dividing line is somehow natural, it’s not a problem. I wonder if a natural dividing line of reliability can in fact be specified, though.

Thursday, February 22, 2018

Yet another life-based argument against thinking machines

Here’s yet another variant on a life-based argument against machine consciousness. All of these arguments depend on related intuitions about life. I am not super convinced by them, but I think they have some evidential force.

  1. Only harm to a living thing can be a great intrinsic evil.

  2. If machines can be conscious, then a harm to a machine can be a great intrinsic evil.

  3. Machines cannot be alive.

  4. So, harm to a machine cannot be a great intrinsic evil. (1 and 3)

  5. So, machines cannot be conscious. (2 and 4)