Mary the Qualia-Blind Neuroscientist

Nick Alonso

A famous philosophical thought experiment imagines Mary, a neuroscientist, who knows all the physical, biological, and functional facts about the human color vision system. Crucially, Mary has never had a conscious experience of color until some point after she acquires all of this neuroscientific knowledge. This raises a question: when Mary consciously experiences color for the first time, does she learn something new about color experience? The thought experiment is meant to support Dualism, the view that consciousness is non-physical in some way.

The purpose of this blog, however, is not to discuss this well-trod debate about Mary and Dualism. Instead, below, I imagine a variant of the Mary thought experiment that is meant to support Illusionism.

What is Illusionism? ‘Phenomenal realists’ say consciousness has a ‘what-it-is-likeness’, ‘phenomenality’, ‘phenomenal character’, or ‘subjective character’ (henceforth, just phenomenal character). What it is like to experience the color red from the first-person point of view, for example, is the phenomenal character of the red experience. Realists say explaining how and why consciousness with phenomenal character arises from the brain is a very hard problem, maybe the hardest in all of science. Illusionists, on the other hand, accept that it surely seems as if this phenomenal character exists but deny it actually does, and, if there is no phenomenal character, there is no hard problem.

Most find Illusionism to be highly implausible. The argument I present below is meant to make Illusionism more palatable. The argument uses a hypothetical scenario to present a kind of dilemma for phenomenal realists — it seems to force phenomenal realists to accept one of two difficult-to-defend conclusions. I am still unsure how challenging this dilemma is for Realists, but I am convinced it is non-trivial. In any case, the hypothetical scenario is an interesting one to think through.

I have not found previous work that discusses a scenario exactly like the one I describe below, though a few years ago Eric Thomas presented a similar thought experiment. Thomas imagines a non-conscious alien scientist studying cats. He begins with phenomenal realist assumptions and uses the thought experiment to present an argument for Dualism. The thought experiment I present below, on the other hand, imagines a non-conscious scientist studying humans. My argument begins agnostic about the Realist-Illusionist debate and ends with an argument for Illusionism.

The Non-Conscious Neuroscientist Scenario

Imagine a neuroscientist, Mary, who is non-human. We can, for example, imagine she is an alien or AI. Despite being non-human, Mary is highly intelligent and cognitively similar to humans in most ways except for one: unlike humans, Mary does not have phenomenal experience herself nor does she have any intuition that she has mental states with phenomenal character.

Mary has an intuitive psychology not unlike humans’, which disposes her to represent her own and others’ minds as consisting of states similar to what we call beliefs, desires, perceptions, intentions, etc. However, she understands these states in a way that is independent of any relation to phenomenal character. She may, for example, understand them purely in terms of their functional aspects, i.e., their causal relations to each other and to behavior, but Mary has no intuitive sense that these mental states have any special phenomenal character.

In addition to an intuitive psychology, Mary has extensively studied her own brain and human brains, so much so that she has a complete knowledge of all the physical-functional facts about her own and human brains. She knows all the facts about their physical-biological properties from the cellular level down to the level of atoms. She knows all of their computational-functional properties, from abstract cognitive levels down to fine-grained neural levels.

This knowledge, of course, includes a complete physical-functional understanding of why humans tend to say they have mental states with a mysterious phenomenal character. (For simplicity, I’ll call verbal reports about one’s own mental states with phenomenal character ‘p-reports’.) Mary knows all the cognitive-computational mechanisms that dispose humans to make p-reports, and she knows how these cognitive processes are implemented in the human brain. Mary has even identified the particular cognitive-computational states and processes that seem to be at the root of the human tendency to generate p-reports, i.e., the main source(s) of human p-reports.

What Should Mary Believe about Consciousness?

The first question to consider is: what should Mary believe (emphasis on ‘should’) about the nature of what people call phenomenal consciousness? In particular, should she believe people have mental states with phenomenal character or not?

Assume Mary begins agnostically. She initially assigns equal probability to Realism and Illusionism. She should then adjust her credences only in response to relevant rational argument and empirical data.

So, what arguments and data might there be for Mary to increase her credence in Realism?

The first point to make is that the mere fact that people insist they have experiences with phenomenal character is not evidence that they actually do. People’s p-reports are empirical data that are equally consistent with Illusionism and with Realism. Illusionism, like Realism, predicts that people should say they have conscious experience with mysterious phenomenal character, write books about these mysterious properties of consciousness, and so on. In order for p-reports to be evidence against Illusionism, they would have to be more probable under Realism than under Illusionism, but they are not.
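To see the underlying Bayesian point concretely, here is a minimal sketch in Python. The likelihood numbers are made up purely for illustration; the only assumption is the one stated above, that both views predict p-reports equally well, so observing them leaves an agnostic 50/50 prior unchanged.

```python
# Minimal sketch of the Bayesian point, with illustrative (made-up) numbers.
# If an observation E (people making p-reports) is equally likely under both
# hypotheses, Bayes' rule leaves the prior credences unchanged.

def update(prior_realism, p_e_given_realism, p_e_given_illusionism):
    """Return the posterior credence in Realism after observing E."""
    prior_illusionism = 1.0 - prior_realism
    evidence = (p_e_given_realism * prior_realism
                + p_e_given_illusionism * prior_illusionism)
    return p_e_given_realism * prior_realism / evidence

# Mary starts agnostic: 50/50.
prior = 0.5

# Both views predict that humans make p-reports, so the likelihoods are equal.
posterior = update(prior, p_e_given_realism=0.9, p_e_given_illusionism=0.9)
print(posterior)  # 0.5 -- observing p-reports does not move her credence
```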

Consider a related point: Mary has a complete consciousness-neutral, physical-functional description of the causes of p-reports that does not assume anything philosophically mysterious like phenomenal character exists. This description sufficiently explains why humans tend to generate p-reports, in the sense that the p-report-generating processes it describes entail that such behaviors should come about, just as any human behavior is entailed by some prior physical-functional state of the brain and body. Therefore, phenomenal character is not needed to provide a sufficient causal explanation of p-reports, and in this sense it cannot provide Mary with reason to believe phenomenal character exists.

The second point to make is that Mary’s knowledge of the human brain does not appear to provide reason for her to increase her credence in Realism. To see this, consider the possibility that the explanatory gap exists. The explanatory gap is the view that physical-functional facts about the brain do not entail anything in particular about phenomenal character. For example, it does not seem that certain physical-functional states of the visual cortex, say, should entail the existence of a color experience with particular phenomenal character rather than some other experience or no experience at all. Many philosophers believe this gap exists.

If we assume this gap exists, then from Mary’s point of view nothing in particular about phenomenal character, including the probability that it exists, is implied by her complete physical-functional knowledge of the human brain. If the explanatory gap exists, physical-functional facts about the brain cannot provide Mary with reason to increase her credence in Realism.

Next, consider the case where the explanatory gap does not exist — physical-functional facts about the brain do entail facts about the phenomenal character of experience. In this case, one might think Mary would have reason to favor Realism: if certain physical-functional facts about the brain entail that certain conscious states with certain phenomenal character should exist, and Mary understands these entailments, she would realize phenomenal character must exist.

However, it is hard to make sense of this idea. How could Mary even understand or conceptualize phenomenal character in the same way humans do? First, unlike humans, she cannot understand phenomenal character introspectively, since she has no experience herself. Mary, therefore, cannot understand phenomenal character under the common description “what it is like to have some experience from the first-person point of view”. Second, phenomenal character would not be mysterious to her. If certain physical-functional facts entailed that certain conscious states with certain phenomenal character exist, and Mary knows and understands the details of these entailments, she should not get the feeling that there is anything mysterious or deeply problematic about phenomenal character. She would not understand phenomenal character as “that aspect of consciousness that is deeply hard to explain in third-person, physical-functional terms” because no aspect of consciousness would be deeply problematic to explain in physical-functional terms from her third-person point of view.

Any concept of consciousness and phenomenal character cognizable by Mary, one she could deduce from her knowledge of the human brain, would thus be a deflated version of the conception used by human phenomenal realists. Mary’s resulting deflated concept of phenomenal character would instead seem to be perfectly consistent with the concepts of consciousness used by Illusionists, who define consciousness in some physical-functional way without reference to any introspectable, mysterious properties.

Thus, whether or not there is an explanatory gap, physical-functional facts about the brain do not seem to provide Mary with reason to believe mental states with phenomenal character (in the realist sense) exist.

(Note also there is a third possible scenario, where there is no explanatory gap but Mary does not understand the entailments between physical-functional brain facts and phenomenal character. In this case, Mary would face the same challenge as the case where there is an explanatory gap — she would not know that or how physical-functional facts about the brain entail facts about consciousness with certain phenomenal character and, therefore, these entailments would not provide her with reason to believe any phenomenal character exists.)

So, empirical knowledge about human verbal reports and the physical-functional properties of the human brain does not provide Mary with reason to increase her credence in Realism. If data about human behavior and the brain cannot do so, what can? The remaining possibility is a priori philosophical arguments against Illusionism.

The challenge with this approach is that, in order for such arguments to provide reason for Mary to update her beliefs, they cannot assume introspective access to consciousness. For example, unlike people, Mary cannot be swayed by the common argument ‘I know conscious states with a mysterious phenomenal character exist because I observe they exist in my own mind’ or ‘If I know anything, it is that I am having certain experiences right now with certain phenomenal character’. Mary has no such introspective access to conscious experience with phenomenal character, so such arguments should not persuade her. Furthermore, Mary should not be swayed by arguments that appeal to common sense, since it is not common sense to Mary (or any other intelligences like Mary) that phenomenal character exists.

What philosophical arguments could phenomenal realists appeal to that do not assume introspective access to phenomenal consciousness? It is unclear at best what these would be. It is far outside the scope of this blog to survey the various criticisms of Illusionism one by one, but I think the main challenge for Realists is that many, if not most, of the prominent criticisms of Illusionism appeal to introspection in some way. If an argument requires an introspective understanding of consciousness, it should be ineffective at swaying Mary, who has no introspective understanding of phenomenal consciousness.

For example, one approach, taken by Eric Schwitzgebel, is to try to define phenomenal consciousness in an “innocent” way that does not explicitly refer to phenomenal character or other mysterious properties. Instead, it defines consciousness by whatever property is shared by the mental states people call conscious. By removing the reference to phenomenal character from the definition, Illusionist views that make direct claims about phenomenal character can be avoided.

The issue here is that the shared property Schwitzgebel seems to have in mind is one that is identified upon introspection. Phenomenal consciousness, under his definition, “is the most folk psychologically obvious thing or feature that the positive examples possess and that the negative examples lack”. Such a folk-psychological definition would therefore not involve knowledge learned by scientifically studying the physical-functional facts about the brain. Instead, it would rely on feature(s) identified in the intuitive, common fashion, i.e., via introspection.

However, under such a definition, Mary cannot understand what humans are referring to by introspecting her own mind, since she does not introspect anything she is disposed to call phenomenal conscious states. Mary’s only way to try to identify what humans are referring to would be to study human judgments/verbal reports about consciousness and develop a scientific theory of what it is they are referring to when introspecting and reporting about phenomenal states. For example, if there is some physical-functional state of the brain (e.g., a source of p-reports) that tracks these judgments well, she could use this to develop a scientific, third-person description of what people are referring to when they claim to be introspecting phenomenal character.

However, Mary’s resulting concept of phenomenal consciousness would seem to be consistent with Illusionism. Whatever physical-functional properties Mary identifies would not be deeply mysterious — they would not resist objective, physical-functional explanation from Mary’s point of view since they are themselves physical-functional properties.

This seems to leave us in the same position as before: whatever Mary can identify with what people call phenomenal states would lead to a deflated, non-mysterious concept of consciousness totally consistent with Illusionism. Her understanding of consciousness would not lead her to believe there is a hard problem of consciousness. It would not lead her to think consciousness is any different from any other cognitive process.

While there are other anti-Illusionist arguments, I suspect most will somehow require an introspective understanding of consciousness, through some introspective reference to ‘this’ or ‘that’ aspect of consciousness. Since Mary cannot understand such properties introspectively, she can only try to understand them from the outside, by explaining the associated human verbal reports in terms of their neural-cognitive causes. She can always develop some physical-functional explanation for the reports, but such an explanation will not describe anything mysterious, anything that resists objective, physical-functional explanation. Whatever understanding/descriptions Mary ends up with will leave her with little reason to believe in human mental states with some mysterious phenomenal character.

Another issue for Realists is that there exist a priori philosophical arguments in favor of Illusionism that do not depend on introspection at all, such as Chalmers’ evolutionary debunking arguments and simplicity arguments in favor of Illusionism. Debunking arguments hold that there is an evolutionary explanation of why humans tend to make p-reports that does not assume those reports are true; thus, if they were true, it would be a matter of epistemic luck, which is implausible. Therefore, the best conclusion is that p-reports are false. Simplicity arguments point out that Illusionism is simpler than Realism, in that it posits fewer entities, and that Illusionism does not need to assume anything else exists to causally explain all observable physical-functional facts (due to the causal closure of the physical universe). In this sense, Illusionism should be preferred.

Importantly, neither of these arguments makes any reference to introspection. Chalmers’ debunking arguments only make claims about human behavioral dispositions, evolution, and beliefs. Simplicity arguments only make a simple point about the number of entities each theory claims exist and the causal closure of physics.

Mary’s cognitive set-up, that of having no phenomenal states, seems to remove the force from many common anti-Illusionist arguments, but it does not seem to remove any force from pro-Illusionist arguments. This suggests that a priori philosophical arguments should, at least, not alter Mary’s credences, and might even sway them in favor of Illusionism.

Mary’s Dilemma for Realism

Now, we come to the second question: why should we, humans, conclude anything different than Mary?

As far as I can see, Realists have two options:

Option 1: They can disagree with the points made in the last section and argue that Mary should not conclude Illusionism is at least as plausible as Realism; instead, she should conclude Realism is more plausible. For the reasons mentioned above, this position seems difficult to defend.

Option 2: Realists can argue that while Mary should conclude Illusionism is at least as plausible as Realism, humans should conclude Realism is more plausible than Illusionism. Taking this line of reasoning accepts that Mary’s reasoning is valid but holds that she lacks certain justification that humans have for believing Realism. A challenging question for those taking this option is: what is this justification?

There are a few ways Realists might try to answer this question. A simple answer for the Realist would take a Dualist approach and say Mary lacks knowledge of non-physical properties of consciousness. Mary’s mind has no such properties, so she cannot access them introspectively, nor can she gain knowledge of them by studying the physical brain. Human minds do have non-physical properties, which they can access upon introspection, and therefore humans have some knowledge/justification for believing phenomenal Realism that Mary lacks. The problem with this Dualist approach is that just about every scientist and philosopher would want to avoid such views, given all the well-known problems that come with Dualism.

Another possible response from Realists would be to simply stipulate that people have some special, certain knowledge of phenomenal character gained through introspection of their conscious states (whether those states are physically reducible or not). However, Illusionists like Dan Dennett and Keith Frankish have pointed out for decades that such views lack a good explanation. If an individual claims some special introspective knowledge or justification for their belief in phenomenal states, through statements like “I have privileged, certain knowledge of my own experiences and their phenomenal characters”, they have to define and explain the concepts they use, such as what the “I” is, what the “phenomenal character” is, and how the special knowledge relation between the two works.

Some philosophers, like Chalmers, have tried to respond directly to challenges like this one from Illusionists. However, these responses make controversial assumptions. For example, here is part of Chalmers’ response to a related challenge posed by Keith Frankish:

“In my view a (non-Russellian) dualist should probably endorse primitive subjects, holding that fundamental phenomenal properties are instantiated by fundamental entities (a kind of entity dualism or substance dualism).” (Chalmers, 2020)

This response falls back on Chalmers’ Dualist assumptions, which most researchers would find highly unpalatable.

What is needed for the Realist is some explanation of introspection that does not use the word “I” and does not assume controversial views like Dualism. One option of this sort would appeal to a mental state with a special relation to consciousness. For example, philosophers often appeal to the idea of ‘phenomenal concepts’, special concepts that are used during introspection to represent and think about consciousness and its phenomenal character. These concepts, the idea goes, have some special, direct relation to consciousness, but the way they work can be explained in physical-functional terms. Phenomenal concepts are typically used to counter dualism by providing an explanation for how an explanatory gap could exist without there being an actual metaphysical gap between the physical-functional brain and consciousness: phenomenal concepts represent consciousness in a distinct way from the descriptive concepts that represent its physical-functional properties. This creates a psychological gap between two kinds of mental representations of consciousness, which gives rise to the explanatory gap. Thus, there is a physicalist (non-dualist) explanation for how an explanatory gap can exist.

In the present context, proponents of the phenomenal concept strategy could say that Mary lacks consciousness and therefore lacks phenomenal concepts. Phenomenal concepts are usually described as having some direct relation to consciousness, so there would likely be some story here about how this direct relation provides people with justification for believing Realism is more plausible than Illusionism, justification which Mary, not having phenomenal concepts, would lack.

There are several issues with this approach. First, it is not enough to argue that it is possible for such phenomenal concepts (with special relations to consciousness that justify Realism) to exist. One would have to argue that it is more probable than not that they do exist. Remember, we are asking why humans should assign more credence to Realism than to Illusionism, rather than just assigning 50/50. It is not enough to say that there possibly are special phenomenal concepts that provide humans with justification for assigning more credence to Realism, since the mere possibility of such concepts is consistent with a 50/50 assignment of credences.

As far as I can tell, however, the best that proponents of the phenomenal concept strategy have done is show that it is possible phenomenal concepts exist. There is not an empirical research program around these ideas that has provided evidence for their existence, let alone for their special epistemic relations to consciousness.

Second, a key component of the phenomenal concept strategy is that such concepts can be explained in completely physical-functional terms. However, if such concepts are fully understandable in physical-functional terms, then Mary will understand everything about them, since she knows all the physical-functional facts about the brain (!), including how and in what way they provide humans with justification for believing Realism. But now there is a contradiction. Remember, we are considering how Realists could defend option 2 — the view that Mary should be pro-Illusionist but humans should be pro-Realist. If Mary has knowledge of the same facts that justify Realism for humans, then she has access to that justification too. Thus, the phenomenal concept approach does not seem compatible with option 2.

Nor does this strategy seem able to defend option 1, the view that Mary should place more credence on Realism. Suppose Mary learns of the special epistemic properties phenomenal concepts provide humans for believing in Realism. What are these? It is hard to imagine what these special epistemic properties could be, even if phenomenal concepts exist. Maybe something like phenomenal concepts exists that explains why people tend to have dualist intuitions and intuitions that there is an explanatory gap. But this is distinct from the claim that these concepts also provide some special knowledge of phenomenal character. What is needed is reason to believe these concepts involve some further special epistemic justification for Realism. Why think these concepts have special properties that would justify believing in the hard problem and in mysterious things like phenomenal character? It is not obvious. The only way I can see a Realist defending this view would be to beg the question, by assuming the existence of phenomenal character and then explaining some epistemic relation it has to phenomenal concepts.

Let’s consider one final possible response from Realists. Some Realists might argue it is simply obvious that humans have knowledge of phenomenal character, so obvious that stipulation is sufficient to justify this position. The burden is instead on Illusionists to explain why it is insufficient, just as one might be required to explain why something we take to be basic data or a widely held axiom should not, in fact, be considered so.

But, the Mary scenario does seem to reverse this burden. Remember, option 2 for Realists argues Mary should conclude Illusionism is at least as plausible as Realism, but humans should conclude Realism is more plausible. Does this position not cry out for an explanation? What knowledge/justification do humans have to believe in phenomenal character that Mary lacks and how do humans get it? How can it be that Mary knows so much (she knows all the physical-functional facts about the brain, remember) and reasons in a valid manner, yet she is somehow incorrect in her conclusions, while humans are correct? This requires a coherent explanation from Realists.

Conclusions

In summary, this thought experiment presents a kind of dilemma for Realists. It seems Realists must choose between one of two challenging positions:

  1. Position 1: Mary should favor Realism over Illusionism. The difficulty here is that 1) any behavioral data Mary can observe is equally consistent with Realism and Illusionism, 2) any brain data Mary can observe provides her with no clear reason to believe phenomenal character exists, and 3) the fact that Mary does not introspect phenomenal experience seems to, at best, complicate anti-Illusionist philosophical arguments and, at worst, remove most of their force.
  2. Position 2: Mary is correct in concluding Illusionism is at least as plausible as Realism, but humans should favor Realism due to some special knowledge/justification they have that Mary lacks. The challenge here is explaining how people acquire this special knowledge/justification. The claim that Mary can be reasoning correctly yet be wrong calls for an explanation beyond the simple stipulation that people know something she doesn’t, but Realists have yet to reach any consensus on an argument for why we should believe humans have some special introspective knowledge/justification for believing in phenomenal character.

If Realists cannot meet these challenges, then it seems we must agree with Mary and accept Illusionism is at least as plausible as Realism. In other words, we must take Illusionism seriously.

If there is some counter-argument, criticism, or anything else you think I have missed, I would love to hear it! Please post in the comments.

Why are Consciousness Researchers So Resistant to Illusionism?

Introduction

Most consciousness researchers are “realists”. They believe consciousness has a special “subjective character”, “what-it-is-likeness”, “phenomenal qualities”, etc., and that explaining these qualities, i.e., why and how they arise from neural processing, poses a very hard problem, maybe the hardest in all of science.

Illusionists, on the other hand, do not believe there exists any “subjective character”, “phenomenality”, etc., that needs to be explained, and, as a consequence, hold that there is no hard problem of consciousness. Illusionism has received increased attention from philosophers and scientists over the last decade, but illusionists are still in the minority. Most consciousness researchers do not find illusionism plausible, and many even find it absurd.

Why aren’t more consciousness researchers sympathetic to illusionism? This is a question some illusionists are asking themselves. I ask myself the same question, since I, over the last few years, have become increasingly confident illusionists are right. Though I am not a consciousness researcher, I did closely study the philosophy and science of consciousness as an undergraduate and master’s student, and wrote a master’s thesis on the subject.

Those of us confident in the illusionist view must ask ourselves: is the majority opinion of consciousness experts a good truth-tracking statistic, in which case illusionists like myself are likely wrong? Or might there be some other reason(s) for the field’s resistance to illusionism that is independent of illusionism’s plausibility?

Illusionists have argued previously that people naturally have anti-illusionist intuitions (e.g., see this from Francois Kammerer). It is very common for people to find illusionism not only implausible but absurd, suggesting there is a natural psychological anti-illusionist intuition. Philosophers like Keith Frankish have suggested this may be why illusionists are still a minority among consciousness researchers.

I believe this is at least part of the explanation. However, I think illusionists need more than this. One can still ask: even if people do have natural anti-illusionist intuitions, why are consciousness researchers failing to override these intuitions with reason (i.e., with philosophical argumentation in support of illusionism)? Presumably scientists and philosophers are precisely the ones best trained to follow reason (i.e., rational argument and the scientific method) wherever it goes, even if it leads to unintuitive or surprising conclusions. So why should we think scientists and philosophers are so influenced by such naive intuitions?

I want to suggest the following hypothesis: if people have anti-illusionist intuitions, then these intuitions are likely to be unusually strong among consciousness researchers relative to other researchers. This may help explain why consciousness researchers are so influenced by their pre-philosophical intuitions. It also motivates more empirical analysis of these natural intuitions among researchers who do not specialize in consciousness research.

The Meta-Problem of Consciousness

Before explaining this view, we need to discuss some terms. The meta-problem is the problem of providing cognitive-neural explanations of why people tend to think there is a hard problem. More specifically, it is the problem of explaining problem intuitions. Problem intuitions are people’s tendencies to judge consciousness has certain mysterious/problematic properties. For example, explanatory intuitions are tendencies to judge there exists an explanatory gap between consciousness and physical-functional descriptions of the brain. Metaphysical intuitions are tendencies to judge consciousness has non-physical properties. Knowledge intuitions are tendencies to judge we have some special first-person knowledge of our own experiences and their subjective character. And so on.

Anti-Illusionist Intuitions: A Psychological Explanation

Several Illusionists (e.g., here) have previously argued that one problem intuition that needs explaining is the common intuition that illusionism is false. It is quite natural for most people studying the philosophy and science of consciousness to find illusionism unpalatable at best and outright absurd at worst. And this intuition is present prior to hearing any of the philosophical arguments for or against illusionism, i.e., it is a naive, pre-philosophical intuition present upon hearing a description of the illusionist view. This does not mean it is wrong, just that it is not a result of considering philosophical arguments against illusionism. People just naturally feel the view is implausible, even prior to philosophical analysis. Philosophers like Frankish have suggested this intuition may be the reason consciousness researchers are, like most people, so resistant to illusionism: consciousness researchers, like most, will have this intuition, and it biases their views and research against illusionism.

The Sampling Bias Hypothesis

I find the idea that people have natural, anti-illusionist intuitions plausible.

However, I believe there is additionally a kind of selection bias in the population of consciousness researchers, i.e., consciousness researchers tend to have unusually strong anti-illusionist intuitions.

Here’s the basic argument:

  1. People naturally have anti-illusionist intuitions, which are independent of illusionism’s philosophical merits (i.e., they are pre-philosophical, naive intuitions).
  2. How much these anti-illusionist intuitions actually bias a researcher’s beliefs depends on the strength of the intuition: the stronger the intuition, the more probable and significant its effects.
  3. Consciousness researchers have unusually strong anti-illusionist intuitions relative to the general population of researchers.
  4. Therefore, consciousness researchers have an unusually strong bias against illusionism, and this bias is independent of illusionism’s philosophical merit.

Proposition 4 follows from premises 1-3. I think premises 1-3 are quite plausible. Let’s consider each in turn.

Premise 1: People naturally have anti-illusionist intuitions, which are independent of illusionism’s philosophical merits (i.e., they are pre-philosophical, naive intuitions). This should not be controversial. First off, by ‘pre-philosophical’ anti-illusionist intuitions I just mean intuitions that exist prior to hearing the arguments against illusionism. I think it is fairly uncontroversial that such intuitions exist, given that it is quite common for scientists and philosophers to immediately find illusionism absurd. Common arguments against illusionism often just begin with a general description of the view and conclude it is obvious such a view cannot be right, often on the basis of the common epistemic belief that “if we know anything, it is that we are conscious”.

Premise 2: How much these anti-illusionist intuitions actually bias a researcher’s beliefs depends on the strength of the intuition: the stronger the intuition, the more probable and significant its effects. This premise is essentially true by definition. If we have a tendency to believe idea X is false, then by definition there is a significant probability we will form the belief that X is false. A strong intuition that X is false essentially just means a high probability that one forms the belief that X is false.

Premise 3: Consciousness researchers have unusually strong anti-illusionist intuitions relative to the general population of researchers. This premise is less obvious and more complex. Why should we believe consciousness researchers have unusually strong anti-illusionist intuitions relative to the general population of researchers?

This is an empirical claim and my evidence is admittedly anecdotal, but I think many others in the field will have had a similar experience. As a student, I seriously considered pursuing a career as a consciousness researcher. I quickly found there are not many jobs for consciousness researchers, to say the least, in both philosophy (whose job market is already abysmal) and science. Further, there is still some stigma (though not as strong as it once was) around researching consciousness, especially in the sciences, which could limit one’s career prospects even if one is lucky enough to find a job where one can investigate consciousness.

Given this situation, why would anyone try to become a consciousness researcher? When it comes to practical concerns, there are basically no reasons to pursue a career as a consciousness researcher and many reasons not to. Thus, there must be some other factor(s) motivating such people to do so.

As a student, I studied in multiple philosophy departments with other students and philosophers working on consciousness, and in UC Irvine’s cognitive science department, which is known to have an unusually large community of consciousness researchers. My experience talking to such researchers strongly suggests to me that these researchers are motivated by an unusual fascination with consciousness, so significant that their fascination outweighs the practical concerns that come with pursuing such a career. This makes sense. If one was not totally taken and fascinated with consciousness there would be little reason to pursue a career as a consciousness researcher.

Where does this unusually strong fascination originate? Why is consciousness so interesting to these people? Consciousness is interesting for multiple reasons. It is, for example, closely tied to moral theory, i.e., an agent’s ability to experience conscious states like pain is important for judging their moral status. Consciousness is also tied to ideas related to meaning: what, after all, is the point of life without first-person experience?

However, what makes consciousness unique relative to every other natural phenomenon is how difficult it is to explain: the hard problem of consciousness is often said to be the hardest problem in all of science. This sense of mystery is what got me so obsessed with consciousness, and my sense from talking to other consciousness researchers is that it is what made them obsessed too. It is the deep mystery around consciousness that makes consciousness unique. It is what fills people like myself with a sense of wonder when thinking about consciousness. It is the reason people like myself get hooked on the subject and make risky career decisions to study it.

Notice, there is another way to describe this natural fascination with the deep mystery of consciousness: having a strong intuition/feeling that explaining consciousness is unusually difficult, which is equivalent to having strong problem intuitions. Problem intuitions are those tendencies related to judging there is a hard problem of consciousness (see Chalmers’ original paper). Thus, having strong problem intuitions, put simply, just means having a strong sense there is a hard problem of consciousness. And so another way to state what I have just argued is that strong problem intuitions are what drove many consciousness researchers to become fascinated with consciousness and to go into consciousness research.

Now, if one has strong problem intuitions, this would also entail that one has strong anti-illusionist intuitions. For example, having strong knowledge intuitions means having a strong sense that one has special knowledge of one’s own conscious experience, including its subjective character, etc. This supports premise 3: typically, one does not become a consciousness researcher unless one has unusually strong problem intuitions in the first place, and having strong problem intuitions, like knowledge intuitions, also means having strong anti-illusionist intuitions. Thus, consciousness researchers should be expected to have unusually strong anti-illusionist intuitions relative to the general population of researchers.

The result is that consciousness researchers will be much more influenced by their intuitions than the rest of the population of researchers, and they will be influenced in the direction of believing consciousness is deeply mysterious and that views that contradict this, like illusionism, are false.

Now, this argument does not necessarily mean realists are wrong, but it does raise the question of whether the popularity of realism is actually based on its philosophical plausibility rather than on some naive and unusually strong intuitions held among consciousness researchers.

Conclusion

Much of this argument for premise 3 is based on anecdotal data, i.e., my experience talking to graduate students, philosophers, and scientists working on consciousness. I think this anecdotal data is significant enough to justify proposing the hypothesis, but, of course, more empirical testing, possibly through surveys, will need to be done to determine how plausible it really is. I would love to see such testing.

An interesting implication of my argument is that researchers who focus on topics other than consciousness should naturally have weaker anti-illusionist intuitions than consciousness researchers. In my experience, there is plausibility to this implication as well. I find that neuroscientists and AI researchers who do not work on consciousness, in particular, are much more likely to view the mind through an objective, mechanistic lens than a subjective, experience-based one. The objective, mechanistic lens is much more conducive to the illusionist view: according to illusionists, once we have a nice mechanistic explanation of why we think there is a hard problem, we are done. No hard problem exists, so we only need to explain why we think one exists. One key example of an illusionist sympathizer in this community is Turing Award winner and Nobel laureate Geoffrey Hinton, who has explicitly stated his sympathies with illusionist thinking.

Whatever the case, I believe the arguments presented here and those previously made by illusionists suggest that consciousness researchers should do a bit of honest self-study. Why do they find illusionism so unpalatable? Is it really due to good philosophical arguments against the illusionist position? Or is it due largely to naive, pre-philosophical anti-illusionist intuitions that are unusually strong among consciousness researchers?

Consciousness, Illusionism, and the Existential Gap

By Nick Alonso

Philosophical discussions of consciousness took an interesting turn about a decade ago. Between the late 1970s and the early 2010s, most philosophers discussed the seeming tension between science and consciousness, i.e., the way the objective, physics-based explanations that neuroscience provides seem to fall short of explaining consciousness. Consciousness just seems to have a certain subjective character or what-it-is-likeness (e.g., the quality of what it is like to see red from the first-person viewpoint) that does not seem amenable to objective, physics-based explanation. What this might imply about the nature of science and consciousness was the main focus of this discussion.

More recently, philosophers began to focus increasingly on a different question: is our sense that consciousness has some special subjective character accurate? Or might we be under some sort of cognitive illusion? The view that we are under some sort of illusion is called illusionism. Illusionism holds that the mysterious properties of consciousness that lie at the root of the hard problem do not exist, they only seem to; therefore, the hard problem of consciousness does not exist.

The nature of the disagreement between realists and illusionists is quite fascinating. For most realists, the issue with illusionism is not just that it is wrong, but that it is absurd. This sense of absurdity almost always stems from a strong, basic intuition that we have some certain knowledge about our own experience that is undeniable. How, after all, could we deny we have a stream of first-person experiences?!

The aim of this post is for me to try to articulate the main reason I take illusionism seriously. I call this reason the existential gap. The basic idea is that from the third person point of view there is no good reason to believe that the mysterious properties at the root of the hard problem actually exist. Thus, from the third-person point of view, the one most scientists like myself privilege, there is no strong reason to believe in realism.

My ideas have been shaped by philosophers like Dan Dennett (e.g., here), Keith Frankish (e.g., here and here), Francois Kammerer (here), and scientists like Michael Graziano (here). However, I have yet to find a particular paper that frames a motivation for illusionism in terms of a kind of existential gap. If any readers know of literature that has already framed components of the illusionist position in this or a similar way, please let me know in the comments!

Preliminaries

First, some definitions. By ‘consciousness’ I am referring to what philosophers call ‘phenomenal consciousness’, or subjective experience. I am not referring to self-consciousness, wakefulness, or other meanings of consciousness. Phenomenal consciousness (henceforth just consciousness or subjective experience) is at the root of the hard problem.

Examples of subjective experiences include visual, auditory, and tactile sensations, emotional feelings, and mental imagery. There is something it is like to have these mental states: there is something it is like to see red, to taste coffee, to feel pain, and to dream. What these experiences are like from the first person point of view, e.g., what red looks like or what pain feels like to the experiencer, is sometimes called phenomenality, subjective character, or what-it-is-likeness of the state. Let’s call the what-it-is-likeness/subjective character of a conscious state its WIL property.

The hard problem can be framed in these terms: e.g., how does certain neural activity give rise to mental states with WIL properties? The problem is that nothing about what we know about the brain seems to imply certain neural activity must yield a particular WIL property (e.g., an experience of red with a particular subjective character rather than no subjective character or a different kind).

Illusionists accept that the hard problem and WIL properties seem to exist, but deny that they actually do. From this claim, Illusionists have two options. First, they can argue that if WIL properties do not exist, then consciousness does not exist, a view sometimes called eliminativism. Or they can argue that consciousness does exist, it just does not have any special WIL properties, a view we might call deflationism. My sense is that most illusionists take the deflationary route, saying consciousness does exist, it just does not have the mysterious properties it seems to. For more details on illusionism, I highly recommend the writings of Keith Frankish (e.g., here and here).

Science and The Existential Gap

The existential gap, which I develop below, contrasts with the widely discussed explanatory gap. The explanatory gap refers to the seeming gap between consciousness and objective, physical-functional descriptions of the brain. One way to frame this idea is that objective, physical-functional descriptions of the brain do not seem to entail facts about WIL properties (e.g., physical-functional descriptions of the visual system do not seem to entail facts about what it is like to see the color red from the first-person point of view).

The ontological gap refers to the potential metaphysical gap between brain and consciousness: if there is an ontological gap, the brain and consciousness are different kinds of things or have different kinds of properties: the brain has physical properties (location, mass, etc.), while consciousness has non-physical properties (e.g., subjective character/WIL properties).

Much of the philosophical debate over the last few decades has focused on the question of whether the explanatory gap entails or suggests an ontological gap. Both the explanatory and ontological gaps assume consciousness and its WIL properties exist.

Let’s, however, begin reasoning about consciousness under a more neutral assumption: WIL properties seem to exist to most people, but whether they actually do exist cannot be known prior to significant philosophical or scientific investigation. Let’s not just assume, in other words, prior to any philosophical or scientific investigation, that illusionists are wrong.

It is from this starting point, I will now argue, that we find ourselves faced with a different kind of gap, what I call the existential gap. A first approximation can be stated as follows:

  • The Existential Gap: objective, physical-functional descriptions of the brain and behavior do not entail or provide strong reason to believe WIL properties exist.

Notice how this differs from the explanatory gap. The explanatory gap states that objective, physical-functional descriptions of the brain do not entail facts about WIL properties and therefore do not fully explain them. The existential gap, on the other hand, states that objective, physical-functional descriptions of the brain and behavior do not entail or even provide strong reason to believe WIL properties exist.

The explanatory gap is a claim about the limitations of our understanding of consciousness, under the assumption WIL properties exist. The existential gap is a claim about the limitations of the justification for our belief in WIL properties, while making no initial assumptions about whether WIL properties exist or not.

With these basic distinctions in place, we can dive deeper into why and how the existential gap exists. Knowledge associated with consciousness that can be acquired from the third-person point of view comes from two sources of data: neural data and behavioral data. I begin with a claim about neural data.

  • Claim 1: there is no neural data that is difficult to explain in the deep philosophical sense that WIL properties seem to be. Nor is there anything observable in the brain that entails or provides strong reason to believe such properties exist.

When we look at a living, awake, conscious person’s brain with our own eyes or through some brain scanning or measuring device, we do not observe anything like WIL properties. Nor do we seem to observe anything that is hard to explain in the way WIL properties are supposed to be. I do not know of any philosopher or scientist who would deny this claim. Further, while the brain is complicated, there is no deep philosophical problem preventing our understanding (in principle) of its observable physical-computational processes like there is for WIL properties.

However, even if WIL properties are not directly observable in the brain from the third-person point of view, it could still be that their existence is implied or strongly suggested by something that is directly observable in the brain. The Higgs boson, for instance, was implied to exist by certain mathematical theories/models in physics before its actual measurement. Maybe WIL properties are similar. However, nothing we know of yet about the brain, and no well-justified theory that seeks to explain neural data, implies the existence of WIL properties or strongly suggests they exist. Those who disagree have the burden of finding a theory or model from neuroscience, one that does not assume the existence of WIL properties a priori, which does indeed entail their existence. As someone who has worked on formal theories from neuroscience, like predictive coding, I do not know of any such theory.

It is important to emphasize: if we assume WIL properties exist and that they are identical to some observable stuff in the brain, then that observable stuff implies the existence of WIL properties. However, such theories are not consistent with our neutral starting assumption about the existence of WIL properties. Formal cognitive and neural models that were built to fit neural data simply do not imply the existence of anything like WIL properties.

These same points apply to behavioral data.

  • Claim 2: There is no behavioral data that entails or strongly suggests the existence of WIL properties. This behavioral data includes people’s strong tendency to claim WIL properties exist, and any other associated behavior.

Consider an intuitive argument: the fact that people tend to claim they have conscious states with WIL properties needs an explanation, i.e., it needs a description of what causes that behavior. The most plausible explanation is that WIL properties cause people to claim they have WIL properties…duh.

Although this argument is intuitive, it is wrong. Any human behavior has a (causal) explanation that is entirely consciousness-neutral, i.e., an explanation that only involves concepts of neural or cognitive-computational mechanisms. There has never been an observed behavior that could not be accounted for sufficiently by the physical-computational happenings in the brain and peripheral nervous system. The physical-computational happenings in the brain do not provide reason to believe in WIL properties (see above), and their causal effects on behavior should not either. Your claim that you are conscious can be sufficiently explained in terms of your neural mechanisms and cognitive-level computations. There is no reason to think there is anything extra going on just because we tend to say WIL properties exist.

This may sound unusual to some. After all, what sorts of cognitive/neural processes would cause us to say WIL properties exist, even if WIL properties do not exist? Why would our cognitive systems do this? This question has now been discussed at length by philosophers and some cognitive scientists. For a review of various ideas, see David Chalmers’ paper on the meta-problem. The general idea is that certain cognitive systems in your brain make judgments about your own mental states. Due to certain quirks in the way your brain represents information about itself and the physical world, and quirks in the functioning of your cognition, your cognitive systems draw certain faulty conclusions about the nature of your own mental states, not unlike the way your visual or auditory systems have quirks that lead them to create certain faulty perceptual representations (e.g., optical illusions). In the end, the fact that people tend to say they have mental states with WIL properties only entails that there is some cognitive-computational process, implemented in neural mechanisms, causing this behavior. It does not alone entail that WIL properties exist, nor does this behavior alone provide a clear, strong reason to believe there is something beyond the cognitive and neural processes that cause it.

So neural data and behavioral data do not provide strong reason to believe in the existence of WIL properties. Then why do most scientists and philosophers interested in consciousness seem to believe in WIL properties? I cannot speak for everyone, but every argument I have personally heard from scientists and philosophers, and the assumptions they seem to implicitly make, always reverts back to first-person experience, e.g., they will say something along the lines of “If I know anything it is that I have conscious states with subjective character! It is undeniable knowledge that it is like something for me to see red, taste coffee, etc.!”

  • Claim 3: there has yet to be a plausible explanation of how a brain gains infallible knowledge of something like WIL properties.

Remember, we started with a neutral assumption about whether WIL properties exist. Simply asserting that one knows they exist prior to empirical or rigorous philosophical investigation violates this assumption. However, we can still ask whether there is some cognitive system in the brain that makes (nearly) infallible judgments about our own mental states. We can investigate such a system empirically, and if such a system seems to exist, and it concludes WIL properties exist, this might provide some reason to believe in WIL properties.

But how would such an infallible system even work? Appeals to an “I” that “knows” with certainty about its own WIL properties are not helpful. After all, what the heck is an “I”, and how does this perfectly reliable knowledge relation between the ‘I’ and WIL properties even work? These are questions illusionists like Dennett have been asking for decades (see also here by Frankish).

More plausibly, there is no special ‘I’ in the brain with special, infallible knowledge, a point Dan Dennett (e.g., here) has been making for decades. Most cognitive scientists hold that there is some sort of dedicated meta-cognitive system in the brain tasked with identifying and reasoning about our own and others’ mental states. The most plausible view of such a system is that it is fallible. Sure, maybe our meta-cognitive system has some privileged access to our own mental states that outside observers do not, but it will not have infallible access. All cognitive systems are imperfect, due to time, computation, and memory constraints, etc.

Further, we know certain systems in the brain, although highly useful, systematically misrepresent things. The human perceptual system systematically falls prey to certain illusions: present the human visual system with a certain kind of visual stimulus, and the subject will claim something is in the environment that is not. It is therefore perfectly possible that the same is true of meta-cognition: present our meta-cognitive system with a certain kind of input, and the subject will claim something is in their mind that is not.

If we can be deceived about the external world because of quirks in our perceptual systems, we can also be deceived about our own internal mental world because of quirks in our meta-cognitive system. Appeals to first person experience are not an argument or reason against this point, but simply an assertion of the conclusion one is trying to support. Again, we are left with no convincing reason to believe WIL properties exist.

Conclusions

The existential gap implies something important: for the many scientists like myself who believe our best bet at understanding nature is to take up the third-person point of view and apply the scientific method, Illusionism is not only far from absurd but seems inescapable. Nothing we can understand about nature from the third-person point of view implies or even strongly suggests WIL properties actually exist.

Those who vehemently oppose illusionism tend to do so on the basis of their first person experience, claiming that from the first-person point of view, they/the mysterious ‘I’ have certain knowledge of the existence of WIL properties: WIL properties are data that need to be explained by science, period. The problem is that these claims are made without providing any scientifically plausible account of what this ‘I’ even is or how it could come to have certain knowledge of WIL properties. Without such an account, it seems a better starting point is to investigate scientifically whether WIL properties actually exist. However, once we do this, the existential gap presents itself, and we seem unable to find our way back to any great justification to believe in WIL properties or the hard problem.

The Meta-Problem Test for AI Consciousness

by Nick Alonso

Artificial intelligence (AI) is rapidly improving and its progress shows no signs of slowing. Our understanding of consciousness, on the other hand, is limited and is progressing at a snail's pace. This situation raises an obvious question: how should we judge whether advanced AI are conscious, given that our understanding of consciousness is, and will likely continue to be, so limited?

Recent work has proposed two ways to approach this question: 1) use our best scientific theories of consciousness as a basis to make such judgments, or 2) devise and use a behavioral test to make such judgments. Both approaches, as I note below, face challenges.

In this post, I develop the beginnings of an alternative test for AI consciousness based on the ‘Meta-Problem’ of consciousness that does not fit cleanly in either of these categories. The meta-problem is the problem of explaining why people think they are conscious. For reasons I outline below, I believe this novel approach has interesting advantages over the other main approaches. I sketch the foundational ideas here in the hopes it could serve as a basis for further development.

Preliminaries

Before diving in, I need to define terms. The word ‘consciousness’ is ambiguous in the sense that people use it to refer to several different mental phenomena. ‘Self-consciousness’ refers, roughly, to the ability to mentally represent one’s own body and internal workings and to distinguish one’s self from the environment. A mental state or information in the brain is ‘access conscious’ if it is made widely available to a variety of cognitive systems. These two common definitions of consciousness are not what I will be discussing here. These are not especially mysterious or deeply difficult to identify in AI.

The sort of consciousness I will be discussing is what philosophers call ‘phenomenal consciousness’ or ‘subjective experience’. Examples of subjective experiences include first person experiences of color, pain/pleasure, sounds, tastes, smells, and emotions. For reasons I will not be getting into here, this is the mysterious sort of consciousness that has puzzled, and continues to puzzle, philosophers and scientists. The question of whether some AI can be conscious under this definition can be framed as, does the AI have first person experiences of the world? Does it have an inner mental life? Or is it all ‘dark inside’ for the AI, devoid of inner experience?

Most scientists and philosophers who study the subject will agree that we are far from an agreed-upon theory of the nature of subjective experience and its relation to the brain, which creates an obvious challenge when trying to identify consciousness in AI.

This challenge is concerning from a moral point of view since many moral philosophers agree an agent’s moral status (i.e., what moral rights it does or does not have) depends in important ways on what conscious states it is capable of having, especially on its ability to have experiences with a pain/pleasure component. For a good discussion of this concern, I recommend the writings of philosopher Eric Schwitzgebel (e.g., here).

Background

I find it useful to split the problem of identifying consciousness in AI into two problems:

1) Identifying whether AI can be conscious. Following philosopher Susan Schneider, this is the problem of determining whether silicon (or otherwise non-carbon based) AI can support conscious experience, in principle. Some theories of consciousness entail artificial, non-carbon based systems can be conscious. Other theories disagree.

2) Identifying which AI systems are conscious. Even if we assume/discover non-carbon based artificial systems could support consciousness, in principle, there still remains the question of which subset of artificial systems are conscious and to what degree. This problem is analogous to the problem of identifying which subset of animals are conscious and to what degree, given we know biological-carbon based brains can support consciousness.

Philosophers and scientists who study consciousness commonly believe silicon can support consciousness, though agreement certainly is not universal. I suspect support for this belief will only grow in the coming years, and I make this assumption in what follows. Under this assumption, problem 2, the problem of which AI are conscious, becomes the focus.

How might we approach identifying which AI are conscious? Possible approaches are typically categorized into two types.

The theory-driven approach. This approach aims to use our best scientific theories of consciousness as a basis for making judgments about AI consciousness. For example, Butlin, Long, et al. recently showed that a handful of the leading theories of consciousness all seem to agree that certain computational properties are important for consciousness. They argue that these commonalities can be treated as ‘indicators’ of consciousness and used to make judgments about which AI are and are not conscious.

Challenge: The main challenge for theory-driven approaches is a deep lack of consensus around the scientific methods for studying consciousness, around the requirements for a proper theory of consciousness, and around a theory of consciousness itself. This lack of consensus and these shaky foundations suggest we should have limited confidence in even our leading theories of consciousness. For example, although Butlin, Long et al. provide a well-developed theory-driven test for AI consciousness based on several popular theories, it is unclear how much trust we should put in the indicator properties pulled from these theories.

The theory-neutral approach. Theory-neutral approaches avoid the difficulties of the theory-driven approach by staying largely neutral with respect to scientific and philosophical theories of consciousness. Instead, theory-neutral approaches typically devise some sort of behavioral test that could help us determine whether some AI is conscious. One example, proposed by Susan Schneider and Edwin Turner, argues that if we train an AI model such that it is never taught anything about consciousness, yet it still ends up pondering the nature of consciousness, there is sufficient reason to believe it is conscious. Schneider and Turner imagine running this test on something like an advanced chatbot by asking it questions that avoid using the word ‘consciousness’, such as ‘would you survive the deletion of your program?’ The idea is that in order to provide a reasonable response, the AI would require a concept of something like consciousness, and the concept would have to originate from the AI’s inner conscious mental life, since the AI was not explicitly taught the concept during training.

Challenge: Philosophers have challenged theory-neutral approaches like this on the grounds that it seems possible, under a significant variety of views about consciousness, for a non-conscious AI to learn to act as if it is conscious, even when the AI is not explicitly taught to do so. Behavioral tests like those mentioned above would be unable to distinguish non-conscious AI that learn to talk as if they are conscious from truly conscious AI. The reason theory-neutral approaches have this difficulty seems to be that they are too insensitive to the computational mechanisms causing the verbal reports, leaving open the possibility for non-conscious systems/cognitive mechanisms to generate behaviors that mimic those of conscious systems. To add to this problem, most of these tests rely on verbal reports and thus only apply to AI that can respond verbally.

The Meta-Problem of Consciousness

Below I present an alternative test for AI consciousness which can be interpreted as a middle point between theory-neutral and theory-driven approaches. The rough, first approximation of the test can be summarized as follows: if an AI says it is conscious for the same cognitive-computational reason that humans do, there is sufficient reason to believe the AI is conscious.

In order to develop this initial idea into a more rigorous and philosophically grounded test, we need to unpack a bit more what this statement means. First, what do I mean by “an AI says it is conscious for the same cognitive-computational reason that humans do”?

Humans who reflect on their own minds tend to conclude they are conscious. We conclude we have a stream of first-person experiences, an inner mental life. We often conclude this stream of first-person experience seems distinct from the neural mechanisms that underlie it, and we draw other related conclusions.

Now, why do people who reflect on their own minds tend to think these things? The standard line of thinking goes something like this: we think such things, of course, because we are conscious! We have conscious experiences, and these conscious experiences cause us to think we have conscious experiences.

This is the intuitive explanation, but it is not the only one. Philosophers and cognitive scientists have, in particular, shown that there exist consciousness-neutral explanations of the same behavior. That is, there exist explanations of why we think we are conscious that do not involve the term or concept of phenomenal consciousness.

Here is the basic idea: every behavior, including the behavior of saying you are conscious, is caused by some internal neural process, which implements some more abstract, cognitive-level computations. A description of a behavior’s neural and cognitive causes is what cognitive scientists and neuroscientists count as an explanation of the behavior: if we fully describe the neural and cognitive-computational processes that generate some common behavior X, then we have a (causal/mechanistic) explanation of why we observe X.

This line of thinking applies to any behavior, including common human behaviors associated with consciousness, such as our tendency to say we have a stream of first-person experiences. Put simply, there is some neural process, which implements more abstract cognitive level computations, that causes behaviors associated with consciousness. Thus, we can explain these behaviors in a consciousness-neutral way, in terms of these neural and cognitive mechanisms.

The problem of explaining why we think we have mysterious conscious states has been given a name and developed into a research program by philosopher David Chalmers, who calls it the meta-problem of consciousness. The meta-problem gets its name from the fact that it is a problem about a problem: it is the problem of explaining why we think we have conscious states that are problematic to explain.

What is nice about these recent developments on the meta-problem by Chalmers and others is that they provide ideas which, as I will explain, are useful for setting the foundation for a test for AI consciousness based on our preliminary idea above (i.e., the idea that if an AI claims it is conscious for the same cognitive reason people do, it should be identified as conscious). In particular, I focus on three ideas developed around the meta-problem and use them to develop a test for AI consciousness.

First is the idea that our tendency to say or judge we have consciousness involves two parts: a lower-order mental model, usually some sort of perceptual-like model, and a higher-order model that represents the lower-order model (see Chalmers, 2019, pp. 40-45). The higher-order model can be described as the high-level ‘thought’ about the lower-order model, and the lower-order model can, in this way, be thought of as the mechanistic source of our thoughts about consciousness. To simplify terminology, I will just call this mechanistic source of our judgments about consciousness the c-source.

To make this more concrete, consider an example of a c-source which comes from attention schema theory (AST). AST is one of the most well-developed approaches to the meta-problem. AST claims the brain has a simplified model of attention, an attention schema, which it uses to control attention. This simplified model represents attention as a simple mental relation. So, when our cognition reads out the content from this model, it concludes we have some simple, primitive mental relations between ourselves and features of the world. We call these simple mental relations ‘awareness’ or ‘experience’ or ‘consciousness’, and conclude they seem different from, and non-reducible to, the physical stuff in our brains. Now, AST may be incorrect, but it provides a clear illustration of what a c-source could be, i.e., a simplified mental model of attention.

The second idea is the hypothesis that our c-source must be very closely related to consciousness. Now, this point is not essential to the meta-problem. However, Chalmers argues for this point (see 2019, pp. 49-56), and it makes some sense. Most people assume conscious experience is itself the source of our claims that we are conscious. As such, it would be highly surprising if the c-source, which plays roughly the same causal role in cognition as consciousness, had no essential relation to consciousness. For example, if AST is true, it would be very surprising if we concluded attention schemas had nothing to do with consciousness, since AST seems to entail that a thought like ‘I am having conscious experience X’ is actually referring to an attention schema (and/or its contents)! The c-source is in some sense what we are thinking about when we think about consciousness, and it thus seems that, in order to avoid certain philosophical problems (explained in the meta-problem paper but not here), a theory of consciousness must assume there is some essential relation between consciousness and the c-source.

The third idea is a general/high-level description of what the c-source likely is. Now, there is no consensus around a solution to the meta-problem. However, dozens of ideas have been developed, in addition to AST, which can be used as a starting point. Similar to the theory-driven approach of Butlin, Long, et al., we can look at promising approaches to the meta-problem and ask whether there are any commonalities between them that can be used as a basis for an AI consciousness test. Fortunately, Chalmers provides an extensive review of proposed solutions to the meta-problem, summarizes some common threads between them, and synthesizes ideas he found promising, which I find promising as well. Here is his summary:

We have introspective models deploying introspective concepts of our internal states that are largely independent of our physical concepts. These concepts are introspectively opaque, not revealing any of the underlying physical or computational mechanisms. Our perceptual models perceptually attribute primitive perceptual qualities to the world, and our introspective models attribute primitive mental relations to those qualities. We seem to have immediate knowledge that we stand in these primitive mental relations to primitive qualities, and we have the sense of being acquainted with them.

Chalmers, The Meta-Problem of Consciousness (2018, p.34).

There is a lot to unpack in this passage. Most of it is out of the scope of this post. The part important for understanding the c-source, Chalmers suggests, is that it likely involves simplified mental models of features/qualities in the world and simplified models of mental relations. The idea is our brain models certain features in the world (e.g., color) as primitive (irreducible) qualities, and our brain models certain mental relations as primitive (irreducible) relations between us and these qualities. This gives us the sense we have direct first person experiences of qualities like, e.g., color, sound, and tactile sensations. These simplified models are thus our c-sources. This idea may not be completely right, but it is a nice starting point that encompasses the claims of a range of proposed solutions to the meta-problem, including prominent ones like AST.

Summary of the foundational ideas for the MPT:

  • There is a cognitive source of our tendency to judge we are conscious (what I call the c-source), which can be described in consciousness-neutral terms.
  • There are a priori (philosophical) reasons to believe the c-source has an essential link to consciousness.
  • An initial theory that encompasses ideas from a range of hypotheses of the c-source says the c-source is a set of simplified/compressed mental representations of features in the world and our mental relations to the features.

The Meta-Problem Test for AI Consciousness

I will now use these ideas to construct what I will call the meta-problem test (MPT) for AI consciousness. After presenting the MPT here, I discuss its advantages over the other approaches. My proposal for the MPT rests on the following assumption:

The Foundational Assumption for the MPT: the presence of the c-source in an AI provides sufficient reason to believe the AI is conscious (under the assumption silicon-based systems can be conscious, in principle). The absence of the c-source is sufficient reason to believe an AI is not conscious.

This assumption does not claim the presence of a c-source is sufficient for consciousness, just sufficient for us to believe consciousness is present. Thus, the claim it is making about consciousness is relatively weak, as it makes no specific claims about the necessary and sufficient conditions for consciousness, what consciousness is, or its metaphysical relation to the brain. It just says the presence of the c-source is evidence enough for consciousness, and its absence is evidence enough for consciousness’ absence.

This assumption is supported by the second idea discussed above, which is that the c-source likely has some essential relation to consciousness. Further philosophical work is needed to determine the plausibility of this point. As noted above, Chalmers provides some arguments as to why we should think there is some direct, essential relation between consciousness and what I am calling the c-source (see pp. 40-45). I will not go into the details of these arguments here but will try to in future posts. For now, I only note that this foundational assumption has prima facie plausibility.

Under this assumption we can lay out the basic steps for implementing the MPT:

  1. Put cognitive scientists to work on the meta-problem until the field converges to an explanation of what the c-source is in people, using Chalmers’ suggestion, and related theories, as a starting point.
  2. Develop systematic methods for identifying the c-source in AI.
  3. Judge those AI with the c-source to be conscious. Judge those AI without the c-source to be unconscious.

The Advantages of the MPT

How does the MPT compare to more standard theory-driven and theory-neutral approaches? Interestingly, the MPT does not cleanly fit into either category.

Like the theory-driven approach the MPT proposes using a scientific theory of a mental process as a basis for making judgments about AI consciousness. However, unlike the theory-driven approach, the mental process this theory is about is not consciousness itself, but instead a cognitive process that is very closely related to consciousness, the c-source.

Like theory-neutral approaches, the MPT remains relatively neutral w.r.t. scientific and philosophical theories of consciousness, with the exception of its foundational assumption. However, unlike typical theory-neutral approaches, the MPT still relies on a test for a certain mental process (the c-source) based on scientific theories of what that mental process is. Also, the MPT is a cognitive-based test: it tests whether a certain cognitive process/representation is present, and if it is, we judge consciousness to be present too, whereas theory-neutral approaches tend to be more behavior focused and less sensitive to the cognitive mechanisms generating the behavior in question.

The MPT thus does not fit neatly into either category, yet it shares some properties with both approaches. Interestingly, this middle ground may allow the MPT to avoid the main problems associated with theory-neutral and theory-driven approaches, while keeping some of their advantages.

More specifically, the advantages of the MPT over theory-driven approaches are as follows:

  1. The MPT does not rely on any scientific theories of consciousness, and therefore avoids the uncertainty that necessarily comes with these theories and their assumptions. Although the MPT does rely on the assumption that the presence of the c-source is sufficient reason to believe consciousness is present, this assumption is much weaker, and therefore should be more easily defended, than the stronger more specific claims of scientific theories of consciousness.
  2. The theories the MPT does rely on attempt to explain a cognitive process, the c-source, which is not philosophically problematic like consciousness. As such it is much more likely we can make progress on finding widely accepted theories of the c-source in the short-term than we can with theories of consciousness.

The advantages of the MPT over theory-neutral approaches are as follows:

  1. Arguably, the central issue for behavioral-based, theory-neutral tests is that such tests cannot reliably enough distinguish non-conscious AI that behave as if they are conscious from genuinely conscious AI. This issue is largely a result of the fact that these tests are too insensitive to the cognitive processes generating the relevant behaviors. The MPT, alternatively, is a cognitive-based test that only identifies those AI with a certain kind of cognitive process to be conscious. If the foundational assumption of the MPT is correct, the MPT avoids the central issue facing behavioral-based tests, as it will be able to reliably distinguish non-conscious AI that merely behave as if they are conscious (those that behave conscious-like but through a process unrelated to the c-source) from those that actually are conscious (those that behave conscious-like through a process rooted in the c-source).
  2. The MPT, unlike many proposed theory-neutral approaches, does not rely solely on language generation. The MPT can also use non-language based tests. For example, if AST turns out to be true, we could test for the presence of attention schemas in an AI by having it perform tasks that require a certain kind of control over its attention, which only an attention schema can provide. Further, we can also perform the MPT using non-behavioral tests which directly probe the AI’s computational mechanisms for the c-source, e.g., directly study an AI’s artificial neural network for the presence of self-representations of attention. This allows us to test for consciousness in a wider variety of AI than linguistic tests allow for, and it allows us to support our judgments about AI consciousness with a wider variety of evidence than linguistic and behavioral tests can alone.

Current Limitations of the MPT

The MPT is not without limitations:

  1. The MPT approach is based on the assumption that the c-source is a very reliable indicator of consciousness. Although there are, in my opinion, good philosophical arguments for this assumption, more discussion is needed to determine whether it can be defended well enough to justify the use of the MPT.
  2. The MPT approach requires that cognitive scientists actually make progress on the meta-problem. Very few cognitive scientists and consciousness researchers are currently working on the meta-problem directly. It remains to be seen if consciousness scientists will shift focus in the near-term.

Conclusions

  • The MPT offers a kind of middle point between typical theory-driven and theory-neutral approaches for AI consciousness tests.
  • As such, the MPT seems to retain the best aspects of both, while avoiding the problems associated with each. This suggests the MPT is a promising avenue for further development.
  • The success of the MPT depends heavily on the foundational assumption, that the c-source has some essential tie to consciousness. This assumption has prima facie plausibility but further analysis of it is needed.
  • The success of the MPT also depends on cognitive scientists converging toward a consensus on the c-source. This will only happen if significantly more scientists work directly on the meta-problem than there currently are.

Toward Loss Functions that Better Track Intelligence

by Nick Alonso

The standard way of measuring performance in deep learning is through a test loss: a neural network is trained to perform some task, and the goal is to get the network to reduce the loss as much as possible on a subset of the data that is not presented during training (i.e., the test set). A low test loss means the model performs the task well when applied to new data points that were not previously observed. The test loss, however, does not account for how quickly the model learns. It provides a measure of how well the model is performing at the time the test is performed but ignores the time and effort it took the model to get to that performance.

This way of measuring performance is well and good if our only interest is to have some model eventually reach a desired state of competence. However, we often have an interest in the computation, energy, and time costs associated with training the model. Further, this loss measure seems to me to be largely indifferent to an essential aspect of intelligence: learning efficiency. Simple examples and existing theories of intelligence make clear that intelligent systems are not just those that eventually reach some state of high competence on some set of tasks. They are those that also do so efficiently, where ‘efficiency’ roughly corresponds to the amount of data the model needs to learn a task, which is also correlated with training speed, energy costs, etc. If this is right, it has interesting implications, e.g., recent versions of GPT may not be as intelligent as they seem at first impression. These models may be incredibly competent at generating human-sounding language in response to prompts and incredibly knowledgeable, but they are just too data-hungry to be near the intelligence of more efficient learners, like humans and some complex animals.

Why does learning efficiency matter for intelligence? Here’s one example. Most people think child prodigies are highly intelligent. Consider the chess prodigy who is playing competitively with professionals three times his age, or the child who graduates high school at age nine. Why is it so obvious these children are smarter than most others? It is not the knowledge and skills they have. That is, it is not their competence at certain tasks that is so impressive. Though the chess prodigy may be very good, there exist many professionals with equal or better skill, and it is unlikely the nine-year-old high school graduate has more factual knowledge or skills than much older people who have graduated high school and college. The reason we universally acknowledge the intelligence of these children is how efficiently they learn compared to most other people. They acquire more knowledge and skill from less data and less practice than just about everyone else. We admire not the end result, but the ease and speed with which they achieved the end result. Thus, any theory of intelligence must account not just for knowledge/competence/skill, but also for the efficiency with which that knowledge and skill is acquired. Examples like these have motivated some researchers to develop formal theories of intelligence that explicitly take this intuitive notion of efficiency into account (e.g., [1]).

The test loss measure of performance, therefore, completely misses a crucial ingredient of intelligence: it is not sensitive to how efficiently the model learns. How might we better track intelligence? We could begin with a formal theory of intelligence to develop a measure. In practice, however, some of these theories may be a bit cumbersome, since many require expensive computations, which may limit our ability to frequently track a model’s progress. Further, these theories make very specific claims and do not all agree, which raises the question of which theory, if any, gets all the details right.

Alternatively, I suggest the cumulative loss may provide a simpler, easy-to-compute measure that tracks intelligence better than the test loss. I do not claim the cumulative loss is a measure of intelligence, but I think it may better correlate with intelligence than the test loss. The cumulative loss is simple. Let \mathcal{L}^t be some loss measure computed at training iteration t given the data at t and the parameters \theta^{t-1} computed from the previous iteration. The cumulative loss is just

\mathcal{L}_{cumulative} = \frac{1}{T} \sum_{t=1}^T \mathcal{L}^t

That is, the cumulative loss is just the losses computed at each training iteration averaged together. Notice that this loss is very easy to compute and is task-general. Unlike the test loss, it is sensitive to how the model performs early in training. Consider two models/neural networks. Model 1 learns slowly, so it has high loss early in training, while model 2 trains quickly. Both converge to the same loss by the end of training. As desired, the second model (the more efficient learner) will have a lower (better) cumulative loss, since it has lower losses early in training and these losses are incorporated into the cumulative loss.
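To make this concrete, here is a minimal sketch, assuming PyTorch is installed, that tracks both measures on a toy regression problem; the dataset, architecture, and hyperparameters are illustrative placeholders rather than anything from the post.

import torch
import torch.nn as nn

# Toy regression data and a small network; these are placeholder choices.
torch.manual_seed(0)
X_train, y_train = torch.randn(1000, 10), torch.randn(1000, 1)
X_test, y_test = torch.randn(200, 10), torch.randn(200, 1)

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

iteration_losses = []
for t in range(0, len(X_train), 50):                  # mini-batches of 50, one pass over the data
    xb, yb = X_train[t:t+50], y_train[t:t+50]
    loss = loss_fn(model(xb), yb)                     # loss under the previous parameters
    iteration_losses.append(loss.item())              # record it before updating
    opt.zero_grad(); loss.backward(); opt.step()

# Cumulative loss: average of the per-iteration losses (sensitive to early training).
cumulative_loss = sum(iteration_losses) / len(iteration_losses)

# Test loss: performance at the end of training only.
with torch.no_grad():
    test_loss = loss_fn(model(X_test), y_test).item()

print(f"cumulative loss: {cumulative_loss:.3f}, test loss: {test_loss:.3f}")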

Now consider the case where both models learn with equal efficiency (e.g., both converge or nearly converge after the same number of updates) but model 1 converges to a better loss. Model 1 will have a lower cumulative loss in this case, which seems to correctly track the model we would say is more intelligent.

Finally, consider the case where model 1 converges much more quickly than model 2, but model 2 converges to a better loss by the end of training. Which model is more intelligent? My intuition is that it is not clear, and the same is true of the cumulative loss: from the details given, it is not clear which model would have the better cumulative loss. More details are needed: how much more quickly model 1 reduced the loss compared to model 2, how much better model 2's final performance was, and how long the two trained for. I suspect most people will have mixed feelings in these cases about which model is more intelligent, which suggests it is a theoretical gray area. In such gray areas, a loss measure that tracks intelligence should assign the two models similar values, and this is just what the cumulative loss tends to do here, which is why we cannot tell which model has the higher cumulative loss from the details given. It would be interesting to see empirically how well this measure tracks common judgments and various theories of intelligence, but even from this bit of analysis it looks like a good start.

References

Chollet, F. (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547.

Online Machine Learning: What it is and Why it Matters

Introduction

In the field of deep learning, the standard training paradigm is offline learning. During offline learning 1) parameter updates are mini-batched (i.e., updates from multiple datapoints are averaged together each iteration) and 2) training is performed for multiple epochs (i.e., multiple passes over the training dataset). The goal is to get the neural network to minimize some loss function on a test/hold out set of data as much as possible by the end of training. The tacit assumption behind offline training is that neural networks are trained first on a stored/locally generated dataset, then deployed to perform the task on some new data (e.g., only after AlphaGo was trained did it then compete against professional human Go players in publicly viewed matches).

Though offline learning is currently standard, I’ve recently come to appreciate the importance of online learning, a learning scenario which is distinct in important ways from offline learning. Online learning better describes the learning scenario humans and animals face, and it has an increasing number of applications in machine learning. However, it seems to me most researchers in deep learning and computational neuroscience are not working directly on, or thinking deeply about, online learning. Part of the issue may be that the term ‘online learning’ is often not well defined in the literature. Additionally, I suspect many believe the differences between offline and online learning are not mathematically interesting, or they believe that the best performing algorithms and architectures for offline scenarios will also be the best in online scenarios. However, there are reasons to believe this is not quite right. In what follows, I describe what online learning is, how it is different in interesting ways from offline learning, and why neuroscientists and machine learning researchers should care about it.

What is Online Machine Learning?

First, it should be said that ‘online learning’ is not synonymous with ‘continual learning’. Continual learning is the learning scenario where a model must learn to reduce some loss across multiple tasks that are not independent and identically distributed (i.i.d.) (Hadsell et al., 2020). For example, it is common in continual learning scenarios to present tasks sequentially in blocks, such that the model is first presented with data from one task, then the data from a second task, and so on. The difficulty is in preventing the model from forgetting previously learned tasks when presented with new ones. Continual learning is a popular topic in deep learning, and many high-performing solutions have been proposed.

Formal work on online learning often assumes the data are i.i.d., although combined online-continual learning scenarios have also been worked on (e.g., Hayes et al. 2022). Informally, online learning can be described as having at least the following properties: 1) at each training iteration a single datapoint is presented to the model, 2) each datapoint is presented to the model only once (one epoch of training), and 3) the model’s goal is to minimize the loss averaged over every training iteration (called the cumulative loss, defined below). For examples of classic/widely cited papers that use this description, see Crammer et al. (2006), Daniely et al. (2015), and Shalev-Shwartz (2012). This is opposed to offline scenarios, where mini-batches or batches of datapoints are presented each iteration, each datapoint is presented multiple times (i.e., multiple epochs of training), and the model’s goal is to minimize the loss on some hold-out/test data as much as possible by the end of training.

Formally, the online learning objective is the cumulative loss. Consider the scenario where a model is given a single datapoint x^t and prediction target y^t each iteration t. The model, parameterized by \theta^{t-1}, must try to predict y^t given x^t as input. After the prediction is outputted, feedback is provided in the form of a loss \mathcal{L}(y^t, \hat{y}^t, \theta^{t-1}), where \hat{y} is the prediction generated by \theta^{t-1} given x^t. In this case the cumulative loss is:

\mathcal{L}_{cumulative} = \frac{1}{T}\sum_{t=1}^T \mathcal{L}(y^t, \hat{y}^t , \theta^{t-1}).

The cumulative loss is the average loss produced by the model during training on each datapoint as it was received in the sequence. In order to achieve a good cumulative loss the model must not only eventually perform well but must improve its performance quickly, since the losses produced at early iterations are factored into the final cumulative loss. A similar quantity called the ‘regret’ is also sometimes used in online scenarios. The regret is just the model’s cumulative loss minus the cumulative loss achieved by the optimal fixed parameters (e.g., parameters pretrained offline).
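As a concrete illustration, here is a minimal sketch, assuming PyTorch is installed, of the online protocol just described: one datapoint per iteration, a single pass over the stream, the loss recorded before each update (so it reflects \theta^{t-1}), and the regret computed against a stand-in comparator model; all names and hyperparameters here are illustrative.

import torch
import torch.nn as nn

torch.manual_seed(0)
stream = [(torch.randn(10), torch.randn(1)) for _ in range(500)]  # toy data stream

online_model = nn.Linear(10, 1)
comparator = nn.Linear(10, 1)          # stand-in for "optimal"/pretrained fixed parameters
opt = torch.optim.SGD(online_model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

online_losses, comparator_losses = [], []
for x_t, y_t in stream:                                   # each datapoint is seen exactly once
    loss_t = loss_fn(online_model(x_t), y_t)              # loss under theta^{t-1}
    online_losses.append(loss_t.item())
    with torch.no_grad():
        comparator_losses.append(loss_fn(comparator(x_t), y_t).item())
    opt.zero_grad(); loss_t.backward(); opt.step()        # update only after recording the loss

T = len(stream)
cumulative_loss = sum(online_losses) / T
regret = cumulative_loss - sum(comparator_losses) / T     # learner's cumulative loss minus comparator's
print(f"cumulative loss: {cumulative_loss:.3f}, regret: {regret:.3f}")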

Compare this to offline learning, where the loss is averaged/summed over a hold-out/test dataset that is not used to train the model:

\mathcal{L}_{test} = \frac{1}{N}\sum_n^N \mathcal{L}(x_{test}^n, y_{test}^n, \theta^T),

where n refers to the datapoint in the test set and \theta^T are the parameters at iteration T. The test loss describes how well the model is doing right now, at the current training iteration T. Thus, minimizing the test loss only requires that the model perform well eventually, by the end of training. It does not factor in how the model performed at earlier training iterations. For the same reasons, the cumulative loss is distinct from the training loss, which describes how the current parameters \theta^T perform on the entire training dataset.

Some examples of online learners may include spam filters, surveillance devices, robots, autonomous vehicles, animal brains, and human brains. In all of these cases, the models are often assessed by how well they perform/learn while they are being deployed, and in all of these scenarios input data is generated one datapoint at a time, and no two datapoints are exactly the same (due to noise and the infinite possible varieties of inputs these models could receive from the real world).

In addition to the properties listed above, online scenarios often face other problems that are not standard in offline scenarios, like concept drift and imbalanced data. Concept drift refers to the event where the underlying process generating the data changes (e.g., a robot moves to a new environment) (for a review see Lu et al. (2018)). Imbalanced data refers to datasets where there are uneven numbers of training instances across classes/tasks (e.g., a robot in the desert may see 1,000 rocks for every one cactus). Dealing with these two issues is a common topic in the online learning literature (e.g., see Hayes et al. (2022)).
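For intuition, here is a tiny sketch, not taken from any of the cited papers, of a data stream that exhibits both properties: the input distribution shifts partway through the stream (concept drift), and one class is far rarer than the other (imbalanced data).

import numpy as np

rng = np.random.default_rng(0)

def stream(n_steps=2000, drift_at=1000, rare_class_prob=0.01):
    """Yield (x, label) pairs one at a time, with drift and class imbalance."""
    for t in range(n_steps):
        label = int(rng.random() < rare_class_prob)     # imbalanced: label 1 is rare
        mean = 0.0 if t < drift_at else 3.0             # concept drift: inputs shift at t = drift_at
        x = rng.normal(loc=mean + label, scale=1.0, size=10)
        yield x, label

for t, (x, label) in enumerate(stream()):
    if t in (0, 999, 1000):                             # peek at the stream around the drift point
        print(t, label, round(float(x.mean()), 2))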

The Mathematical Distinction between Online and Offline Learning Matters

I suspect some machine learning researchers assume that any learning algorithm that performs well in offline learning scenarios will also perform well in online learning scenarios. This assumption is not necessarily true, and the mathematical distinction between the two learning scenarios makes clear why. Here are two important distinctions:

1. The online learning objective places more emphasis on reducing the loss quickly than offline learning objectives. Minimizing cumulative loss requires that a learning algorithm not only update the model to perform well eventually, but also make it perform well early in training. Thus, cumulative loss is affected by the rate at which the learning algorithm reduces the loss. Test loss is not directly affected by the speed of the algorithm. The test loss only cares how well the model is performing at the end of training (or the iteration where it is measured), and usually one trains long enough for the model to converge. Thus, a slow algorithm that reduces test loss well in offline scenarios may not reduce cumulative loss well in online scenarios.

2. Convergence guarantees are less important in online learning scenarios than in offline learning scenarios. In offline scenarios, it is typically desired that a learning algorithm be guaranteed to converge to a minimum of the test loss after some number of training iterations (and epochs), since models can often be trained to convergence in offline scenarios. However, in online scenarios convergence is not always achievable, since the model only does a single pass through the data and in some cases there is not enough data for the model to converge. This is also the case in infinite/streaming data scenarios when concept drift occurs, and the model only trains for a finite period of time before the data distribution changes. In online scenarios, it may be desirable to sacrifice convergence guarantees for gains in training speed and low computational overhead. An example of an algorithm that is not guaranteed to converge but performs well in online scenarios is winner-take-all clustering (e.g., see Hayes et al. (2020, 2022) and Zhong (2005) for examples of online winner-take-all clustering, and Neal and Hinton (1998) for discussion of why winner-take-all clustering lacks the convergence guarantees of expectation maximization).

These differences imply that not all good solutions to the offline learning problem are good solutions to the online problem. We can, for instance, imagine an algorithm that reduces the loss slowly but consistently converges to very good local minima when trained with large mini-batches for many epochs. This algorithm would be great for offline learning, but it is non-ideal for online learning. Interestingly, the standard learning algorithm used in deep learning is backpropagation, which implements stochastic gradient descent (SGD). It is well known that SGD finds very good local minima in deep neural networks in offline training but trains slowly.

Understanding the Brain Requires Understanding Online Learning

Here’s an argument for why computational neuroscientists should care about online learning:

  1. Humans and animals learn online during most of their waking lives. It therefore seems safe to assume our brains evolved learning algorithms that deal well with the online scenario specifically.
  2. Solutions to the offline learning scenario will not always be the best solutions to the online scenario, for reasons noted above.
  3. Therefore, computational neuroscientists interested in understanding learning in the brain should focus on developing theories of biological learning that are based on good solutions to the online scenarios, without assuming that solutions to the offline scenario will work well too.

Again, backpropagation, which implements SGD, is not clearly the best solution to online learning in deep neural networks. It is currently, in practice, the best solution we have for offline learning, but again SGD is slow and therefore may be non-ideal for online scenarios. There is thus some reason to look to other optimization procedures and algorithms as possible explanations for what the brain is doing.

Online Learning as a Standard Approach in Neuromorphic Computing

Neuromorphic chips are brain-inspired hardware that are highly energy efficient. They typically implement spiking neural networks and have similar properties to the brain. These chips are ideal for robots and embedded sensors interacting with the real world under tight energy constraints (e.g., finite battery power). Notice these are the same scenarios where online learning would be especially useful, i.e., an embedded system interacting with the real world in real time. Thus, there is motivation from a neuromorphic computing point of view to think hard about the online learning problem. Much work on spiking networks, it seems to me, is focused on adapting backprop to spiking networks. However, for reasons discussed above, it is not obvious that backpropagation is the ideal/gold-standard solution for online learning here. It may therefore be useful to look to other optimization methods or to develop new ones specifically for online learning on neuromorphic chips.

References

Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., & Singer, Y. (2006). Online passive aggressive algorithms.

Daniely, A., Gonen, A., & Shalev-Shwartz, S. (2015, June). Strongly adaptive online learning. In International Conference on Machine Learning (pp. 1405-1411). PMLR.

Hadsell, R., Rao, D., Rusu, A. A., & Pascanu, R. (2020). Embracing change: Continual learning in deep neural networks. Trends in Cognitive Sciences, 24(12), 1028-1040.

Hayes, T. L., & Kanan, C. (2020). Lifelong machine learning with deep streaming linear discriminant analysis. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 220-221).

Hayes, T. L., & Kanan, C. (2022). Online Continual Learning for Embedded Devices. arXiv preprint arXiv:2203.10681.

Lu, J., Liu, A., Dong, F., Gu, F., Gama, J., & Zhang, G. (2018). Learning under concept drift: A review. IEEE Transactions on Knowledge and Data Engineering, 31(12), 2346-2363.

Neal, R. M., & Hinton, G. E. (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in graphical models (pp. 355-368). Springer, Dordrecht.

Shalev-Shwartz, S. (2012). Online learning and online convex optimization. Foundations and Trends® in Machine Learning, 4(2), 107-194.

Zhong, S. (2005, July). Efficient online spherical k-means clustering. In Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005. (Vol. 5, pp. 3180-3185). IEEE.

Predictive Coding: A Brief Introduction and Review for Machine Learning Researchers

By Nick Alonso

Introduction

Predictive coding (PC), a popular neural network model used in neuroscience, has recently caught the attention of the machine learning community. A flurry of recent work has shown that PC and its synaptic update rules are able to train deep neural networks competitively with backpropagation (BP) and have interesting formal connections to backpropagation, expectation maximization, and other optimization methods. These results, combined with the fact that PC uses local, biologically plausible learning rules, make PC of particular interest to the bio-inspired and neuromorphic computing communities. However, the recent literature on PC can be difficult to navigate for the machine learning researcher newly acquainted with PC, given the wide variety of names, terms, and variations that have emerged with this recent work. This brief review aims to 1) clarify the terminology around PC, 2) present a brief introduction to the PC approach to training deep networks, 3) review the recent theoretical work linking PC with various optimization methods and algorithms, and 4) discuss the pros and cons of PC compared to BP and future directions for its development.

This review will not discuss the neuroscience work behind PC. There already exist plenty of reviews on this subject. We provide a list of recommended classic papers and reviews below. For recent empirical work, we recommend Song et al. (2022) who discuss how the standard PC learning algorithm (called ‘inference learning’ below) differs from BP. They use this analysis to provide empirical support for the hypothesis that the brain learns in a way more similar to this standard PC algorithm than to BP.

Recommended Papers on Predictive Coding and the Free Energy Principle in Neuroscience

Rao, R. P., & Ballard, D. H. (1999). Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79-87.

Friston, K. (2010). The free-energy principle: a unified brain theory?. Nature Reviews Neuroscience, 11(2), 127-138.

Bastos, A. M., Usrey, W. M., Adams, R. A., Mangun, G. R., Fries, P., & Friston, K. J. (2012). Canonical microcircuits for predictive coding. Neuron, 76(4), 695-711.

Keller, G. B., & Mrsic-Flogel, T. D. (2018). Predictive processing: a canonical cortical computation. Neuron, 100(2), 424-435.

Song, Y., Millidge, B. G., Salvatori, T., Lukasiewicz, T., Xu, Z., & Bogacz, R. (2022). Inferring Neural Activity Before Plasticity: A Foundation for Learning Beyond Backpropagation. bioRxiv.

Terminology

The term ‘predictive coding’ is sometimes used ambiguously in the machine learning literature. In what follows, it will be important to keep our definitions clear, so we first make a distinction between three components of a neural network: network architecture, learning algorithm, and optimization method. These concepts may seem straightforward to the machine learning researcher, but they are sometimes conflated in the PC literature.

  • Network Architecture: The set of equations defining how neuron states are computed and how neurons interact with each other via synaptic weights.
  • Optimization Method: A general technique for updating model parameters to minimize some loss function, e.g., stochastic gradient descent (SGD). Usually this technique describes in a simple way (one equation) how the loss relates to parameter updates.
  • Learning Algorithm: A step by step procedure for computing parameter updates, which ideally implements an optimization method with convergence guarantees (e.g., BP is an algorithm that implements SGD).

PC is sometimes referred to as an algorithm (e.g., ‘the predictive coding algorithm’) and other times referred to as an architecture (e.g., ‘predictive coding circuits’). Here we refer to PC as a kind of recurrent neural network architecture. Details of the PC architecture are described in the next section. Defining PC as an architecture allows us to distinguish between different learning algorithms that all utilize PC, and stays truer to the original neuroscience description of PC as a kind of circuit.

What Predictive Coding is and How it Works

Here we describe how PC works in standard multi-layered perceptron (MLP) architectures trained on supervised learning tasks, though it should be noted PC can also be used for self-supervised learning, as described below in the section on associative memory and self-supervised learning.

Consider a standard MLP architecture with neurons at layer l represented as a column vector h_l and weights W_l that propagate signals from layer l to l+1. Neuron feedforward (FF) activities are computed as h_{l+1} = f(W_l h_l), where f is a non-linearity. Note, when we use h without a subscript, we refer to all neuron activities at all layers, and when we use \theta we are referring to all parameters in the model. At each training iteration t, input data x^t and an output prediction target y^t are given to the network. (One can also think of these as mini-batches of datapoints and targets; it does not change the algorithm described below.) The network is tasked with minimizing some measure of loss, \mathcal{L}(y^t, h_L^t), where h_L^t are the output layer FF activities.

There are several learning algorithms that use PC which are able to minimize the loss, \mathcal{L}(y^t, h_L^t), produced by the feedforward pass. Here we describe the standard learning algorithm that uses PC, sometimes referred to as inference learning (IL). IL can be described as a kind of energy-based algorithm that proceeds in two steps, both of which minimize a quantity known as free energy (F). We first describe the steps, then we describe the energy:

Inference Learning Algorithm (Informal)

  1. (Inference Phase) Minimize/Reduce F w.r.t. neuron activities, while holding the weights fixed.
  2. (Weight Update) Minimize/Reduce F w.r.t. synaptic weights, while holding the neuron activities fixed.

We write ‘minimize/reduce’ because in practice F is typically not fully minimized in either step. The first step is sometimes called the inference phase, since it can be interpreted as performing approximate Bayesian inference (see the section on expectation maximization for a brief discussion of this interpretation; for details, see Bogacz (2017)).

What is free energy? Since we are altering the feedforward activities in step 1, we need new notation to refer to the altered/optimized activities. We will refer to the altered/optimized activities computed in step 1 as \hat{h}. One description of free energy, used in Alonso (2022), applied to standard MLP architectures, is

F(\hat{h}, \theta) = \mathcal{L}(y^t, \hat{h}_{L}) + \sum_l^{L-1} \frac12 \Vert \hat{h}_{l+1} - f(W_l \hat{h}_l) \Vert^2 + \gamma^{decay} \sum_l^{L} \frac12 \Vert \hat{h}_l \Vert^2.

In words, free energy, F, is a positive scalar that depends on the loss between target activities y^t and \hat{h}_L, the difference between \hat{h}_{l+1} and the previous layer’s output f( W_l \hat{h}_l), and the magnitude of the activities. (Note it is also possible to use W_l f(\hat{h}_l) instead of f(W_l  \hat{h}_l)).

It is common in practice to ‘fully clamp’ output layer activities such that \hat{h}_L = y^t and \mathcal{L}(y^t, \hat{h}_L^t) = 0 and to ignore the decay term (set \gamma^{decay}=0). Further, it is standard to describe the outputs f(W_l \hat{h}_l) as predictions of \hat{h}_{l+1}. Let’s therefore define the prediction at layer l+1 as p_{l+1} = f(W_l \hat{h}_l). Under these conditions, F simplifies to

F(\hat{h}, \theta) = \sum_{l=1}^{L} \frac12 \Vert \hat{h}_{l} - p_{l} \Vert^2 = \sum_{l=1}^{L} \frac12 \Vert e_{l} \Vert^2

In this case, F is just the sum over layers of the squared prediction errors, e_l = \hat{h}_l - p_l, each scaled by one half. For a detailed probabilistic interpretation of F see Bogacz (2017).
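As a quick illustration, here is a minimal numpy sketch that computes this simplified free energy for a small MLP with the output layer fully clamped to the target; the layer sizes, random weights, and non-linearity are placeholder choices.

import numpy as np

rng = np.random.default_rng(0)
f = np.tanh                                           # the non-linearity

sizes = [4, 8, 8, 3]
W = [rng.normal(scale=0.5, size=(sizes[l + 1], sizes[l])) for l in range(len(sizes) - 1)]
x, y = rng.normal(size=4), rng.normal(size=3)

# Initialize the optimized activities to the feedforward activities, then clamp the output layer.
h_hat = [x]
for W_l in W:
    h_hat.append(f(W_l @ h_hat[-1]))
h_hat[-1] = y                                         # fully clamp: h_hat_L = y^t

def free_energy(h_hat, W):
    F = 0.0
    for l in range(1, len(h_hat)):
        p_l = f(W[l - 1] @ h_hat[l - 1])              # prediction of layer l from layer l-1
        e_l = h_hat[l] - p_l                          # prediction error at layer l
        F += 0.5 * float(e_l @ e_l)                   # (1/2)||e_l||^2
    return F

print("free energy:", free_energy(h_hat, W))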

A formal description of IL can now be expressed as follows:

Inference Learning Algorithm (Formal)

  1. \hat{h}^t \approx \text{argmin}_{\hat{h}} F(\hat{h}, \theta^t)
  2. \theta^{t+1} \approx \text{argmin}_{\theta} F(\hat{h}^t, \theta)

Where does PC fit into this algorithm?

  • PC refers to the process of iteratively updating neuron activities to reduce local squared prediction errors.

By ‘local’, we mean that neurons at layer l are only updated to reduce the squared prediction errors at layers l and l+1. Typically, though not always, these updates are gradients of the local errors w.r.t. neuron activities and can be interpreted as recurrent neural network computations. Again, the term ‘predictive coding’ is used a bit loosely in the literature, but we think this captures what neuroscientists and much of the machine learning community working on these algorithms are referring to when they use the term ‘predictive coding’. Formally, here is how PC updates neuron activities using gradients of the local errors:

\hat{h}_l = \hat{h}_l - \gamma \frac{\partial F}{\partial \hat{h}_l} = \hat{h}_l + \gamma (W_l^{\top}f'(W_l\hat{h}_l) e_{l+1} - e_{l}),

where \gamma is the step size and f' is the derivative of the non-linearity. We emphasize that these are only partial gradients, since these updates only use gradients of errors at layers l and l+1, and ignore error gradients from other layers. We can see that this looks like a sort of recurrent neural network computation where predictions are propagated in one direction, then prediction errors are propagated back to the prediction-generating neurons using the weight transpose. (This transpose may also be replaced with a separate matrix that learns to approximate the transpose).

In practice, each training iteration, neuron activities are updated with multiple gradient updates (typically about 10-25) so F is sufficiently reduced. Afterward, weights are usually updated with a partial gradient step over weights:

W_l^{t+1} = W_l^{t} - \alpha \frac{\partial F}{\partial W_l} = W_l^{t} + \alpha \left( e_{l+1} \odot f'(W_l\hat{h}_l) \right)\hat{h}_l^{\top},

where \alpha is the step size. Again, these updates use only local error information: W_l is updated with the gradient of the squared error at layer l+1. When mini-batches are used, these updates are summed or averaged over the mini-batch. If we assume there exist error neurons encoding e_{l+1}, this update is a Hebbian-like learning rule: the outer product of presynaptic activities \hat{h}_l and post-synaptic error neuron activities e_{l+1}, with local modulation f'(W_l\hat{h}_l).

One last thing to note: when using gradient descent to update neuron activities, it seems necessary in practice to initialize the activities to the FF activities, i.e., to initialize \hat{h} to h. Taking this into account, we can now describe how PC is used to train MLPs (a minimal code sketch of this procedure is given after the steps below):

Inference Learning via Predictive Coding (IL-PC)

  1. Initialize activities to feedforward activities, \hat{h} = h
  2. Reduce F w.r.t. \hat{h} using gradient descent/PC, while holding fixed \theta.
  3. Reduce F w.r.t. \theta using a gradient step, while holding fixed \hat{h}.
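
Below is a minimal NumPy sketch of one IL-PC training iteration in the fully clamped, no-decay setting, combining the activity and weight updates described above. It is meant only as an illustration of the procedure, not as the implementation used in any of the cited papers; the function name il_pc_step, the choice of tanh, and the default hyperparameters are our own assumptions:

```python
import numpy as np

def f(x):  return np.tanh(x)
def df(x): return 1.0 - np.tanh(x) ** 2   # derivative of the nonlinearity

def il_pc_step(x, y, W, gamma=0.1, alpha=0.01, T=20):
    """One IL-PC training iteration (fully clamped output, no decay term).

    x, y  : input and target vectors
    W     : list of weight matrices [W_0, ..., W_{L-1}] (modified in place)
    gamma : step size for activity (inference) updates
    alpha : step size for weight updates
    T     : number of inference iterations
    """
    L = len(W)

    # 1. Initialize activities to the feedforward pass, then clamp the output layer.
    h_hat = [x]
    for l in range(L):
        h_hat.append(f(W[l] @ h_hat[l]))
    h_hat[L] = y.copy()

    # 2. Inference phase: update hidden activities with local prediction-error gradients.
    for _ in range(T):
        errs = [h_hat[l + 1] - f(W[l] @ h_hat[l]) for l in range(L)]  # errs[l] = e_{l+1}
        for l in range(1, L):  # input (l=0) and clamped output (l=L) stay fixed
            fb = W[l].T @ (df(W[l] @ h_hat[l]) * errs[l])  # error fed back from layer l+1
            h_hat[l] += gamma * (fb - errs[l - 1])

    # 3. Weight update: Hebbian-like outer product of errors and presynaptic activities.
    errs = [h_hat[l + 1] - f(W[l] @ h_hat[l]) for l in range(L)]
    for l in range(L):
        W[l] += alpha * np.outer(errs[l] * df(W[l] @ h_hat[l]), h_hat[l])
    return W
```

On a toy problem one would initialize W as a list of small random matrices and call il_pc_step once per example (or sum/average the weight updates over a mini-batch, as noted above).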

Those familiar with backpropagation (BP) will notice that IL differs from BP in some ways. For example, unlike BP, IL does not store and use the feedforward activities, h, to compute weight updates. This raises a question: why is IL-PC able to minimize the loss, which is computed using feedforward activities, \mathcal{L}(y^t, h_L^t), given that it does not use these activities to compute weight updates? One can get an intuition for how this works from the figure below. We discuss this and related questions in more detail in the following sections.

Another Recommended Tutorial

Bogacz, R. (2017). A tutorial on the free-energy framework for modelling perception and learning. Journal of Mathematical Psychology, 76, 198-211.

Predictive Coding, Backprop, and Stochastic Gradient Descent

Whittington and Bogacz (2017) were the first to show that IL-PC could be used to train MLPs on classification tasks, which raises the question of why IL-PC is able to minimize a loss. Whittington and Bogacz (2017) provided some useful insight into this question by showing that IL-PC updates become increasingly similar to BP updates in the limit where altered/optimized activities approach FF activities, i.e., \hat{h} \rightarrow h. This suggests IL-PC is able to reduce the loss because it is approximating BP and thus stochastic gradient descent (SGD). However, with a fully clamped output layer, as used by Whittington and Bogacz, this limit is only approximated late in training when the loss is small and activities only need to be altered slightly to reduce local errors. Early in training, however, when the loss is large, this approximation is worse. Yet IL-PC reduces the loss in a stable manner early in training, which raises the question of whether there is a better description of how IL-PC is reducing the loss. Indeed, recent theoretical works have emphasized the differences between IL-PC and BP/SGD, which we discuss more below (e.g., see Song et al. (2022), Millidge et al. (2022), and Alonso et al. (2022)).

A related line of research has looked to alter the IL-PC algorithm to better approximate BP. Song et al. (2020) presented the Z-IL algorithm, which is equivalent to BP. This equivalence is achieved by updating weights at a specific step during the inference phase using a specific step size. Z-IL was later shown by Salvatori et al. (2021) to also yield this equivalence in convolutional networks. Millidge et al. (2020) developed an algorithm called activation relaxation, which very closely approximates BP; this algorithm stores the initial FF activities at hidden layers so they can be used during weight updates. Millidge et al. (2022) later showed that softly clamping the output layer, so activities are only slightly nudged from their initial values, yields a better approximation of BP than fully clamping. It should be emphasized that all of these algorithms use the same or similar PC procedure for updating neuron activities (the same inference phase); the only differences are in the rules for updating the weights. An interesting property of these algorithms is that they provide a way of performing SGD using local learning rules. However, these variants do not clearly improve performance over standard IL-PC, and some of the alterations may be seen as biologically implausible.

References and Other Recommended Papers on Relation between PC and BP

Whittington, J. C., & Bogacz, R. (2017). An approximation of the error backpropagation algorithm in a predictive coding network with local Hebbian synaptic plasticity. Neural Computation, 29(5), 1229-1262.

Whittington, J. C., & Bogacz, R. (2019). Theories of error back-propagation in the brain. Trends in Cognitive Sciences, 23(3), 235-250.

Song, Y., Lukasiewicz, T., Xu, Z., & Bogacz, R. (2020). Can the Brain Do Backpropagation?—Exact Implementation of Backpropagation in Predictive Coding Networks. Advances in Neural Information Processing Systems, 33, 22566-22579.

Millidge, B., Tschantz, A., Seth, A. K., & Buckley, C. L. (2020). Activation relaxation: A local dynamical approximation to backpropagation in the brain. arXiv preprint arXiv:2009.05359.

Salvatori, T., Song, Y., Lukasiewicz, T., Bogacz, R., & Xu, Z. (2021). Predictive coding can do exact backpropagation on convolutional and recurrent neural networks. arXiv preprint arXiv:2103.03725.

Millidge, B., Song, Y., Salvatori, T., Lukasiewicz, T., & Bogacz, R. (2022). Backpropagation at the Infinitesimal Inference Limit of Energy-Based Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive Hebbian Learning. arXiv preprint arXiv:2206.02629.

Millidge, B., Salvatori, T., Song, Y., Bogacz, R., & Lukasiewicz, T. (2022). Predictive Coding: Towards a Future of Deep Learning beyond Backpropagation?. arXiv preprint arXiv:2202.09467.

Alonso, N., Millidge, B., Krichmar, J., & Neftci, E. (2022). A Theoretical Framework for Inference Learning. arXiv preprint arXiv:2206.00164.

Millidge, B., Song, Y., Salvatori, T., Lukasiewicz, T., & Bogacz, R. (2022). A Theoretical Framework for Inference and Learning in Predictive Coding Networks. arXiv preprint arXiv:2207.12316.

Song, Y., Millidge, B. G., Salvatori, T., Lukasiewicz, T., Xu, Z., & Bogacz, R. (2022). Inferring Neural Activity Before Plasticity: A Foundation for Learning Beyond Backpropagation. bioRxiv.

Predictive Coding and Expectation Maximization

Those familiar with expectation maximization (EM) may have noticed that EM looks quite similar to IL. EM is a standard algorithm used to train probabilistic generative models, which are composed of parameters \theta and hidden variables h. EM proceeds in two steps: first, the ‘E-step’ is performed, which computes the posterior P(h|x, \theta) given the parameters and the data x. Then the ‘M-step’ is performed, which updates \theta, holding the posterior over h fixed, to maximize the (expected) joint probability P(h, x| \theta). This procedure is equivalent to another two-step algorithm, where one first minimizes a quantity known as variational free energy w.r.t. h while holding \theta fixed, then minimizes the same quantity w.r.t. \theta while holding h fixed (see Neal and Hinton, 1998).
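
To make the parallel with IL explicit, here is the free-energy form of EM (a standard formulation along the lines of Neal and Hinton (1998), written here in this post's notation rather than quoted from the cited papers, with q denoting the approximate posterior over the hidden variables):

\text{E-step:} \quad q^{t+1} = \text{argmin}_{q} F(q, \theta^{t})

\text{M-step:} \quad \theta^{t+1} = \text{argmin}_{\theta} F(q^{t+1}, \theta)

\text{where} \quad F(q, \theta) = -\mathbb{E}_{q(h)}\left[\log P(h, x \mid \theta)\right] - H\left[q(h)\right].

The two steps mirror IL's inference and weight-update phases, with q playing the role of the optimized activities \hat{h}.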

The energy-based version of EM sounds just like IL-PC. This is interesting given that IL-PC was developed by Rao and Ballard (1999) in a neuroscience context; they did not derive IL-PC from EM and made no mention of EM in their original paper. The similarities between IL and EM have been pointed out before (e.g., see Millidge et al. (2021) and Marino (2022)), but it was only recently that Millidge et al. (2022) showed under what assumptions IL is equivalent to EM. Activities are assumed to represent the means of Delta distributions, which act as the sufficient statistics of the approximate posterior distribution. The posterior means are computed in the E-step using variational inference, which approximates the posterior by reducing free energy with some optimization method (for tutorials on variational inference see Blei et al. (2017) and Bogacz (2017)). Synapses encode the parameters of the generative model. This interpretation offers an alternative to the BP interpretation of why IL-PC is able to minimize a loss, since EM is guaranteed to converge to a local maximum (or saddle point) of the log-likelihood of the data (Dempster et al. (1977), Neal and Hinton (1998)).
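
As a rough sketch of why these assumptions recover the free energy used above (this follows the general logic of Bogacz (2017) and Millidge et al. (2022) rather than reproducing their derivations; constants, the prior over the clamped input layer, and the formally ill-defined entropy of a Delta distribution are all absorbed into ‘const’): if q places all of its mass on \hat{h} and each layer is modeled as a unit-variance Gaussian whose mean is the prediction from the previous layer, then

F(q, \theta) = -\log P(\hat{h}, x \mid \theta) + \text{const} = \sum_{l=0}^{L-1} \frac12 \Vert \hat{h}_{l+1} - f(W_l \hat{h}_l) \Vert^2 + \text{const},

which is the sum of squared prediction errors that IL-PC minimizes.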

References and Other Recommended Papers on PC and EM

Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1-22.

Neal, R. M., & Hinton, G. E. (1998). A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in graphical models (pp. 355-368). Springer, Dordrecht.

Blei, D. M., Kucukelbir, A., & McAuliffe, J. D. (2017). Variational inference: A review for statisticians. Journal of the American Statistical Association, 112(518), 859-877.

Bogacz, R. (2017). A tutorial on the free-energy framework for modelling perception and learning. Journal of Mathematical Psychology, 76, 198-211.

Millidge, B., Seth, A., & Buckley, C. L. (2021). Predictive coding: a theoretical and experimental review. arXiv preprint arXiv:2107.12979.

Marino, J. (2022). Predictive coding, variational autoencoders, and biological connections. Neural Computation, 34(1), 1-44.

Millidge, B., Song, Y., Salvatori, T., Lukasiewicz, T., & Bogacz, R. (2022). A Theoretical Framework for Inference and Learning in Predictive Coding Networks. arXiv preprint arXiv:2207.12316.

Salvatori, T., Song, Y., Millidge, B., Xu, Z., Sha, L., Emde, C., … & Lukasiewicz, T. (2022). Incremental Predictive Coding: A Parallel and Fully Automatic Learning Algorithm. arXiv preprint arXiv:2212.00720.

Predictive Coding and Implicit Gradient Descent

The EM interpretation provides an algorithmic description of IL-PC that is distinct from the BP interpretation. However, the EM interpretation alone does not provide a concise/clear description of how the parameters move through parameter space. For example, is the variant of EM that IL-PC implements just another way of implementing/approximating SGD? Or should we interpret the algorithm as implementing some other optimization method, i.e., is it using some other strategy to move parameters through parameter space to find local minima?

Some progress on these questions is made by Alonso et al. (2022), who showed that the IL algorithm used in standard MLP architectures closely approximates an optimization method known as implicit stochastic gradient descent (implicit SGD). Importantly, implicit SGD is distinct from the standard SGD that BP implements. We call standard SGD explicit SGD, for reasons we now explain. Here is a general mathematical description of explicit and implicit SGD:

Explicit SGD: \theta^{(t+1)} = \theta^{(t)} - \alpha \frac{\partial \mathcal{L}(\theta^{(t)})}{\partial \theta^{(t)}}

Implicit SGD: \theta^{(t+1)} = \theta^{(t)} - \alpha \frac{\partial \mathcal{L}(\theta^{(t+1)})}{\partial \theta^{(t+1)}} = \text{argmin}_{\theta} \left[ \mathcal{L}(\theta) + \frac{1}{2\alpha}\Vert \theta - \theta^{(t)} \Vert^2 \right]

Explicit SGD takes a gradient step over the parameters with some step size \alpha. This gradient can be explicitly computed from the values known at the current training iteration, t; hence the name explicit gradient descent. Implicit SGD, on the other hand, takes a gradient step on \theta^{(t)}, where the gradient is computed using the parameters at the next training iteration, t+1. Generally, this gradient cannot be readily computed from the values known at the current iteration t (e.g., \theta^{(t+1)} is not known). However, it turns out the implicit SGD update is equivalent to the output of an optimization process known as the proximal operator, shown on the right hand side of the implicit SGD equation. The proximal operator sets the new parameters equal to the parameters that both minimize the loss, \mathcal{L}, and the norm of the change in parameters: it finds the parameters that best minimize the loss while remaining in the proximity of the current parameters. Therefore, the parameters updated with implicit SGD can be computed implicitly in terms of this optimization process/proximal operator.
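
To make the difference concrete, here is a toy NumPy comparison on a one-dimensional quadratic loss, with the implicit/proximal update solved by an inner optimization loop (loosely analogous to how IL's inference phase approximates it). The specific loss, step sizes, and iteration counts are arbitrary illustrative choices:

```python
import numpy as np

def loss(theta):
    return 0.5 * 10.0 * theta ** 2      # steep 1-D quadratic: L(theta) = 5 * theta^2

def grad(theta):
    return 10.0 * theta

theta_exp = theta_imp = 1.0
alpha = 0.3                              # deliberately large step size

for t in range(5):
    # Explicit SGD: gradient evaluated at the current parameters.
    theta_exp = theta_exp - alpha * grad(theta_exp)

    # Implicit SGD: solve the proximal problem
    #   argmin_theta  L(theta) + (1 / (2 * alpha)) * (theta - theta_imp)^2
    # here by an inner gradient descent (a closed form exists for quadratics,
    # but the inner loop mirrors an iterative inference phase).
    z = theta_imp
    for _ in range(100):
        z = z - 0.01 * (grad(z) + (z - theta_imp) / alpha)
    theta_imp = z

print(theta_exp, theta_imp)  # explicit update oscillates and diverges; implicit decays smoothly
```

With this curvature, explicit SGD is only stable for \alpha < 0.2, so the explicit update overshoots and grows; the proximal update instead shrinks the parameter by a factor of roughly 1/(1 + 10\alpha) per step, illustrating the stability difference discussed below.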

How does IL approximate implicit SGD? Here is the rough intuition. Minimizing F w.r.t. the activities means clamping the output layer, reducing \Vert e \Vert^2, and reducing \Vert \hat{h}\Vert^2 (via the decay term). First, fully or softly clamping the output layer activities and then updating the weights to reduce the local errors yields a new input/output mapping that reduces the loss for the same input, as shown intuitively in the figure above. Second, reducing \Vert e \Vert^2 and \Vert \hat{h}\Vert^2 reduces the magnitude of the weight updates, since the weight updates are the outer product of e and \hat{h}. Thus, minimizing F w.r.t. neuron activities can be seen as finding target activities that both reduce \mathcal{L}(y^t, h_L) and reduce \Vert \theta - \theta^t \Vert^2, just like implicit SGD. Importantly, there are differences between explicit and implicit SGD. For example, they typically do not move parameters through parameter space along the same path, and implicit SGD is far less sensitive to the learning rate than explicit SGD. For details see the paper and/or summary referenced below.

Reference

Alonso, N., Millidge, B., Krichmar, J., & Neftci, E. (2022). A Theoretical Framework for Inference Learning. arXiv preprint arXiv:2206.00164.

Alonso, N. (2022) Summary: ‘A Theoretical Framework for Inference Learning’. https://neuralnetnick.com/2022/12/19/summary-a-theoretical-framework-for-inference-learning

Predictive Coding, Self-supervised Learning, and Auto-associative Memory

So far we have considered standard MLP architectures that attempt to map an input x^{t} to a target output y^t. However, PC was originally used to model self-supervised learning in the brain. In these tasks, the model is provided with some datapoint x^{t} and, similar to a hierarchical autoencoder, must compute hidden representations h_0, h_1,...,h_L that generate predictions/reconstructions of the data and of other hidden representations. In probabilistic terms, we want to learn a generative model P(x, h) that we can sample from and use to perform inference. Unlike more standard architectures for this task, such as variational autoencoders, PC networks do not use an encoder that directly maps x \rightarrow h. Instead, PC networks compute h using the recurrent processing (i.e., gradient descent) described above, which minimizes local prediction errors throughout the network. Recent work has shown that PC models are able to learn effective generative models that perform inference and produce samples of similar, and sometimes slightly better, quality than variational autoencoders trained with BP on natural images (e.g., Ororbia & Kifer, 2022; Ororbia & Mali, 2022).

Another sort of self-supervised task is the auto-associative memory task. For this task, a model is trained in a self-supervised manner to input, represent, and reconstruct input data. At test time, however, corrupted versions of the training data, \widetilde{x}, are presented, where noise is added to the data or certain elements are removed/set to 0. The model must return the uncorrupted version of the data (i.e., denoise or fill in the missing elements). That is, we want the model to learn the mapping \widetilde{x}^t \rightarrow h \rightarrow x^t. Salvatori et al. (2021) recently showed that PC networks trained with IL can perform auto-associative tasks and outperform other standard auto-associative models, like modern Hopfield networks, and that PC networks are able to do so on high-dimensional image vectors.

References

Ororbia, A., & Kifer, D. (2022). The neural coding framework for learning generative models. Nature Communications, 13(1), 1-14.

Ororbia, A., & Mali, A. (2022). Convolutional Neural Generative Coding: Scaling Predictive Coding to Natural Images. arXiv preprint arXiv:2211.12047.

Salvatori, T., Song, Y., Hong, Y., Sha, L., Frieder, S., Xu, Z., … & Lukasiewicz, T. (2021). Associative memories via predictive coding. Advances in Neural Information Processing Systems, 34, 3874-3886.

Pros and Cons of IL-PC and Future Directions

Why should a machine learning practitioner or researcher care about IL-PC and its variants? Here we list some advantages and disadvantages of IL-PC compared to BP and possible future research directions.

Advantages:

  1. Unlike BP, weight updates performed by IL-PC are local in space and time: IL-PC uses update rules that are the product of presynaptic activities and post-synaptic error neuron activities. BP updates are not spatially local in this way, since error signals are propagated directly from the output layer. IL-PC updates are local in time since they use whatever activities and local errors are available at the time of the update, whereas BP must store FF activities from the forward pass for use in the backward pass. The locality of its updates suggests IL-PC may be more compatible with neuromorphic hardware than BP. However, it is not obvious how to convert the IL-PC algorithm described here to spiking networks, which are often required on neuromorphic hardware. For example, do we need to encode errors in spiking neurons? If so, how? How should gradient updates on activities work, given that spikes are not differentiable? These questions are not necessarily problems, but they will require engineering work to solve.
  2. Work by Song et al. (2022), Alonso et al. (2022), and Salvatori et al. (2021) suggests possible performance advantages of IL-PC over BP. For example, improved performance, either in training speed or in the loss at convergence, was observed under data constraints, small mini-batches, concept drift, and auto-association with large images. However, the extent of these advantages needs to be further explored and better established.

Disadvantages:

  1. The inference phase of the IL-PC algorithm is computationally expensive, since activities must be updated multiple times each training iteration. Future work could look to reduce the computational cost of this phase by altering the algorithm in various ways. The computational cost could possibly also be reduced with highly correlated data (e.g., video), since the network activities may not need to be reset and recomputed each iteration. This would need to be tested, however.
  2. IL-PC is trickier to implement than BP: there are standard libraries for automatically computing the gradients BP needs, while IL-PC must be implemented by hand. Future work could look to provide efficient implementations of IL-PC in an open source library that can perform IL-PC ‘under the hood’ for arbitrary architectures.

Finally, we also believe there is much potential to further explore IL-PC for self-supervised tasks. IL-PC, which as noted above is a special case of EM, is most useful for models where probabilistic inference is part of the task being performed, since an inference step (or E-step) is ‘built into’ the algorithm. When this inference phase is not used to perform the task, it adds significantly to the computational cost while only providing learning signals. The classification tasks considered above, for example, do not use the inference phase at test time to predict labels; it is only used during training to provide learning signals. Salvatori et al.’s and Ororbia et al.’s work on auto-associative memory and self-supervised learning are good examples of self-supervised tasks where the inference phase of IL-PC actually performs the task (i.e., the inference phase does the reconstruction, denoising, and filling-in of the input image). Applying IL-PC to self-supervised learning tasks also uses the algorithm for what it was originally designed for: brain-like self-supervised learning.

Summary: ‘A Theoretical Framework for Inference Learning’

I am excited to announce that the paper ‘A Theoretical Framework for Inference Learning’ was recently accepted to the NeurIPS 2022 conference. It was authored by me (Nick Alonso), Beren Millidge, Jeff Krichmar, and Emre Neftci. The paper develops a theoretical/mathematical foundation for the inference learning algorithm, the standard algorithm used to train predictive coding (PC) models of biological neural circuits. This theoretical foundation could serve as a basis for a theory of how the brain solves the credit assignment problem (explained below) and for further developing the algorithm for engineering applications. Here, I provide a brief, intuitive summary of the mathematical/theoretical results, related simulation results, and their potential significance for neuroscience and engineering. A draft of the paper can be found on arXiv (here).

Introduction

How can a neural network, with multiple hidden layers, be trained to minimize some global loss function (where the loss is a measure of network performance)? This question is, roughly, what some refer to as the credit assignment problem for deep neural networks, as it involves determining the amount and direction (positive or negative) in which network parameters affect the loss (i.e., ‘assigning credit’ to the parameters for how they affect the loss).

At least two things are needed for a satisfying solution to the credit assignment problem. First is what I will call an optimization method: a general strategy for updating model parameters (synaptic weights, in the case of neural networks) that has guarantees of minimizing the loss function and converging to a local minimum of the loss. The optimization method also typically has a simple (one-equation) description of how the loss relates to parameter updates, and in so doing provides an assignment of credit. Second is an algorithmic implementation of the optimization method: a set of equations and a step-by-step procedure (i.e., a program) used to compute the parameter updates.

Within the field of deep learning, the standard solution uses stochastic gradient descent (SGD) as its optimization method and the error backpropagation algorithm (BP) as its algorithmic implementation. The BP-SGD strategy works amazingly well in practice; however, there are two issues facing BP-SGD. The first is a problem for neuroscience, the second for neuromorphic engineering. 1) It is generally agreed that the brain is likely not doing BP. More biologically plausible algorithmic implementations of SGD have been developed, but they still have some properties that are difficult to reconcile with neurobiology (discussed more below). 2) BP is difficult to implement on energy efficient, neuromorphic hardware, for similar reasons to why it is biologically implausible. These issues motivate the search for more bio-compatible learning algorithms.

The inference learning algorithm (IL), which is used to train popular predictive coding (PC) models of the brain, is one promising candidate. In 2017, James Whittington and Rafael Bogacz (see here) developed two important results concerning predictive coding networks and IL. First, they showed that PC models, which are a kind of recurrent neural network, and the IL algorithm can be used to train deep neural networks; subsequent work has further established this on other datasets and tasks. Second, they showed that IL parameter updates approach those produced by BP in a certain limit.

However, the limit where IL approaches BP is never fully realized in practice and is poorly approximated early in training. This essentially implies that IL never equals BP in practice, which leaves us with some questions: Is IL just a rough/approximate implementation of SGD? Or might IL be implementing some other optimization strategy? And, since IL is not doing BP exactly, in what ways does it differ from BP?

Main Result: Part 1

These were the questions explored in our paper ‘A Theoretical Framework for Inference Learning’. Our main result is that, under certain assumptions that are well approximated in practice, IL implements an optimization strategy known as implicit stochastic gradient descent (implicit SGD). Importantly, implicit SGD is distinct from the standard SGD that BP implements. We call standard SGD explicit SGD, for reasons we now explain. Here is a general mathematical description of explicit and implicit SGD:

Explicit SGD: \theta^{(t+1)} = \theta^{(t)} - \alpha \frac{\partial \mathcal{L}(\theta^{(t)})}{\partial \theta^{(t)}}

Implicit SGD: \theta^{(t+1)} = \theta^{(t)} - \alpha \frac{\partial \mathcal{L}(\theta^{(t+1)})}{\partial \theta^{(t+1)}} = \text{argmin}_{\theta} \left[ \mathcal{L}(\theta) + \frac{1}{2\alpha}\Vert \theta - \theta^{(t)} \Vert^2 \right]

Explicit SGD, which is implemented by BP, takes a gradient step over the parameters with some step size \alpha. This gradient can be readily and explicitly computed given the known values at the current training iteration, t. Hence, it is called explicit gradient descent. Implicit SGD, on the other hand, takes the gradient of the loss using the parameters at the next training iteration, t+1, and uses this gradient to update the parameters at the current training iteration, t. Generally, this gradient cannot be readily computed given known values at the current iteration t (e.g., the parameters at the next iteration are not known). However, it turns out the implicit SGD update is equivalent to the output of an optimization process known as the proximal operator, shown on the right hand side of the implicit SGD equation. The proximal operator sets the new parameters equal to the parameters that both minimize the loss, \mathcal{L}, and the norm of the change in parameters, i.e., it finds the parameters that best minimize the loss while remaining in the proximity of the current parameters. Hence, the implicit SGD update can be computed implicitly in terms of this optimization process/proximal operator.

(Note: those familiar with the explicit and implicit Euler methods from differential equations will notice that explicit and implicit gradient descent are analogous to these Euler methods.)
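
To spell the analogy out (this is standard numerical-analysis material rather than a result from the paper): both updates can be viewed as discretizations of the gradient-flow differential equation

\frac{d\theta}{ds} = -\frac{\partial \mathcal{L}(\theta)}{\partial \theta},

where the explicit (forward) Euler method with step size \alpha evaluates the right hand side at the current point \theta^{(t)}, giving the explicit SGD update above, while the implicit (backward) Euler method evaluates it at the new point \theta^{(t+1)}, giving the implicit SGD update.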

Main Results: Part 2

In what ways do explicit and implicit gradient descent (implemented via BP and IL respectively) behave differently? We studied this question in the paper through both theoretical results and simulations. Here’s a summary of the results:

Explicit SGD via BP

  1. Unstable/sensitive to learning rate.
  2. Follows steepest descent path of loss
  3. Slow Convergence
  4. Small Effect on Input/Output Mapping Relative to Update Magnitude

Implicit SGD via IL

  1. Stable\less sensitive to learning rate.
  2. Follows (approx.) minimum norm/shortest path toward minima
  3. Faster Convergence/Reduction of Loss
  4. Greater Effect on Input/Output Mapping Relative to Update Magnitude

First, IL is less sensitive to the learning rate than BP. This is consistent with our theoretical interpretation, since it is well known that explicit SGD is highly sensitive to the learning rate, while implicit SGD is highly insensitive to it. This insensitivity is due to the way the proximal operator uses the learning rate only to weight the two terms in the proximal objective function, rather than to scale the update, as explicit SGD does. Implicit SGD, in fact, has the property of unconditional stability, which roughly means that it will reduce the loss at the current iteration for essentially any positive learning rate.

Second, IL tends to take a more direct path toward local minima than BP, which is consistent with our theoretical interpretation. Explicit SGD follows the path of steepest descent at each update. This implies that, if we imagine a non-convex (i.e., non-bowl-shaped) loss landscape, explicit SGD will not take the most direct/shortest path toward local minima, but will often take a longer, more roundabout route to find a minimum. This means the parameters need to change more (i.e., a longer path length) in order to reach a loss minimum. Implicit SGD, on the other hand, takes a shorter, more direct path toward minima, since it aims to find the minimum-norm (smallest) parameter changes needed to significantly reduce the loss.

Third, shorter paths toward minima tend to mean a quicker reduction of the loss. We find this is especially true in the case of online learning, where a single data-point is presented each iteration (see the paper for a discussion of why this result may be more significant in the online case).

Fourth, we find that IL updates tend to have a larger effect than BP updates on the input-output mapping of the network, relative to their magnitude. In this sense, IL updates are more impactful than BP updates. This is consistent with the implicit-SGD interpretation: the proximal operator can be interpreted as outputting the parameters that are most impactful, i.e., that have the largest effect on the loss, and thus on the input-output mapping, relative to the update magnitude.

Implications for Neuroscience

It is generally agreed BP is hard to reconcile with neurobiology (see [1], [2], [3] for good discussion). In response, some have developed alternatives to BP that exactly or approximately implement explicit SGD and better fit neurobiology. However, these more biologically inspired variants of BP still have some implausibilities. For example, in every BP alternative I have seen, feedback signals do not alter feedforward neuron activities (or if they do, performance worsens) (e.g., [4], [5]). This is difficult to reconcile with the fact that the brain is highly recurrently connected, so local feedback and feedforward signals will essentially always interact with each other, either directly or indirectly through interneurons. IL, on the other hand, is implemented by predictive coding circuits, which are recurrent neural circuits where feedforward and feedback signals necessarily interact (feedforward signals affect feedback signals and vice versa). Predictive coding circuits have also been mapped in highly detailed ways onto cortical and subcortical circuits in the brain. There is still much debate about whether the brain implements PC widely, but it is clear the mechanics of IL avoid most of the biological implausibilities of BP and similar algorithms. This suggests a novel hypothesis concerning the optimization strategy used by the brain to solve the credit assignment problem: the brain solves the credit assignment problem through an implementation of implicit SGD, via an algorithm possibly similar to the IL algorithm. The results above further characterize how learning in the brain should behave differently from BP/explicit SGD if this hypothesis were true, and thus suggest possible ways to test the two hypotheses.

Implications for Machine Learning

Neuromorphic hardware is computer hardware inspired by the brain. It typically uses spiking/binary networks and is far more energy efficient than standard hardware. BP is hard to implement on neuromorphic hardware for several reasons, one of which is that BP uses weight updates that are not local in space and time. Like the brain, synaptic weight updates on neuromorphic hardware tend to require changes based on the activities of neurons at pre- and post-synaptic layers (spatial locality) at a particular point in time (temporal locality). Exact BP requires storing feedforward activities through time and is thus not local in time, which cannot be done easily on neuromorphic hardware. IL, on the other hand, does not have to store neuron activities through time: it can essentially update synapses with whatever the pre- and post-synaptic neuron activities are at just about any point in time during recurrent processing and still perform well. We did not implement a spiking version of IL, but if something akin to IL can be implemented in spiking networks, the weight updates it produces should be easier to implement on neuromorphic hardware.

Blog

In this blog, I discuss topics at the intersection of AI, neuroscience, and philosophy. Posts are listed below and divided into two categories: technical, machine-learning-focused posts and less technical, philosophy-focused posts. Within each category, posts are listed from newest to oldest. All opinions are my own.

Neuroscience, Machine Learning, and AI

  1. Towards Loss Functions that Better Track Intelligence
  2. Online Machine Learning: What it is and Why it Matters
  3. Predictive Coding: A Brief Introduction and Review for Machine Learning Researchers
  4. Summary: ‘A Theoretical Framework for Inference Learning’

Philosophy of Mind and AI

  1. Mary the Qualia-Blind Neuroscientist
  2. Why are Consciousness Researchers so Resistant to Illusionism?
  3. Consciousness, Illusionism, and the Existential Gap
  4. The Meta-Problem Test for AI Consciousness