8 How literally is the brain Bayesian?

Bayes’ rule is difficult to learn and takes considerable conscious effort to master. Moreover, we seem to flout it with disturbing regularity (Kahneman et al. 1982). So it is somewhat hard to believe that the brain unconsciously follows Bayes’ rule. This raises questions about how literally we should think of the brain as a Bayesian hypothesis-tester. In blog correspondence, Lisa Bortolotti put the question succinctly:

Acknowledging that prior beliefs have a role in perceptual inference, do we need to endorse the view that the way in which they constrain inference is dictated by Bayes’ rule? Isn’t it serendipitous that something we came up with to account for the rationality of updating beliefs is actually the way in which our brain unconsciously works?

Part of the beauty of the free energy principle is that even though it begins with the simple idea of an organism that acts to stay within expected states, its mathematical formulation forces Bayesian inference into the picture. Expected states are those with low surprisal or self-information; that is, they have high probability given the model (low negative log probability). Surprisal cannot be estimated directly, because that would require already knowing the distribution of states one can be in. Instead it is estimated indirectly, which is where free energy comes in. Free energy, as mentioned above, is equal to the surprisal plus the divergence between the probability distribution over hypotheses currently entertained by the brain’s states and the true posterior probability of the hypotheses given the model and the state. This much follows from Bayes’ rule itself. It means that if the brain is able to minimize the divergence, then the chosen hypothesis becomes the posterior. This is the crucial step, because a process that takes in evidence, given a prior, and ends up with the posterior probability, as dictated by Bayes’ rule, must at least implicitly be performing inference (Friston 2010).
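The decomposition just described is a standard identity, and it may help to write it out explicitly (the notation here is a common convention, not tied to any one presentation): with sensory states $e$, a model $m$, hidden causes $h$, and a “recognition” distribution $q(h)$ encoded in the brain’s states,

```latex
F \;=\; \underbrace{-\ln p(e \mid m)}_{\text{surprisal}}
\;+\; \underbrace{D_{\mathrm{KL}}\big[\, q(h) \,\big\|\, p(h \mid e, m) \,\big]}_{\text{divergence}}
```

Because the divergence is never negative, free energy is an upper bound on surprisal; and because the surprisal term does not depend on $q$, minimizing the divergence by adjusting $q$ drives $q(h)$ toward the true posterior $p(h \mid e, m)$. This is the sense in which the chosen hypothesis “becomes the posterior.”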

Hence, if the free energy principle is correct, then the brain must be Bayesian. How should this be understood? Consider what happens as the divergence is minimized. Formally this is a Kullback-Leibler divergence (or relative entropy), which measures the dissimilarity between two probability distributions. The KL-divergence can be minimized with various optimization schemes, such as variational Bayes. Such schemes play an important role in machine learning and are used in simulations of cognitive phenomena under the free energy principle. Given the detail and breadth of such simulations, it is not unreasonable to say that brain activity and behavior are describable using such formal methods.
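For illustration, here is a minimal numerical sketch of the claim that minimizing the KL-divergence recovers the Bayesian posterior. It is a toy discrete example with two hypotheses (not any specific simulation from the literature): a recognition distribution is adjusted by gradient descent on the free energy, and it converges to the exact posterior computed directly from Bayes’ rule.

```python
import numpy as np

# Toy setup: two hypotheses h0, h1 and one observed sensory state e.
prior = np.array([0.7, 0.3])        # p(h)
likelihood = np.array([0.2, 0.9])   # p(e | h) for the e actually observed

# Exact Bayesian posterior, computed directly for comparison.
posterior = prior * likelihood / np.sum(prior * likelihood)

# The "recognition" distribution q(h) is parameterized by a single logit,
# so it remains a valid probability distribution throughout.
theta = 0.0
for _ in range(2000):
    q0 = 1.0 / (1.0 + np.exp(-theta))
    q = np.array([q0, 1.0 - q0])
    # Up to the constant -log p(e), the free energy equals
    # E_q[log q(h) - log(p(h) * p(e|h))]; descend its gradient w.r.t. theta.
    grad = q0 * (1.0 - q0) * (
        (np.log(q[0]) - np.log(prior[0] * likelihood[0]))
        - (np.log(q[1]) - np.log(prior[1] * likelihood[1]))
    )
    theta -= 0.5 * grad

q0 = 1.0 / (1.0 + np.exp(-theta))
q = np.array([q0, 1.0 - q0])
# q has now converged to the exact posterior.
```

The gradient vanishes exactly when the log-odds of q match the log posterior odds, so the divergence is zero only at the true posterior. Nothing in this process “knows” Bayes’ rule; it only descends the free energy, which is the point at issue.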

The brain itself does not, of course, know the complex differential equations that implement variational Bayes. Instead its own activity is brought to match (and thereby dampen) its sensory input. This is sufficient to bring the two probability distributions closer, because the brain can only do this if it is in fact minimizing prediction error. This gives a mechanistic realization of hierarchical variational Bayes. The brain is Bayesian, then, in the sense that its machinery implements Bayes not serendipitously but necessarily, if it is able to maintain itself in its expected states. (There is discussion within the philosophy of neuroscience about what it means for explanations to be computational; see Piccinini 2006, Kaplan 2011, Piccinini & Scarantino 2011, Chirimuuta 2014.)

The notion of realization (or implementation, or constitution) is itself subject to considerable philosophical debate. A paradigmatic reading describes it in terms of what plays functional roles. Thus a smoke alarm can be described in terms of its functional role (i.e., what it does with a given input, given its internal states). The alarm has certain kinds of mechanisms, which realize this role; the mechanism may comprise a radioactive source whose ionized particles react to smoke and cause the alarm to sound. The analogy between the smoke alarm and the brain seems accurate enough to warrant the paradigmatic functionalist reading of the way neuronal circuitry implements free energy minimization and therefore Bayes. Perhaps it is in some sense a moot point whether the ions in the smoke alarm “detect smoke” or whether they should merely be described in terms of the physical reactions that happen when they come into contact with smoke particles. Rather than enter this debate it seems better to return to the point made at the start, when the brain was compared to other organs such as the heart. The point there was that it is wrong to retract the description of the heart as a blood pump when we are told that no part of the cardiac cells are themselves pumps. The brain is literally Bayesian in much the same sense as the heart is literally a pump.

Behind this conceptual point is a deeper point about what kind of theory the free energy principle gives rise to (the following discussion is based on Hohwy 2014). As described above, the Bayesian brain is entailed by the free energy principle. Denying the Bayesian brain then requires denying the free energy principle and the very idea of the predictive mind. This is, of course, a possible position that one could hold. One way of holding it is to “go down a level” such that instead of unifying everything under the free energy principle, theories just describe the dynamical causal interactions between brain and world. This would correspond to focusing more on systematic elements in the realization than on the function (looking at causal interactions between the heart and other parts of the body, and the individual dynamics of the cells making up the heart, rather than understanding these in the light of the heart being a pump). Call this the “causal commerce” position on the brain. Given the extensive and crucial nature of causal commerce between the brain and the world, this is in many ways a reasonable strategy. It seems fair to characterize parts of the enactive position in cognitive science as informed primarily by the causal commerce position (for a comprehensive account of this position, see Thompson 2007; for an account that brings the debate closer to the free energy principle, see Orlandi 2013).

From this perspective, the choice between purely enactive approaches and inferential, Bayesian approaches becomes methodological and explanatory. One key question is what is accomplished by re-describing the causal commerce position from the more unified perspective of the free energy principle. It seems that more principled, integrated accounts of perception, action, and attention then become available. The more unified position also seems to pull away from many of the lessons of the enactive approach to cognition, because the free energy principle operates with a strict inferential veil between mind and world: the sensory evidence behind which hidden causes lurk, and which must be inferred by the brain. Traditionally, this picture is anathema to enactive, embodied approaches, as it lends itself to various forms of Cartesian skepticism, which signals an internalist, secluded conception of mind. A major challenge in cognitive science is therefore to square these two approaches: the dynamical nature of causal commerce between world, body, and brain, and the inferential free energy principle that allows their unification in one account. On the approach advocated here, modulo enough empirical evidence, denying that the free energy principle describes the brain is on a par with denying that the heart is a pump. This means that it is not really an option to deny that the brain is inferential. It leaves open only the question of how it is inferential.

One line of resistance to subsuming everything under the free energy principle has to do with the intellectualist connotations of Bayes. Somehow the idea of the Bayesian brain seems to deliver an overly regularized, sequential, mathematical desert landscape: a picture of a serene, computational mechanism silently taking in data, passing messages up and down the hierarchy, and spitting out posterior probabilities. This seems rather far from the somewhat tangled mess observed when neuroscientists look at how the brain is in fact wired up. In one sense this desert landscape is of course the true picture that comes with the free energy principle, but there need be nothing serene or regularized about the way it is realized. The reason for this goes to the very heart of what the free energy principle is. The principle entails that the brain recapitulates the causal structure of the world. So what we should expect to find in the brain will have to approximate the far-from-serene, far-from-regularized interactions that occur between worldly causes. Just as there are non-linearly interacting causes in the world, there will be convolving of causes in the brain; and just as there are localized, relatively insulated causal “eddies” in the world, there will be modularized parameter spaces in the brain.

Moreover, there is reason to think the brain utilizes the fact that the same causes are associated with multiple effects on our senses and therefore builds up partial models of the sensorium. This corresponds to cognitive modules and sensory modalities allowing processing in conditionally independent processing streams, which greatly enhances the certainty of probabilistic inference. In this sense the brain is not only like a scientist testing hypotheses, but is also like a courtroom calling different, independent witnesses. The courtroom analogy is worth pursuing in its own right (Hohwy 2013), but for present purposes it supports the suggestion that when we look at the actual processing of the brain we should expect a fairly messy tangle of processing streams. (Clark 2013 does much to characterize and avoid this desert landscape but seems to do so by softening the grip of the free energy principle.)