4 Hierarchical inference and the recapitulating, self-evidencing, slowing brain

A system that obeys the free energy principle minimizes its free energy, or prediction error, on average and over time. It does this through perception, belief updating, action, attention, and model simplification. This gives us the outline of a very powerful explanatory mechanism for the mind. There is reason to think that much of this explanatory promise can be borne out (Clark 2013; Hohwy 2013).

This mechanism shapes and structures our phenomenology—it shapes our lived, experienced world. A good starting point for making good on this idea is the notion of hierarchical inference, which is a cornerstone of prediction error minimization.

Conceive of prediction error minimization as being played out between overlapping pairs of interacting levels of processing in the brain. A pair has a lower level that receives input, and a higher level that generates predictions about the input at the lower level. Predictions are sent down (or “backwards”), where they attenuate the input as well as they can. Parts of the input that cannot be attenuated are allowed to progress upwards, as prediction error. The prediction error serves as input to a new pair of levels, consisting of the old upper level, which now functions as the lower, input level, and a new upper level. This new pair of levels is then concerned with predicting the input that was not predicted lower down. This layering can then go on, creating in the end a deep hierarchy in our brains (and perhaps a shallower hierarchy in some other creatures). The messages passed around in the hierarchy are sufficient statistics: predictions and prediction errors concerning (1) the means of probability distributions (or probability density functions) associated with various sensory attributes or causes of sensory input out there in the world, and (2) the precisions (the inverse of the variance) of those distributions, which mediate the expected precisions mentioned above.
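To fix ideas, here is a minimal sketch of the message passing within a single pair of levels, under the simplifying assumption of a one-dimensional, linear Gaussian model with fixed precisions. Every name and number here (u, mu, w, pi_u, pi_mu, the learning rate) is illustrative rather than drawn from any particular model in the literature.

```python
# A minimal sketch of one pair of levels, assuming a one-dimensional linear
# Gaussian model: the higher-level estimate mu predicts the input u via w * mu.
# All names and numbers are illustrative.

def settle(u, mu, mu_prior, w=1.0, pi_u=1.0, pi_mu=1.0, lr=0.05, steps=200):
    """Revise the higher-level estimate mu by gradient descent on
    precision-weighted prediction error."""
    for _ in range(steps):
        eps_u = u - w * mu          # prediction error at the lower (input) level
        eps_mu = mu - mu_prior      # error of mu against its own (empirical) prior
        # Move mu to reduce both errors, each weighted by its expected precision.
        mu += lr * (pi_u * w * eps_u - pi_mu * eps_mu)
    return mu, pi_u * (u - w * mu)  # residual error, passed on to the next pair up

mu, residual = settle(u=2.0, mu=0.0, mu_prior=0.5)
print(mu, residual)  # mu settles between the input and the prior; residual goes up
```

The residual error returned at the end is exactly what the next pair of levels receives as its input, which is how the layering described above gets off the ground.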

The hierarchy gives a deep and varied empirical Bayes or prediction error landscape, where prior probabilities are “empirical” in that they are learned and pulled down from higher levels, so they do not have to be extracted de novo from the current input. This means that processing at any one level depends on processing at the levels above it. Such priors higher up are called hyperparameters, for expectations of means, and hyperpriors, for expectations of precisions.
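Schematically, and assuming just two hidden levels, the hierarchical model behind such empirical priors can be written as a factorization (the notation is illustrative):

```latex
p\!\left(u, x^{(1)}, x^{(2)}\right)
  = p\!\left(u \mid x^{(1)}\right)\,
    p\!\left(x^{(1)} \mid x^{(2)}\right)\,
    p\!\left(x^{(2)}\right)
```

Here p(x^{(1)} | x^{(2)}) functions as an empirical prior on the lower-level cause x^{(1)}: it constrains inference at that level, yet is itself inferred and updated at the level above rather than extracted de novo from the current input.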

The hierarchy is organized along two key dimensions: time and space. Low levels of the hierarchy deal with expectations at fast timescales and relatively small receptive fields, while higher levels deal with expectations at progressively slower timescales and wider receptive fields. That is, different levels of the hierarchy deal with regularities in nature that unfold over different spatiotemporal scales. This gives a trade-off between detail and time horizon: low down in the hierarchy, sensory attributes can be predicted in great detail but not very far into the future, while higher in the hierarchy things can be predicted further into the future but in less detail. This trade-off is essential to inference because different causal regularities in nature, working at different timescales, influence each other and thereby create non-linearities in the sensory input. Without such interactions, sensory input would be linear and fairly easy to predict both in detail and far into the future. So the temporal organization of the hierarchy reflects the causal order of the environment as well as the way the causes in the world interact with each other to produce the flow of sensory input that brains try to predict.
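A toy illustration of why interacting timescales breed nonlinearity, with all quantities invented for the example: let a slow cause multiplicatively modulate a fast cause.

```python
import numpy as np

# Illustrative only: a slow cause modulates a fast cause, so the sensory input
# is a nonlinear (here multiplicative) function of the two causes jointly.
t = np.linspace(0.0, 10.0, 1000)
fast = np.sin(2 * np.pi * 5.0 * t)        # fast regularity (fine sensory detail)
slow = 1.0 / (1.0 + np.exp(-(t - 5.0)))   # slow regularity (e.g., an approaching speaker)
u = slow * fast                           # neither cause alone predicts this input
```

A level tracking only the fast cause can predict u in fine detail a few samples ahead; a level tracking the slow cause can predict the envelope of u far ahead, but not its detail, which mirrors the trade-off just described.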

The structure of the hierarchy in the brain, and thereby the shape of the inferences performed in the course of minimizing prediction error, must therefore mimic the causal order of the world. This is one reason why hierarchical inference determines the shape and structure of phenomenology, at least to the extent that phenomenology is representational. The way inference is put together in the brain recapitulates the causes we represent in perception. Moreover, this is done in an integrated fashion, where different sensory attributes are bound together under longer-term regularities (for example, the voice and the mouth are bound together under a longer-term expectation about the spatial trajectories of people). This immediately speaks to long-standing debates in cognitive science, concerning for example the binding problem and cognitive penetrability (for which see Chs. 5-6 in Hohwy 2013). There is, of course, much more to say about how prediction error minimization relates to phenomenology, but so far this suggests that there is some reason to think the austere prediction error minimization machine can bear out its explanatory promise in this regard.

Goals and actions are also embodied in the cortical hierarchy. Goals are expectations about which states to occupy. Actions ensue, as described above, when those expected states, which may be represented at relatively long timescales, can confidently be translated into policies for concrete actions fulfilled by the body. There are some thorny questions about what these goals might be and how they are shaped. One very fundamental story says that our expected states are determined by what it takes to maintain homeostasis. We are creatures who are able to harness vast and deep aspects of the environment in order to avoid surprising departures from homeostasis, though this opportunity comes with the requirement to harbor an internal model of the environment. Reward, here, is then simply the absence of prediction error; prediction error is kept in check by using action to move around in the environment, so as to maintain homeostasis on average and in the long run.
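A minimal sketch of this idea, assuming a single homeostatic variable that is sensed directly and a fixed expected value; all numbers are illustrative.

```python
# Action as prediction fulfillment: instead of revising its expectation,
# the agent changes the world until the interoceptive error disappears.
expected = 37.0      # the state the creature expects to occupy (setpoint)
world = 39.0         # the actual state, sensed directly for simplicity
lr = 0.1             # how strongly action bears on the world per step

for _ in range(50):
    error = world - expected   # interoceptive prediction error
    world += -lr * error       # act on the world to reduce the error
print(round(world, 2))         # the world is drawn toward the expected state
```

Reward, on this reading, is nothing over and above the vanishing of this error.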

Taking a very general perspective, the brain is then engaged in maintaining homeostasis, and it does so by minimizing its free energy, or prediction error. Minimization of prediction error entails building up and shaping a model of the environment. The idea here is very simple: the better the model is at minimizing prediction error, the more information it must be carrying about the true causes of its sensory input. This means that the brain does its job by recapitulating the causal structure of the world—by explaining away prediction error, the brain essentially becomes a deeply structured mirror of the world. This representational perspective is entailed by the brain’s efforts to maintain itself in a low entropy or free energy state. This means that we should not understand the brain as first and foremost in the business of representing the world, such that it can act upon it—which may be the orthodox way of thinking about what the brain does. Put differently, the brain is not selected for its prowess in representation per se but rather for its ability to minimize free energy. Even though this means representation is not foundational in our explanation of the brain, it doesn’t mean that representation is sidelined, because we don’t understand what free energy minimization is unless we understand that it entails representation of the world. (This formulation raises the issue of the possibility of misrepresentation in prediction error minimization; for discussion see Hohwy 2013, Chs. 7-8.)

The brain can be seen, then, as an organ that minimizes its free energy or prediction error relative to a model of the world and its own expected states. It actively changes itself and actively seeks out expected sensory input in an attempt to minimize prediction error. This means the brain seeks to expose itself to input that it can explain away. If it encounters a change in sensory input that it cannot explain away, then this is evidence that it is straying from its expected states. Of course, the more it strays from its expected states, the more likely it is to cease to exist. Put differently, the brain should enslave action to seek out evidence it can explain away, because the more it does so, the more it will have found evidence for its own existence. The very occurrence of sensory input that its model can explain away becomes an essential part of the evidential basis for the model. This means the brain is self-evidencing (Hohwy 2014): the more input it can explain away, the more it gains evidence for the correctness of the model and thereby for its own existence.
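Self-evidencing has a compact formal gloss in the standard decomposition of free energy, where q is the brain’s approximate posterior over hidden causes ϑ, u the sensory input, and m the model:

```latex
F = \underbrace{D_{\mathrm{KL}}\!\left[\,q(\vartheta)\,\middle\|\,p(\vartheta \mid u, m)\,\right]}_{\ge\, 0}
    \;-\; \ln p(u \mid m)
  \;\ge\; -\ln p(u \mid m)
```

Since the divergence term is non-negative, free energy is an upper bound on surprise (negative log evidence), so minimizing F maximizes a lower bound on the evidence p(u | m) for the model, and thereby, on this view, for the system’s own existence.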

The notions of recapitulation of the world and of self-evidencing can be captured in an exceedingly simple idea. The brain maintains its own integrity amid the onslaught of sensory input by slowing down and controlling the causal transition of the input through itself. If it had no means to slow down the input, its states would be at the mercy of the world and would disperse quickly. To illustrate, a good dam-builder must manage the inflow of water by slowing and controlling it with a good system of dams, channels, and locks. This dam system must in some sense anticipate the flows of water in a way that makes sense in the long run and that manages flows well on average. The system will do this by minimizing “flow errors”, and it and its dynamics will thereby carry information about—recapitulate—the states of water flow in the world on the other side of the dam. In general, it seems any system that is able to slow the flow of causes acting upon it must be minimizing its own free energy, and must thereby be both recapitulating the causes and self-evidencing (Friston 2013).

These extremely challenging and abstract ideas cast the brain as an organ that does one thing only: minimize free energy and thereby provide evidence for its own existence. Just as the heart can change its beat in response to internal and external changes, the brain can change its own states to manage self-evidencing according to circumstances: perceive, act, attend, simplify. The weighting between these ways of minimizing prediction error is determined by the context. For example, it may be that learning is required before action is predicted to be efficient, so perception produces a narrow prediction error bound on surprise before action sets in, conditional on expected precisions; or perhaps reliable action is not possible (as may happen at night, when sensory input is so uncertain that it cannot be trusted), and therefore the brain simplifies its own model parameters, which may be what happens during sleep (Hobson & Friston 2012).

This is all extremely reductionist, in the unificatory sense, since it leaves no other job for the brain to do than minimize free energy—so that everything mental must come down to this principle. It is also reductionist in the metaphysical sense, since it means that other types of descriptions of mental processes must all come down to the way neurons manage to slow sensory input.

The next sections turn to the question of whether this extreme explanatory and reductionist theory is not only controversial and ambitious but also preposterous.