3 Putting predictive processing, body, and world together again

An important feature of the full PP account (see Friston 2009; Hohwy 2013; Clark in press) is that the impact of specific prediction error signals can be systematically varied according to their estimated certainty or “precision”. The precision of a specific prediction error is its inverse variance—the tightness (if you like) of its error bars. Precision estimation thus has a kind of meta-representational feel, since we are, in effect, estimating the uncertainty of our own representations of the world. These ongoing (task and context-varying) estimates alter the weighting (the gain or volume, to use the standard auditory analogy) on select prediction error units, so as to increase the impact of task-relevant, reliable information. One key effect of this is to allow the brain to vary the balance between sensory inputs and prior expectations at different levels (see Friston 2009, p. 299) in ways sensitive to task and context.[11] High-precision prediction errors have greater gain, and thus play a larger role in driving processing and response. More generally, variable precision-weighting may be seen as the PP mechanism for implementing a wide range of attentional effects (see Feldman & Friston 2010).
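The balance between prior expectation and sensory input described here follows directly from treating precision as inverse variance. The following purely illustrative sketch (the function name and all numbers are my own, not drawn from the PP literature) shows how a precision-weighted update automatically lets reliable sensing dominate while noisy sensing leaves the prior largely in charge:

```python
# Illustrative sketch: a single-level precision-weighted update.
# Precision = inverse variance; the posterior is a precision-weighted
# average of top-down prediction and bottom-up sensory evidence.

def precision_weighted_update(prior_mean, prior_var, sensory_obs, sensory_var):
    """Combine a prior expectation with a sensory sample, each weighted
    by its precision (inverse variance)."""
    pi_prior = 1.0 / prior_var
    pi_sens = 1.0 / sensory_var
    posterior_mean = (pi_prior * prior_mean + pi_sens * sensory_obs) / (pi_prior + pi_sens)
    posterior_var = 1.0 / (pi_prior + pi_sens)
    return posterior_mean, posterior_var

# Reliable sensing (low sensory variance): the sensory sample dominates.
m_reliable, _ = precision_weighted_update(0.0, 1.0, 2.0, 0.1)

# Unreliable sensing (high sensory variance): the prior dominates.
m_noisy, _ = precision_weighted_update(0.0, 1.0, 2.0, 10.0)
```

Turning the "volume" up or down on a prediction-error channel is, on this picture, nothing more than moving its variance term, and hence its weight in such an update.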

Subtle applications of this strategy, as we shall shortly see, allow PP to nest simple (“quick and dirty”) solutions within the larger context of a fluid, re-configurable inner economy; an economy in which rich, knowledge-based strategies and fast, frugal solutions are now merely different expressions of a unified underlying web of processing. Within that web, changing ensembles of inner resources are repeatedly recruited, forming and dissolving in ways determined by external context, current needs, and (importantly) by flexible precision-weighting reflecting ongoing estimations of our own uncertainty. This process of inner recruitment is itself constantly modulated, courtesy of the complex circular causal dance of sensorimotor engagement, by the evolving state of the external environment. In this way (as I shall now argue) many key insights from work on embodiment and situated, world-exploiting action may be comfortably accommodated within the emerging PP framework.

3.1 Nesting simplicity within complexity

Consider the well-known “outfielder’s problem”: running to catch a fly ball in baseball. Giving perception its standard role, we might assume that the job of the visual system is to transduce information about the current position of the ball so as to allow a distinct “reasoning system” to project its future trajectory. Nature, however, seems to have found a more elegant and efficient solution. The solution, a version of which was first proposed in Chapman (1968), involves running in a way that seems to keep the ball moving at a constant speed through the visual field. As long as the fielder’s own movements cancel any apparent changes in the ball’s optical acceleration, she will end up in the location where the ball hits the ground. This solution, OAC (Optical Acceleration Cancellation), explains why fielders, when asked to stand still and simply predict where the ball will land, typically do rather badly. They are unable to predict the landing spot because OAC is a strategy that works by means of moment-by-moment self-corrections that, crucially, involve the agent’s own movements. The suggestion that we rely on such a strategy is also confirmed by some interesting virtual reality experiments in which the ball’s trajectory is suddenly altered in flight, in ways that could not happen in the real world (see Fink et al. 2009). OAC is a succinct case of fast, economical problem-solving. The canny use of data available in the optic flow enables the catcher to sidestep the need to deploy a rich inner model to calculate the forward trajectory of the ball.[12]
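The self-correcting character of OAC can be made vivid with a deliberately idealized simulation (launch parameters, the target rate, and the instantaneous-repositioning control law are all simplifying assumptions of mine; real fielders, of course, have bounded running speeds). The fielder never computes the landing point; she simply keeps the tangent of the ball’s elevation angle rising at a constant rate, and the geometry delivers her to the right spot as the ball descends:

```python
# Idealized OAC simulation: the fielder repositions so that
# tan(elevation angle) grows linearly in time (zero optical
# acceleration). No trajectory is ever projected forward.

g = 9.8                      # gravity (m/s^2)
vx, vy = 10.0, 20.0          # ball launch velocities (assumed values)
dt = 0.01                    # simulation timestep (s)
r = 0.5                      # target rate of increase of tan(alpha)

t = dt
fielder_x = 30.0             # arbitrary starting position beyond the ball
while True:
    bx = vx * t                          # ball horizontal position
    by = vy * t - 0.5 * g * t * t        # ball height
    if by <= 0:                          # ball has landed
        break
    # Keep tan(alpha) = by / (fielder_x - bx) equal to r * t,
    # i.e. cancel any optical acceleration (speed limits ignored).
    fielder_x = bx + by / (r * t)
    t += dt

# True landing point, computed here only to check the outcome.
landing_x = vx * (2 * vy / g)
```

As the ball’s height shrinks toward zero, the correction term shrinks with it, so the policy converges on the landing point for any positive target rate—a moment-by-moment loop through action and sensing rather than a one-shot internal calculation.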

Such strategies are suggestive (see also Maturana & Varela 1980) of a very different role for the perceptual coupling itself. Instead of using sensing to get enough information inside, past the visual bottleneck, so as to allow the reasoning system to “throw away the world” and solve the problem wholly internally, such strategies use the sensor as an open conduit allowing environmental magnitudes to exert a constant influence on behavior. Sensing is here depicted as the opening of a channel, with successful whole-system behavior emerging when activity in this channel is kept within a certain range. In such cases:

[T]he focus shifts from accurately representing an environment to continuously engaging that environment with a body so as to stabilize appropriate co-ordinated patterns of behaviour. (Beer 2000, p. 97)

These focal shifts may be fluidly accommodated within the PP framework. To see how, recall that “precision weighting” alters the gain on specific prediction error units, and thus provides a means of systematically varying the relative influence of different neural populations. The most familiar role of such manipulations is to vary the balance of influence between bottom-up sensory information and top-down model-based expectation. But another important role is the implementation of fluid and flexible forms of large-scale “gating” among neural populations. This works because very low-precision prediction errors will have little or no influence upon ongoing processing, and will fail to recruit or nuance higher-level representations. Altering the distribution of precision weightings thus amounts, as we saw above, to altering the “simplest circuit diagram” (Aertsen & Preißl 1991) for current processing. When combined with the complex, cascading forms of influence made available by the apparatus of top-down prediction, the result is an inner processing economy that is (see Clark in press) “maximally context-sensitive”.
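The “gating” role of precision can be seen in a minimal sketch (the numbers are illustrative assumptions): a prediction-error channel whose precision is driven toward zero simply drops out of the drive on higher-level units, altering the effective circuit without any structural rewiring:

```python
# Illustrative sketch: precision weights as a soft gate on
# prediction-error channels. A near-zero precision removes a channel
# from the effective circuit without structural change.

def weighted_error_drive(errors, precisions):
    """Total drive on higher-level units: precision-weighted error sum."""
    return sum(p * e for p, e in zip(precisions, errors))

errors = [0.8, -1.5, 0.3]        # three prediction-error channels

# Context A: all channels influence processing.
drive_all = weighted_error_drive(errors, [1.0, 1.0, 1.0])

# Context B: the second channel is gated out by near-zero precision.
drive_gated = weighted_error_drive(errors, [1.0, 1e-6, 1.0])
```

Two different precision assignments over the very same units thus yield two different “simplest circuit diagrams”, which is all that large-scale gating here requires.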

This suggests a new angle upon the outfielder’s problem. Here too, already-active neural predictions and simple, rapidly-processed perceptual cues must work together (if PP is correct) to determine a pattern of precision-weightings for different prediction-error signals. This creates a pattern of effective connectivity (a temporary distributed circuit) and, within that circuit, it sets the balance between top-down and bottom-up modes of influence. In the case at hand, however, efficiency demands selecting a circuit in which visual sensing is used to cancel the optical acceleration of the fly ball. This means giving high weighting to the prediction errors associated with cancelling the vertical acceleration of the ball’s optical projection, and (to put it bluntly) not caring very much about anything else. Apt precision weightings here function to select what to predict at any given moment. They may thus select a pre-learnt, fast, low-cost strategy for solving a problem, as task and context dictate. Contextually-recruited patterns of precision weighting thus accomplish a form of set-selection or strategy switching—an effect already demonstrated in some simple simulations of cued reaching under the influence of changing tonic levels of dopamine firing—see Friston et al. (2012).

Fast, efficient solutions have also been proposed in the context of reasoning and choice. In an extensive literature concerning choice and decision-making, it has been common to distinguish between “model-based” and “model-free” approaches (see e.g., Dayan & Daw 2008; Dayan 2012; Wolpert et al. 2003). Model-based strategies rely, as their name suggests, on a model of the domain that includes information about how various states (worldly situations) are connected, thus allowing a kind of principled estimation (given some cost function) of the value of a putative action. Such approaches involve the acquisition and the (computationally challenging) deployment of fairly rich bodies of information concerning the structure of the task-domain. Model-free strategies, by contrast, are said to “learn action values directly, by trial and error, without building an explicit model of the environment, and thus retain no explicit estimate of the probabilities that govern state transitions” (Gläscher et al. 2010, p. 585). Such approaches implement “policies” that typically exploit simple cues and regularities while nonetheless delivering fluent, often rapid, response.
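The model-free mode quoted from Gläscher et al.—learning action values “directly, by trial and error, without building an explicit model of the environment”—is standardly illustrated by simple value-learning rules of the Q-learning family. The toy example below (a two-armed bandit with assumed reward probabilities; not any specific model from the papers cited) caches action values updated purely from sampled rewards, with no representation of state transitions:

```python
import random

# A standard model-free learner on a two-armed bandit: action values
# are learned directly from sampled rewards (trial and error), with no
# model of the environment. Reward probabilities are assumed for
# illustration.

random.seed(0)
q = [0.0, 0.0]               # cached ("model-free") action values
alpha = 0.1                  # learning rate
reward_prob = [0.2, 0.8]     # arm 1 pays off far more often

for trial in range(2000):
    # Epsilon-greedy choice over the cached values.
    if random.random() < 0.1:
        a = random.randrange(2)
    else:
        a = max((0, 1), key=lambda i: q[i])
    r = 1.0 if random.random() < reward_prob[a] else 0.0
    q[a] += alpha * (r - q[a])    # trial-and-error value update
```

The resulting policy is fast and frugal at choice time—a table lookup—precisely because all the work has been pushed into incremental learning, with no state-transition model retained.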

The model-based/model-free distinction is intuitive, and resonates with old (but increasingly discredited) dichotomies between reason and habit, and between analytic evaluation and emotion. But it seems likely that the image of parallel, functionally independent, neural sub-systems will not stand the test of time. For example, a recent functional Magnetic Resonance Imaging (fMRI) study (Daw et al. 2011) suggests that rather than thinking in terms of distinct (functionally isolated) model-based and model-free learning systems, we may need to posit a single “more integrated computational architecture” (Daw et al. 2011, p. 1204), in which the different brain areas most commonly associated with model-based and model-free learning (pre-frontal cortex and dorsolateral striatum, respectively) each trade in both model-free and model-based modes of evaluation and do so “in proportions matching those that determine choice behavior” (Daw et al. 2011, p. 1209). Top-down information, Daw et al. (2011) suggest, might then control the way different strategies are combined in differing contexts for action and choice. Within the PP framework, this would follow from the embedding of shallow “model-free” responses within a deeper hierarchical generative model. By thus combining the two modes within an overarching model-based economy, inferential machinery can, by and large, identify the appropriate contexts in which to deploy the model-free (“habitual”) schemes. “Model-based” and “model-free” modes of valuation and response, if this is correct, name extremes along a single continuum, and may appear in many mixtures and combinations determined by the task at hand.
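One simple way to picture such a continuum (a schematic sketch of my own, not Daw et al.’s actual algorithm) is to value an action as a reliability-weighted mixture of a cheap cached estimate and a costly model-based one, with the weighting itself set by context:

```python
# Illustrative sketch: "model-free" and "model-based" estimates as ends
# of one continuum, mixed in proportion to their relative reliability.

def mixed_value(q_mf, q_mb, mf_uncertainty, mb_uncertainty):
    """Weight each estimate by its relative precision (inverse uncertainty)."""
    w_mf = (1.0 / mf_uncertainty) / (1.0 / mf_uncertainty + 1.0 / mb_uncertainty)
    return w_mf * q_mf + (1.0 - w_mf) * q_mb

# A well-practised ("habitual") context: the cached value dominates.
v_habit = mixed_value(q_mf=1.0, q_mb=0.0, mf_uncertainty=0.1, mb_uncertainty=1.0)

# A novel context: the model-based estimate dominates.
v_novel = mixed_value(q_mf=1.0, q_mb=0.0, mf_uncertainty=1.0, mb_uncertainty=0.1)
```

On such a scheme there are no separate “systems” to isolate, only different admixtures of the same resources as uncertainty estimates shift.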

This suggests a possible reworking of the popular suggestion (Kahneman 2011) that human reasoning involves the operation of two functionally distinct systems: one for fast, automatic, “habitual” response, and the other dedicated to slow, effortful, deliberative reasoning. Instead of a truly dichotomous inner organization, we may benefit from a richer form of organization in which fast, habitual, or heuristically-based modes of response are often the default, but within which a large variety of possible strategies may be available. Humans and other animals would thus deploy multiple—rich, frugal and all points in between—strategies defined across a fundamentally unified web of neural resources (for some preliminary exploration of this kind of more integrated space, see Pezzulo et al. 2013). Some of those strategies will involve the canny use of environmental structure – efficient embodied prediction machines, that is to say, will often deploy minimal neural models that benefit from repeated calls to world-altering action (as when we use a few taps of the smartphone to carry out a complex calculation).

Nor, finally, is there any fixed limit to the complexities of the possible strategic embeddings that might occur even within a single more integrated system. We might, for example, use some quick-and-dirty heuristic strategy to identify a context in which to use a richer one, or use intensive model-exploring strategies to identify a context in which a simpler one will do. From this emerging vantage point the very distinction between model-based and model-free response (and indeed between System 1 and System 2) looks increasingly shallow. These are now just convenient labels for different admixtures of resource and influence, each of which is recruited in the same general way as circumstances dictate.[13]

3.2 Being human

There is nothing specifically human, however, about the suite of mechanisms explored above. The basic elements of the predictive processing story, as Roepstorff (2013, p. 45) correctly notes, may be found in many types of organism and model-system. The neocortex (the layered structure housing cortical columns that provides the most compelling neural implementation for predictive processing machinery) displays some dramatic variations in size but is common to all mammals. What, then, makes us (superficially at least) so very different? What is it that allows us—unlike dogs, chimps, or dolphins—to latch on to distal hidden causes that include not just food, mates, and relative social rankings, but also neurons, predictive processing, Higgs bosons, and black holes?

One possibility (Conway & Christiansen 2001) is that adaptations of the human neural apparatus have somehow conspired to create, in us, an even more complex and context-flexible hierarchical learning system than is found in other animals. Insofar as the predictive processing framework allows for rampant context-dependent influence within the distributed hierarchy, the same basic operating principles might (given a few new opportunities for routing and influence) result in the emergence of qualitatively novel forms of behavior and control. Such changes might explain why human agents display what Spivey (2007, p. 169) describes as an “exceptional sensitivity to hierarchical structure in any time-dependent signal”.

Another (possibly linked, and certainly highly complementary) possibility involves a potent complex of features of human life, in particular our ability to engage in temporally co-coordinated social interaction (see Roepstorff et al. 2010) and our ability to construct artifacts and design environments. Some of these ingredients have emerged in other species too. But in the human case the whole mosaic comes together under the influence of flexible and structured symbolic language (this was the target of the Conway and Christiansen paper mentioned above) and an almost obsessive drive (Tomasello et al. 2005) to engage in shared cultural practices. We are thus able to redeploy our core cognitive skills in the transformative context of exposure to what Roepstorff et al. (2010) call “patterned sociocultural practices”. These include the use of symbolic codes (encountered as “material symbols” (Clark 2006)), complex social routines (Hutchins 1995, 2014), and, more generally, all the various ploys and strategies known as “cognitive niche construction” (see Clark 2008).

A simple example is the way that learning to perform mental arithmetic has been scaffolded, in some cultures, by the deliberate use of an abacus. Experience with patterns thus made available helps to install appreciation of many complex arithmetical operations and relations (for discussion of this, see Stigler 1984). The specific example does not matter very much, to be sure, but the general strategy does. In such cases, we structure (and repeatedly re-structure) our physical and social environments in ways that make available new knowledge and skills—see Landy & Goldstone (2005). Prediction-hungry brains, exposed in the course of embodied action to novel patterns of sensory stimulation, may thus acquire forms of knowledge that were genuinely out-of-reach prior to such physical-manipulation-based re-tuning of the generative model. Action and perception thus work together to reduce prediction error against the more slowly evolving backdrop of a culturally distributed process that spawns a succession of designed environments whose impact on the development (e.g., Smith & Gasser 2005) and unfolding (Hutchins 2014) of human thought and reason can hardly be overestimated.

To further appreciate the power and scope of such re-shaping, recall that the predictive brain is not doomed to deploy high-cost, model-rich strategies moment-by-moment in a demanding and time-pressured world. Instead, that very same apparatus supports the learning and contextually-determined deployment of low-cost strategies that make the most of body, world, and action. A maximally simple example is painting white lines along the edges of a winding cliff-top road. Such environmental alterations allow the driver to solve the complex problem of keeping the car on the road by (in part) predicting the ebb and flow of various simpler optical features and cues (see e.g., Land 2001). In such cases, we are building a better world in which to predict, while simultaneously structuring the world to cue the low-cost strategy at the right time.

3.3 Extending the predictive mind

All this suggests a very natural model of “extended cognition” (Clark & Chalmers 1998; Clark 2008), where this is simply the idea that bio-external structures and operations may sometimes form integral parts of an agent’s cognitive routines. Nothing in the PP framework materially alters, as far as I can tell, the arguments previously presented, both pro and con, regarding the possibility and actuality of genuinely extended cognitive systems.[14] What PP does offer, however, is a specific and highly “extension-friendly” proposal concerning the shape of the specifically neural contribution to cognitive success. To see this, reflect on the fact that known external (e.g., environmental) operations provide—by partly constituting—additional strategies apt for the kind of “meta-model-based” selection described above. This is because actions that engage and exploit specific external resources will now be selected in just the same manner as the inner coalitions of neural resources themselves. Minimal internal models that involve calls to world-recruiting actions may thus be selected in the same way as a purely internal model. The availability of such strategies (of trading inner complexity against real-world action) is the hallmark of embodied prediction machines.

As a simple illustration, consider the work undertaken by Pezzulo et al. (2013). Here, a so-called “Mixed Instrumental Controller” determines whether to choose an action based upon a set of simple, pre-computed (“cached”) values, or by running a mental simulation enabling a more flexible, model-based assessment of the desirability, or otherwise, of actually performing the action. The mixed controller computes the “value of information”, selecting the more informative (but costly) model-based option only when that value is sufficiently high. Mental simulation, in such cases, then produces new reward expectancies that can determine current action by updating the values used to determine choice. We can think of this as a mechanism that, moment-by-moment, determines (as discussed in previous sections) whether to exploit simple, already-cached routines or to explore a richer set of possibilities using some form of mental simulation. It is easy to imagine a version of the mixed controller that determines (on the basis of past experience) the value of the information that it believes would be made available by some kind of cognitive extension, such as the manipulation of an abacus, an iPhone, or a physical model. Deciding when to rest, content with a simple cached strategy, when to deploy a more costly mental simulation, and when to exploit the environment itself as a cognitive resource are thus all options apt for the same kind of “meta-Bayesian” model-based resolution.
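The arbitration performed by such a mixed controller can be rendered in a toy form (the decision rule, the value-of-information proxy, and all numbers below are illustrative assumptions of mine, not Pezzulo et al.’s implementation): consult the costly model-based simulation only when the expected value of the information it would yield exceeds its cost.

```python
# Toy rendering of a "mixed controller": run a costly mental simulation
# only when the value of information exceeds the simulation's cost.

def choose_controller(cached_values, value_uncertainty, simulation_cost):
    """Return 'simulate' when uncertainty about the best action is worth
    resolving, 'use_cache' otherwise."""
    # Crude value-of-information proxy: uncertainty matters most when the
    # competing cached values are close together.
    gap = abs(cached_values[0] - cached_values[1])
    value_of_information = max(0.0, value_uncertainty - gap)
    return "simulate" if value_of_information > simulation_cost else "use_cache"

# Options clearly separated: the cheap cached policy suffices.
easy = choose_controller([1.0, 0.2], value_uncertainty=0.3, simulation_cost=0.1)

# Options close together and estimates uncertain: pay for a simulation.
hard = choose_controller([0.52, 0.50], value_uncertainty=0.3, simulation_cost=0.1)
```

Extending the menu of options from {cached routine, mental simulation} to include world-involving actions (consulting an abacus, an iPhone, a physical model) changes nothing in the form of the decision, which is precisely the point made in the text.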

Seen from this perspective, the selection of task-specific inner neural coalitions within an interaction-dominated PP economy is entirely on a par with the selection of task-specific neural–bodily–worldly ensembles. The recruitment and use of extended (brain–body–world) problem-solving ensembles now turns out to obey many of the same basic rules, and reflects many of the same basic normative principles (balancing efficacy and efficiency, and reflecting complex precision estimations) as does the recruitment of temporary inner coalitions bound by effective connectivity. In each case, what is selected is a temporary problem-solving ensemble (a “temporary task-specific device”—see Anderson et al. 2012) recruited as a function of context-varying estimations of uncertainty.