6 Predictions, distinctness, fecundity

It will be useful to discuss a concrete example of explanatory contest for the free energy principle. A good example comes from Ned Block & Susanna Siegel (2013) who argue against Andy Clark’s (2013) version of the predictive processing framework in a way that pertains to the preceding remarks about explanatory prowess and ambition. In a comparison with an existing theory of attentional effects (proposed by Marisa Carrasco), they argue first that the predictive framework makes false predictions, and second that it offers no distinctive explanations.

As to the first point, Block and Siegel consider the effect where covert attention to a weak contrast grating enhances its perceived contrast. They argue that this increased contrast should be unexpected and therefore should elicit a prediction error that in turn should be extinguished, thereby annihilating the perceptual effect that the account was meant to explain in the first place. However, their argument does not rely on the correct version of the free energy account of attention. Block and Siegel overlook the fact that attention is itself predictive, in virtue of the prediction of precision. This means that attention enhances the prediction error from the weak grating, which in turn is explained away under the hypothesis that a strong contrast grating was present in that location of visual space. This conception of attention thus does yield a satisfactory account of the phenomenon that they claim cannot be explained (attentional enhancing), and it does not generate the false predictions they suggest (Hohwy 2013).

Block and Siegel’s second point is more difficult to get straight. They argue that the predictive account offers no explanation of attentional findings, in particular relating to receptive field distortions; they then suggest that the account could adopt the existing theory, which asserts that “representation nodes” have shrinking receptive fields. They continue to argue that since the purported prediction error gain relates to error units in the brain rather than representation nodes, the prediction error account cannot itself generate this explanation. The argument is then that if the predictive processing account simply borrows that explanation (namely the existing explanation in terms of representation nodes), it hasn’t offered anything distinctive. Again, this rests on an incorrect reading of the free energy account: error units are not insulated from representation units. Error units receive the bottom-up signal and this leads to revision of the predictions generated from the representation units. The outstanding question is how the distortions of receptive fields can be explained within the prediction error account.

This question has been addressed within the predictive coding literature. Thus Spratling (2008), who is a proponent of predictive coding accounts of attention, says (referring to the literature on changing receptive fields to which Block and Siegel themselves appeal) “the [predictive processing] model proposes, as have others before, that the apparent receptive field distortion arises from a change in the pattern of feedforward stimulation received by the cell”. That is, increased gain explains the distortion of the receptive field.

In fact, one might speculate that the predictive processing story makes perfect sense of the existence of modulable receptive fields. The receptive field of a given representational unit would, that is, be a function of the prediction error received from below, where—as described earlier—lower levels operate at smaller spatiotemporal scales. To give a toy illustration, assume that a broad receptive field would receive an equal amount of error signal from ten lower units each with smaller receptive fields, whereas a narrow receptive field receives error only from two such units. For the broad receptive field, if the gain on error from lower unit numbers one and two increases due to attention, then the gain on the other eight units decreases (since weights sum to one). Now, the hitherto broad receptive field mainly receives error from two lower units, so its receptive field has automatically shrunk. Attentional effects thus track the effects of expected precisions.
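The toy illustration can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions, not a model drawn from the literature cited here: the size of the attentional boost (16), the threshold for counting a lower unit as part of the effective receptive field (0.05), and the function names are all hypothetical choices made for this example. The one feature taken from the text is that the weights on the lower units sum to one, so raising the gain on two units necessarily lowers the relative weight of the other eight.

```python
def normalise(gains):
    """Turn raw gains into weights that sum to one."""
    total = sum(gains)
    return [g / total for g in gains]


def effective_field(weights, threshold=0.05):
    """Indices of lower units that contribute at least `threshold` of the error.

    The threshold is an arbitrary, hypothetical cut-off used only to make
    the notion of an effective receptive field explicit.
    """
    return [i for i, w in enumerate(weights) if w >= threshold]


n_units = 10

# Broad receptive field: equal error weighting from all ten lower units.
baseline_gain = [1.0] * n_units
baseline_weights = normalise(baseline_gain)

# Attention raises the gain on lower units one and two (indices 0 and 1).
# The boost of 16 is an assumed value chosen for illustration.
attended_gain = list(baseline_gain)
attended_gain[0] = 16.0
attended_gain[1] = 16.0
attended_weights = normalise(attended_gain)

print("baseline field:", effective_field(baseline_weights))
print("attended field:", effective_field(attended_weights))
```

With these assumed numbers, each unit carries a weight of 0.1 before attention. After attention, the two attended units carry 0.4 each and the remaining eight carry 0.025 each, so the effective field contracts from ten lower units to two without any change to the underlying connectivity.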

Here a more specific point can be made about Block and Siegel’s argument. The predictive processing account of attention can potentially offer a distinctive explanation of rather fine-grained attentional findings. There is also reason to think that this explanation has more promise than existing theories. This is because the existing theories help themselves to the notion of ‘representational nodes’ whereas the free energy principle explains what these are, what they do, and how they connect with other nodes. Moreover, the prediction error account can account very elegantly for key receptive field properties (Rao & Ballard 1999; Harrison et al. 2007).

This seems to be a good example of the situation outlined earlier with respect to the contest between the free energy principle and existing theories. The free energy principle can explain more types of evidence, under a more unificatory framework, and this immediately begins to undermine existing theories. Specifically, the theory that gives prediction error no role in receptive field modulation, and that locates activation only in representation nodes, is explained away, even if it has significant evidence in its favour.

Underlying this story, there are some larger issues in the philosophy of science. One issue concerns the role of unification in explanation (Kitcher 1989). This is the idea that there are explanatory dividends in explanations that unify a variety of different phenomena under one theory. Obviously the free energy principle is a strong, ambitious unifier (perception, action, and attention all fall under the principle). Whereas there is discussion about whether this in itself adds to its explanatory ability as such, the ability to unify with other areas of evidence is part of what makes an explanation better than others. Noting this aspect of the free energy principle therefore supports it, in an inference to the best explanation (Lipton 2004, 2007). Confronted with a piecemeal explanation of a phenomenon and a unificatory explanation of the same phenomenon, the inference to the latter is stronger. There may be some difficult assessments concerning which explanation best deals with the available evidence. In the case discussed above, the free energy principle can explain less of the attention-specific evidence than the piecemeal explanation, but on the other hand it can explain more kinds of evidence, it provides explanatory tools that are better motivated (roles of representation and error units), and it offers a more unifying account overall.

A second issue from the philosophy of science, in particular concerning inference to the best explanation, is the fecundity of an explanation, which is regarded as a best-maker. The better an explanation is at generating new predictions and ways of asking research questions, the stronger is the inference in its favour. Whereas this is not on its own a decider, it is an important contributor to the comparison of explanatory frameworks. Block and Siegel also seem to suggest that the predictive framework has nothing new to offer, or at least very little compared to existing (piecemeal) theories. Their example of a piecemeal theory is Carrasco’s impressive work on attention, which has proven extraordinarily fecund, leading to a series of discoveries about attention. Assessing which theory is the more fecund is difficult, however, and involves considerations of unification. The free energy principle, as described above, does not posit any fundamental difference between perception and action. Both fall out of different re-organisations of the principle and come about mainly as different directions of fit for prediction error minimization (Hohwy 2013, 2014). This means that optimization of expected precisions, and thereby attention, must be central to action as well as to perception. This provides a whole new (and thus fecund) source of research questions for the area of action, brought about by viewing it as an attentional phenomenon. Important modeling work has been done in this regard (Feldman & Friston 2010), age-old questions (such as our inability to tickle ourselves) have been re-assessed (Brown et al. 2013), and new evidence concerning self-tickle has been amassed (Van Doorn et al. 2014). Theoretically, this has led to the intriguing idea that action occurs when attention is withdrawn from current proprioceptive input (described above). This idea points to a fully integrated view of attention, where attention is ubiquitous in brain function (with deep connections to consciousness, Hohwy 2012).

There is thus fecundity on both sides of this debate. It is difficult to conclusively adjudicate which side is more fecund, in part because the new research questions are in different areas and with different theoretical impact. It is surprising to be told that too much attention can undermine acuity—which is an example from Block and Siegel—but it is also surprising to be told that action is an attentional phenomenon.

The third issue from the philosophy of science concerns theory subsumption. It would be very odd if the explanations associated with the free energy principle (e.g., that attention is optimization of expected precision) completely contradicted all existing, more piecemeal explanations of attention. It should be expected that explanations of attention have some overlap with each other, as they are explaining away overlapping bodies of evidence. Indeed, the free energy explanation seems to subsume elements of biased competition theories of attention, as well as elements of Carrasco’s theory, as seen above. This raises the question of to what extent a new theory, like the free energy principle’s account of attention, really contributes a new and better understanding, especially if it carries within it elements of older theories. One way to go about this question again appeals to inference to the best explanation. The new and the old theories overlap in some respects, but they differ in respect of further elements of unification, theoretical motivation, broadness, fecundity, and so on. It can be difficult to come up with a scheme for precise assessment of these features, but it seems not unreasonable to say that the free energy principle performs best on at least those further elements of what makes explanations best.

At this stage it is tempting to apply the free energy principle to itself. This is an apt move since the idea of the hypothesis-testing brain arose in comparison with scientific practice (Helmholtz 1867; Gregory 1980). On this view, the point of a good scientific theory is to minimize prediction error as well as possible, on average and in the long run. This imputes an overall weighting of all the very same elements to science as we have ascribed to the brain above: revise theories in the light of evidence, control for confounds by making experimental manipulations, be guided by where highly precise evidence is expected to be found, adopt simple theories that diverge minimally from old theories, and let theories have a hierarchical structure such that they can persist in the face of non-linearities (due to causal interactions) in the evidence. All of these considerations speak in favour of the free energy principle over piecemeal, existing theories. By absorbing and revising older theories under the hierarchically imposed scientific “hyperparameter” of the free energy principle, it seems a very reasonable weighting of all these aspects can be achieved. For example, aspects of Carrasco’s theory are subsumed, but under revised accounts of its notions of the functional role of representation nodes; due to the hierarchical aspect it is able to account for evidence arising under attentional approaches to action; in addition, this subsumption may be fecund, since we could expect it to lead to new findings in action (for example, a prediction that there will be attentional enhancement in the sensorimotor domain, leading to “illusory action”).