Social cognition refers to the cognitive processes, perceptions, and subjective experiences related to interaction with conspecifics. This section asks: What are the brain mechanisms underlying pre-reflective aspects of social cognition? Are there alternative theories and empirical evidence that counter the primary role of motor resonance?
Gallese and Cuccio propose that social cognition mainly depends on embodied simulation (ES), which is based on motor resonance and the processing of mirror neurons (see citations in Gallese & Cuccio this collection). Mirror neurons were initially discovered in fronto-parietal networks of the macaque monkey brain. They are a class of visuomotor neurons involved in planning and executing hand actions and were found to be activated both when the monkey executed a specific grasping or reaching action and when the monkey passively observed somebody performing similar actions (Gallese et al. 1996; Rizzolatti et al. 1996). Neuroimaging studies in humans also showed mirror neuron-like activation patterns at the level of populations of neurons in distinct brain regions: mainly the ventral premotor cortex (vPM) and the intraparietal sulcus (IPS), but also the insular cortex and the secondary somatosensory cortex (Rizzolatti & Sinigaglia 2010; see also figure 1a gray dots). ES proposes that, based on mirror neurons, the brain maps observed actions into an action space, that is, into motor potentialities within our hierarchically-organized motor system, and thereby infers and predicts the action goals of the observed individual. In this way it penetrates the state of mind of the other, and thus links self and other in a pre-reflective, empathic fashion (Gallese & Cuccio this collection, p. 7).
I would like to point out that motor resonance, i.e., the mapping of observed actions into motor potentialities, necessarily depends on multisensory spatial coding. I argue that this is the case because of four points: First, the brain has access to the physical world only through the different sensory receptors of the body that bombard it with exteroceptive (e.g., vision, audition), proprioceptive (somatosensory, vestibular), and interoceptive (somatosensory, visceral) signals. Second, these multisensory signals must be integrated according to their spatial and temporal parameters (Stein & Stanford 2008) to inform neural representations of the states of the body and of the world around us, including the agents whose actions are subject to motor resonance. Third, the observed movements of these agents are coded in coordinates distinct from the egocentric spatial frame of reference upon which our motor system operates. Fourth, the brain must necessarily perform spatial transformations of the movements observed in the other agent into the egocentric frame of reference, upon which motor resonance can operate. In sum, multisensory spatial coding is a prerequisite of motor resonance.
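The kind of spatial transformation named in the fourth point can be made concrete with a deliberately simplified sketch. It assumes a flat two-dimensional world, a known observer position, and a known heading; the function name `to_egocentric` and its parameters are hypothetical and chosen for illustration only. It is not a model of how the brain performs this computation.

```python
import math

def to_egocentric(point, observer_pos, observer_heading):
    """Re-express a world-frame 2D point in an observer-centered frame.

    Assumptions (illustrative only): `point` and `observer_pos` are (x, y)
    tuples in a shared world frame, and `observer_heading` is the direction
    the observer faces, in radians, measured counterclockwise from the
    world x axis. The result is (ahead, left): distance straight ahead of
    the observer and distance to the observer's left.
    """
    # Translate so that the observer sits at the origin.
    dx = point[0] - observer_pos[0]
    dy = point[1] - observer_pos[1]
    # Rotate by minus the heading so that "ahead" becomes the first axis.
    c, s = math.cos(observer_heading), math.sin(observer_heading)
    ahead = c * dx + s * dy
    left = -s * dx + c * dy
    return ahead, left

# Observer at (1, 1) facing along the world y axis; an observed hand at (1, 3).
hand_ahead, hand_left = to_egocentric((1.0, 3.0), (1.0, 1.0), math.pi / 2)
```

Even in this toy case each observed point requires a translation and a rotation; a full description of a complex motor act would require such remapping for many body parts over time.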
According to Gallese and Cuccio, the outcomes of such multisensory spatial coding are readily available to the brain network of ES through the vPM, which is “anatomically connected to visual and somatosensory areas in the posterior parietal cortex and to frontal motor areas” (Gallese & Cuccio this collection, p. 10). However, the multisensory spatial coding required for a precise description of complex motor acts might be computationally costly. Might there be a computationally more efficient alternative by which multisensory spatial coding is used to decode the intentions of observed agents?
The attention schema (AS) theory of awareness (Graziano 2013; Graziano & Kastner 2011) proposes that brain mechanisms related to attention and spatial coding, which are distinct from neural processing relevant to ES, primarily underlie pre-reflective aspects of social cognition. Graziano and Kastner define attention as an information-handling mechanism of the brain that serves to give priority to some information (e.g., representational features) out of several equally probable alternatives that are in constant competition for awareness. Furthermore, awareness is defined as the process of consciously experiencing something, that is, the process of relating the subject (i.e., a phenomenal self, see also Metzinger 2003) to the object/content of experience. Graziano and Kastner summarize AS as follows:
[Awareness is information and] depends on some system in the brain that must have computed [it] […]; otherwise, the information would be unavailable for report. […] People routinely compute the state of awareness of other people [and] the awareness we attribute to another person is our reconstruction of that person’s attention. […] The same machinery that computes socially relevant information […] also computes […] information about our own awareness. […] Awareness is […] a perceptual model […] a rich informational model that includes, among other computed properties, a spatial structure. […] Through the use of the social perceptual machinery, we assign the property of awareness to a location within ourselves. (Graziano & Kastner 2011, pp. 98–99)
Related to social cognition, AS proposes that by using a schematic representation of the state of attention of other individuals—including a prediction of the spatial location of their focus of attention—we predict the current state of awareness of the individual, which is informative about their intentions and potential future actions. In short: Awareness of others is an attention schema. As compared to ES, AS is a relatively recent theory that requires extensive empirical studies. Yet the evidence so far shows that indeed the brain has a neural circuitry for monitoring the spatial configuration of one’s own attention independent of the sensory modality (Downar et al. 2000), including the direction of gaze (Beck & Kastner 2009; Desimone & Duncan 1995). These structures are the proposed neural expert system upon which AS is based and consist of the right-hemispheric temporo-parietal junction (TPJ) and superior temporal sulcus (STS) (see figure 1a in black). Notably, this expert system relevant to AS shows little anatomical overlap with the neural structures relevant to ES (figure 1a compare black with gray).
Because AS relies on coding of the spatial relationship between the location of the observed individual and the likely spatial location of this individual’s attention (i.e., independent of a particular sensory modality), the required spatial computations seem simple and straightforward. They require two points, i.e., the individual as a reference point and the potential spatial location of the attention of that individual. According to AS, using such spatial labeling the brain is able to track the aware and attending minds of several individuals simultaneously. Thus, spatial coding in the context of AS appears to be less complex and less computationally demanding than the spatial transformations underlying ES (see above).
Which of these seemingly distinct brain mechanisms proposed by AS and ES more plausibly underlies social cognition: the neural expert system decoding the state of attention according to AS or the mirror mechanism system decoding observed motor plans according to ES? It has been proposed that AS and ES may in principle work together. Graziano and Kastner propose that the expert system of AS may take a leading role by formulating a hypothesis about the state of awareness of an individual that is likely to drive further behavior and therefore provide a set of predictions based upon which motor resonance could more efficiently perform simulations (Graziano & Kastner 2011). Motor resonance would thus add richer detail to the state-of-attention hypothesis made by the expert system.
This combined mechanism is compatible with the predictive processing principle (Clark this collection; Hohwy 2013, this collection), which has been proposed to be relevant to the bodily self (Apps & Tsakiris 2013; Limanowski & Blankenburg 2013; Seth this collection). According to predictive processing, the brain constantly predicts the potential causes of sensory input and minimizes prediction errors either by updating the predicted causes or by acting in a way that changes sensory input (Friston 2005). Applying the predictive processing principle to Graziano and Kastner’s proposal that AS is a hypothesis-generating tool to which ES adds further detail, one could conceive of both mechanisms as different predictive processing modules aimed at anticipating the state of awareness and the intentional actions observed in others. Although no empirical study so far has addressed this specific hypothesis, a recent functional magnetic resonance imaging study found that predictive processing principles accounted for the blood oxygen-level dependent activity related to the perception of faces, which is an important perceptual function for social cognition in the human species (Apps & Tsakiris 2013).
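The first of the two routes to prediction error minimization, updating the predicted cause, can be illustrated with a toy sketch. It assumes a single scalar prediction, a fixed observation, and a hypothetical `learning_rate` parameter; it is a generic error-correction loop chosen for illustration, not the formal scheme of Friston (2005), and it leaves out the second route (acting to change the sensory input).

```python
def update_prediction(predicted, observed, learning_rate=0.1, steps=50):
    """Iteratively move a scalar prediction toward a fixed observation.

    Illustrative assumptions: one scalar cause, noise-free input, and a
    constant learning rate. Returns the final prediction and the absolute
    prediction error recorded at each step.
    """
    errors = []
    for _ in range(steps):
        error = observed - predicted        # prediction error
        errors.append(abs(error))
        predicted += learning_rate * error  # revise the predicted cause
    return predicted, errors

# Start from an initial hypothesis of 0.0 about an input whose value is 1.0.
final_prediction, errors = update_prediction(0.0, 1.0)
```

With these settings the recorded error shrinks at every step, which is the sense in which an initial hypothesis (here standing in for the state-of-attention hypothesis of AS) could be progressively refined by further processing.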
These common and distinct predictions based on ES, AS, and predictive processing call for empirical research aimed at providing evidence to further refine, integrate, or reject them.
Figure 1: Summary of cortical brain regions involved in social cognition, the bodily self, and vestibular processing. (a) Whereas for social cognition there is little overlap between the brain regions proposed to be relevant for the attention schema (in black) and embodied simulation (in gray), both sets of brain regions overlap with (b) the brain network of the bodily self as identified by full-body illusion experiments manipulating self-location and first-person perspective (in black) and the body-swap illusion manipulating mainly body ownership (in gray). (c) The human vestibular cortical regions (in black) are widely distributed and overlap with several regions relevant to both the bodily self and social cognition. (The images are derived from images by NASA, licensed under Creative Commons.)