5 1-3sE – Levels of social embodiment

In this section, I will introduce an alternative framework in which I describe different processing stages of social understanding as different levels of social embodiment. Before I go into detail about how to apply 1-3E to social understanding, let me motivate my strategy here. I have already pointed out why MV yields an attractive theoretical assumption for research on social cognition. It allows, to briefly repeat, the integration of different aspects of a manifold phenomenon and thus aims at a comprehensive perspective that can encompass the field's sub-areas of interest and research. The advantage of couching MV in 1-3E is that its hierarchical nature affords this integration at different levels of description, while operating on a set of coherent background assumptions. As a grounding theory, it suggests how different levels of analysis relate and at the very least has the potential to assign an important role to aspects that lie outside an individual brain. As such it can also do justice to demands from the interactive turn, viz. the consideration of interaction dynamics and their possible role for social cognition as well as taking the phenomenology of social encounters seriously. However, MV suffers from the problem of metaphysical incompatibility. 1-3E, on the other hand, is a representational account that offers a metaphysically sound ground for a manifold phenomenon. My goal is to scaffold a framework for human social cognition, which, as I will argue, can be described as a case of 3E in non-pathological human individuals.

I will now give a brief overview of my proposed three-level model of social understanding, which I dub “1-3sE” (first-order social embodiment, second-order social embodiment, third-order social embodiment)[12], before I go into detail about what each level amounts to. As in the original version of the framework, levels of social embodiment represent both levels within a system and different kinds of systems. I thus assume that each social third-order system possesses first- and second-order social embodiment, too. In this commentary, I will focus on describing levels of embodiment within social systems, since this aspect of the framework is of greater importance for a pluralistic view of social cognition.

As previously mentioned, I take it that 1sE fulfills a twofold function: First, it serves as the implementational level of description, showing which physical parts ground higher-level, representational and phenomenal processes. Second, low-level sensorimotor mechanisms subserve basic social interactions (e.g., coupling or synchronization). 2sE involves the instantiation of a model which pre-reflexively represents features of the body. It is assumed that parts of this body model can be shared and thus functionally underlie social cognitive processes that may well operate at the unconscious level, such as imitation, joint attention and action understanding. Finally, 3sE describes cases of consciously experienced social understanding. I claim that there are various kinds of phenomenal experiences in social situations that can be differentiated by applying the concepts of transparency and opacity. Since I consider opaque social mental states to exhibit a very special kind of experience, which is not only rare, but might also entail an additional level of representation, I introduce an extra level: 3sE+. I will now describe the specific levels and their relation in more detail, before I show how my view overcomes the shortcomings of MV.

5.1 Third-order social embodiment (3sE)

Individuals that phenomenally represent themselves as social individuals can be described as social 3E systems (3sE). There are certainly many different ways in which humans experience themselves as being social, but I will focus on those that are mentioned by Newen: DP, personal-level simulation, and explicit theoretical inference.

The concepts of transparency and opacity allow a more fine-grained distinction between different phenomenal experiences of social encounters, as they offer a way to emphasize the similarities and differences between various phenomenal qualities in social situations. DP describes the experience of understanding another person without being aware of any intermediary steps. Importantly, as Zahavi points out, the perceived directness still holds in cases of “unsuccessful” social understanding, such as deception or misunderstandings (cf. 2011, pp. 548–549). Although I can get what you say completely wrong, for example, I would still experience myself as immediately understanding what you are saying.[13] Since, as I have discussed earlier, the experiential nature of a mental state is not to be equated with its epistemic complexity, we can assume that DP operates on several subpersonal mechanisms. These are, however, not explicitly represented. Hence it makes sense to describe DP as resulting from transparent social cognitive states. By doing so, it is possible to keep its phenomenal status as immediate and direct, while not equating this quality with its epistemic status. In contrast, theorizing and personal-level simulation have a quite different phenomenal character. In these cases, the process of constructing a specific insight about the other is part of the experience, be it by explicitly simulating the person (e.g., “If I were her, what would make me excited about having a cat?”) or by theoretical inference (e.g., “People usually own cats to feel less alone, maybe she is excited to have a furry companion now”). They can thus be said to result from opaque social cognitive states. What distinguishes transparent from opaque states is the degree to which one’s own social cognitive processing, which is directed at the other person, is explicitly represented as a process.

However, as already mentioned, I see the need to modify Metzinger’s conception of 3E in order to reflect a proper distinction between transparent and opaque social states. I claim that opaque states exhibit an additional level of representation, since the representational process itself is part of the phenomenal experience. In order to emphasize that this is a special and probably rare phenomenon, I introduce the level of “3sE+”. Both transparent and opaque social states are certainly to be located at the third level of embodiment, since they possess phenomenal properties. Metzinger suggests that the distinctive feature of 3E in contrast to lower levels is that it enables the system to identify itself with its body (cf. Metzinger 2014, p. 274). The resulting phenomenal properties of self-identification and selfhood stem from the experienced immediacy that comes with transparency (cf. ibid., p. 273). If this is the case, it can be assumed that phenomenal states are not either transparent or opaque, but that transparency is part of any phenomenal state. The degree to which the representational process is explicitly represented varies; transparency and opacity thus come in degrees (cf. Metzinger 2003, p. 358). Additionally, there may well be a constant oscillation between transparency and opacity, depending – for example – on specific contexts and situations. However, opacity and the experiences it gives rise to seem to be higher-level features that can be found only in a small subgroup of species. This is evident in the case of social understanding, since full-fledged theoretical inference and high-level simulation are unlikely to be found in most non-human animals or in human infants. It seems that in the case of opaque states there is an additional level of representation that requires a higher degree of sophistication, and this should be made more explicit in the hierarchical framework. Transparent and opaque mental states – at least in the case of social understanding – reflect two different kinds of phenomenal experiences that might also have different underlying mechanisms. In order to do justice to this difference, I thus introduce an additional level of 3sE, namely 3sE+. 3sE+ describes those phenomenal states during which one is aware of the constructing process, and which occur in situations that require this kind of reasoning in order to disambiguate the input. This additional distinction at the level of 3sE enables a more detailed view and underlines the difference between transparency and opacity.

One question that arises at this point is the following. We have assumed that opacity means phenomenally representing (parts of) the actual process of representation. Does that mean that in the case of theorizing and simulation one would find the underlying representational processes to be subpersonal kinds of theoretical inference and simulation? Two points speak against this assumption. First, there are justified worries that the conception of implicit theorizing as an unconscious process stretches the concept of a theory too far (e.g., Blackburn 1992). These arguments against TT have been presented extensively in the literature and I will thus not repeat them here. Second, in the case of simulation, it seems that subpersonal or low-level simulation does not necessarily generate the phenomenal experience of simulating. Consider the many studies that have been conducted to explore whether the activity of the mirror neuron system can be seen as a kind of implicit simulation that enables social understanding (for a review, see for example Cattaneo & Rizzolatti 2009). In most of the experiments in which mirror neuron activity was found to be correlated with social understanding, the phenomenal experience seems to have the character of DP rather than explicit simulation.[14] Such a view, as I hope to have shown, has two advantages: it describes different kinds of phenomenal experiences in social encounters, and it distinguishes them by means of the concepts of transparency and opacity.

5.2 Second-order social embodiment (2sE)

Assuming that there is something like a representational body model, we can now ask which parts of it can be exploited for social cognition. In order to do so, let me briefly recapitulate how to conceive of this body model. It has been described as a “grounded, predictive body model that continuously filters data in accordance with geometrical, kinematic and dynamic boundary conditions” (Metzinger 2014, p. 273). Furthermore, Metzinger predicts that parts of this model can be shared by individuals: “[…] on a certain level of functional granularity, this type of core representation [i.e., the body model] might also describe the generic, universal geometry which is shared by all members of a biological species” (ibid., p. 273; see also Schilling & Cruse 2012). Together with Gallese he argues elsewhere that the mirror neuron system plays a crucial role in generating a basis for both an “internal model of reality” and a “shared action ontology” (Metzinger & Gallese 2003, p. 550). This means, as I take it, that the body model contains information that represents one’s own body but is not completely self-specific. To see this, consider that in order to be shareable, representations must not be so specific that they fail to generalize to the bodies of others. I will come back to this point soon. This consequence worries Newen, leading him to reject the view that mirror neurons form a basis for social cognition:

Why are mirror neurons not an essential part of understanding others? They represent a type of action or emotion that is independent from a first- or third-person perspective; but the distinction between self and other is an essential part of understanding others (this collection, p. 4).

This raises the question of what exactly it is that can be shared by individuals. Since these considerations are central to the possibility of exploiting the body model for social understanding, I now aim to refute the worry and give a possible answer to the question.

Mirror neurons were discovered in the premotor cortex of macaque monkeys more than 20 years ago. As is famously known, they fire both when an individual executes an action and when it observes another individual performing that action (Gallese et al. 1996; Rizzolatti et al. 1996; Rizzolatti & Craighero 2004). Although there is considerable controversy about their existence in humans (Hickok 2009), their actual function (Jacob 2008), and their explanatory power (Borg 2007), they are considered by many researchers to form one of the crucial systems for understanding others (e.g., Stanley & Adolphs 2013, p. 512). Mirror neurons are indeed neutral with respect to the agent of an action – they fire whether an action is executed by oneself or by another person. To this extent, critics are right to say that it is not obvious how they could provide the important distinction between self and other. However, this line of thinking leaves out two important facts. First, it has been suggested that there are inhibition mechanisms that “control” shared representations and provide the basis for a self-other distinction (for a more detailed discussion, see Brass et al. 2009). Second, mirror neurons have always been presented as being embedded in a system (hence mirror neuron system, e.g., Cattaneo & Rizzolatti 2009; Iacoboni & Dapretto 2006; Rizzolatti & Craighero 2004). This system consists of areas which contain mirror neurons, but also of regions whose neurons lack these bimodal properties and encode only self-generated actions, as described by Jeannerod & Pacherie (cf. 2004, pp. 131–132).[15] Thus, it is correct that mirror neurons alone do not distinguish between self and other. But this is a rather impoverished view, since they should never be considered in isolation. A similar thought, which helps to refute the worry, is given by De Vignemont, who adopts the view that mirroring can be seen as the sharing of body representations (2014a). She argues that shared body representations do not threaten the self-other distinction because body representations always also contain information that is too self-specific to be shared. They are, in her words, “[…] Janus-faced. They face inward as representations of one’s body and they face outward as representations of other people’s bodies” (De Vignemont 2014b, p. 135).

A closer look at her conception also yields a possible answer to the question of what it is that can be shared with others. De Vignemont argues that it must be a rather coarse-grained representation of one’s body, since bodies differ considerably in many respects such as size, gender, and posture. This representation, which De Vignemont dubs the “body map” (De Vignemont 2014a, p. 289, 2014b, p. 134), contains information about the basic configuration of body parts and thus serves as a functional tool for localizing bodily experiences. Irrespective of individual differences in this map, some of its content is coarse-grained enough that humans are still able to imitate others or to experience vicarious bodily sensations, both of which have been claimed to draw on shared body representations. In other words, what can be shared is that part of the body map whose content is general enough to apply to all kinds of bodies, no matter their differences.

Although this is surely no exhaustive inquiry into the matter, these thoughts provide an idea of how 2sE can be viewed as enabling social cognition: at the representational level, there are parts of the body model which can be shared with others.[16] These parts, however, have to be embedded in a system that also contains self-specific information. Otherwise it would be impossible to attribute an action, experience, or observation to the agent concerned. It now becomes obvious why I claimed earlier that a self-other distinction does not require a phenomenal representation of one’s body. The unconscious body model and its shared parts seem well equipped to provide such a distinction and thus make unconscious social processes such as mimicry and involuntary imitation possible.
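As a purely illustrative aside, the division of labor just described can be pictured in a toy sketch: a coarse-grained, shareable map of body-part configuration, embedded in a system that additionally carries self-specific information (here, a simple agency flag). This is not De Vignemont’s or Metzinger’s model; the names, the dictionary structure, and the agency flag are my own illustrative assumptions.

```python
from dataclasses import dataclass

# Coarse-grained "body map": a generic part-configuration that applies to
# any human body, with no metric detail such as size or posture.
SHARED_BODY_MAP = {
    "left_hand": "left_arm", "left_arm": "torso",
    "right_hand": "right_arm", "right_arm": "torso",
    "head": "torso",
}

@dataclass
class BodilyEvent:
    body_part: str       # locatable via the shared, coarse-grained map
    agent_is_self: bool  # self-specific information, not part of the shared map

def localize(event: BodilyEvent) -> str:
    """Locate the event on a generic body, using only the shared map."""
    return f"{event.body_part} (attached to {SHARED_BODY_MAP[event.body_part]})"

def attribute(event: BodilyEvent) -> str:
    """Self-other attribution draws on the embedding system's
    self-specific signal, not on the shared map itself."""
    owner = "my own" if event.agent_is_self else "the other's"
    return f"touch on {owner} {localize(event)}"

print(attribute(BodilyEvent("left_hand", agent_is_self=True)))
print(attribute(BodilyEvent("left_hand", agent_is_self=False)))
```

The sketch merely makes vivid that one and the same coarse-grained map can serve for one’s own body and for an observed body, while attribution requires the additional, non-shared information in which the map is embedded.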

5.3 First-order social embodiment (1sE)

Although interaction is certainly one of the topics that have been least explored by researchers of social cognition, it should nevertheless be considered carefully by any theory that aims to provide a comprehensive view of social understanding. Including interaction is particularly challenging, since most attempts to do so have come from proponents of an enactive perspective on the mind. However, I have argued that a pluralistic model of social cognition cannot simply combine enactive claims with cognitive ones (see section 3 “Multiplicity needs coherence”). What is needed is an approach to social understanding that integrates interaction as a phenomenon that most probably does not need explicit, high-level representation. 1sE offers a way to describe such low-level social processes. Knoblich and Sebanz, for example, review several cases of “social coupling”. Individuals tend to synchronize their movements if they are sitting next to each other in rocking chairs (cf. Knoblich & Sebanz 2008, p. 2022), a process which can plausibly be described without representation. This sort of “entrainment” (ibid., p. 2023) is a case of coupling during which individuals influence each other’s behavior without consciously intending to do so. There are also cases in the animal kingdom that can be described at the level of 1sE, such as the formation and synchronization of firefly swarms (Suda et al. 2006).
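To make the idea of representation-free coupling concrete, the following minimal sketch simulates two Kuramoto-style phase oscillators – a standard textbook model of entrainment, not the model used in the studies cited above. Each oscillator merely adjusts its phase as a function of the other’s current phase, yet the two phase-lock, much as rocking chairs or flashing fireflies fall into step; all parameter values are illustrative assumptions.

```python
import math

def simulate_entrainment(freq_a=1.00, freq_b=1.15, coupling=0.5,
                         dt=0.01, steps=5000):
    """Two Kuramoto-style phase oscillators with slightly different natural
    frequencies. Each one only 'feels' the other's current phase; nothing is
    represented, inferred, or intended."""
    phase_a, phase_b = 0.0, math.pi / 2  # start well out of phase
    for _ in range(steps):
        # each oscillator nudges its phase toward the other's
        da = freq_a + coupling * math.sin(phase_b - phase_a)
        db = freq_b + coupling * math.sin(phase_a - phase_b)
        phase_a += da * dt
        phase_b += db * dt
    # after phase-locking, the phase difference settles at a small constant
    diff = (phase_b - phase_a) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)

if __name__ == "__main__":
    print(f"residual phase difference: {simulate_entrainment():.3f} rad")
```

The point is only that stable coordination can emerge from local coupling dynamics alone; this is the sense in which entrainment can be located at the level of 1sE without invoking representation.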

The next step is to depict the implementation of the specifically “social parts” of the body model. What physically grounds them is described at the level of 1sE. One buzzword in the research field of social cognitive neuroscience is “the social brain” (e.g., Dunbar 1998; Gazzaniga 1985). This term refers to all the different brain areas that have been found to be correlated with cognitive processing in social situations, including, of course, the mirror neuron system. While the investigation of brain regions and their functions for social cognition is a well-established endeavor, it is more interesting here to look at other possible ways of implementing social cognition. The role of interaction for social cognition, for example, has been hotly disputed in the research field. As I have illustrated earlier, some claim that interaction dynamics could constitute social cognitive mechanisms (De Jaegher et al. 2010). However, such a view is only sustainable within a radically enactive set of assumptions and as such is not an option for the framework I am suggesting here. What should be considered, though, is whether being in an interaction is necessary for some social cognitive states. Recent studies suggest that activation patterns differ depending on the situational context and the degree of emotional engagement in a social situation (Schilbach et al. 2013). These results point to this possibility, but more careful investigation is needed to determine whether they justify the claim that interaction in any way physically grounds or enables social cognition.

Such basic and non-representational forms of social understanding have long been neglected by the research field and are in need of more empirical and philosophical investigation. Research on joint action and coupled systems is therefore especially important for fleshing out 1sE.