Observing Others: Multiple Action Representation in the Frontal Lobe


Science  14 Oct 2005:
Vol. 310, Issue 5746, pp. 332-336
DOI: 10.1126/science.1115593


Observation of actions performed by others activates monkey ventral premotor cortex, where action meaning, but not object identity, is coded. In a functional MRI (fMRI) study, we investigated whether other monkey frontal areas respond to actions performed by others. Observation of a hand grasping objects activated four frontal areas: rostral F5 and areas 45B, 45A, and 46. Observation of an individual grasping an object also activated caudal F5, which indicates different degrees of action abstraction in F5. Observation of shapes activated area 45, but not premotor F5. Convergence of object and action information in area 45 may be important for full comprehension of actions.

Understanding actions performed by others is a fundamental social ability. There is now wide consensus that the activation of the motor system is a necessary requisite for this ability. A mere visual representation, without involvement of the motor system, provides a description of the visible aspects of the movements of the agent, but does not give information critical for understanding action semantics, i.e., what the action is about, what its goal is, and how it is related to other actions (1, 2). Action information, however, without knowledge about the identity of the object acted upon, is not sufficient to provide a full understanding of the observed action. Only when information about the object identity is added to the semantic information about the action can the actions of other individuals be completely understood (3).

The functional properties of a set of neurons in monkey ventral premotor cortex (area F5) provide evidence for the involvement of the motor system in action understanding. These “mirror” neurons discharge both when the individual performs an action and when the individual observes another person performing the same action (4, 5). They therefore match the observed action with its internal motor representation. F5 neurons responding to the observation of grasping respond equally well when a piece of food or a solid object of similar size and shape is being grasped. The object's identity appears to be ignored in F5 (4, 5).

We used fMRI in five awake monkeys (M1, M3 to M6) (6–9) to test how actions performed by others are represented in the monkey frontal lobe. In experiment 1, we intended to localize the frontal lobe regions involved in action observation. Monkeys saw video clips showing a full view of a person grasping an object (“acting person”), or an isolated hand grasping objects (“hand action”) and static single frames or scrambled videos as controls. The acting person movies approximate the visual stimulation used in F5 single-cell studies (4, 5) and provide context information that is lacking in the hand action movies, which have been used in most human imaging studies of action observation (10). The aim of experiment 2 was to identify areas combining shape sensitivity with sensitivity for action observation and to demonstrate selectivity for action observation as opposed to mere sensitivity for motion. The main stimuli in this experiment consisted of shapes, hand actions, and stationary and moving objects. In experiment 3, we examined whether the activation by action observation requires human effectors and the presence of an object. Monkeys viewed videos showing human and robot hands grasping and human hands performing a grasp with (goal-directed actions) and without an object (mimicked actions), as well as their static controls in two-by-two factorial designs (11).

Both a constrained analysis using anatomically defined regions of interest (ROIs) and the unconstrained statistical parametric mapping (SPM) analysis revealed multiple frontal regions involved in action observation. The unconstrained analysis (Fig. 1; fig. S1) revealed three local maxima in the frontal lobe for the contrast observation of hand actions versus static and scrambled controls. One local maximum (arrow 1, Fig. 1A) appears to include both banks of the arcuate sulcus (inferior branch), and two maxima are located on the cortical convexity. The most medial (arrow 2 in Fig. 1A) of these two activation sites most likely corresponds to area 46, and the lateral one (arrow 3 in Fig. 1A) to area 45A (12, 13).

Fig. 1.

(A) SPMs plotting voxels (colored red to yellow), which were significantly (P < 0.05, corrected for multiple comparisons) more active when monkeys viewed actions of the isolated hand compared with viewing their scrambled counterpart (upper row) or their static control (lower row) in the group of three monkeys (M1, M3, M5) superimposed on sagittal sections through the left hemisphere of M3, at levels ranging from –23 to –16. (B) SPMs plotting voxels (colored red to yellow), which were significantly (P < 0.05 corrected) more active when monkeys (group: M1, M3, M4, M5) viewed intact compared with scrambled images of objects, superimposed on sagittal sections through the left hemisphere of M3, at same levels as in (A). (C) Schematic representation of the four ROIs on four sagittal sections through M3's brain at same levels as in (A) and (B). P, principal sulcus; AI, arcuate sulcus inferior ramus; C, central sulcus; d, dorsal; v, ventral; a, anterior; and p, posterior. In (A), arrows point to local maxima in the arcuate sulcus (arrow 1), area 46 (arrow 2), and area 45A (arrow 3).

The spatially constrained ROI analysis concentrated on the arcuate region, which we characterized architectonically (Fig. 2; fig. S2). The following architectonic fields were distinguished (14): area F5 convexity (F5c), F5 “posterior sector of posterior bank” (F5p), F5 “anterior sector of posterior bank” (F5a), and area 45B (13, 14). This parcellation provided the anatomical basis for defining four ROIs (Fig. 1C) used for statistical analysis.

Fig. 2.

Architectonic subdivisions of ventral premotor and prearcuate cortices. (Left) Low-power microphotograph of a sagittal section of M4 cortex stained for SMI-32 immunoreactivity showing areas F1, F4, F5p, F5a, 45B, and frontal eye field (FEF). The level of the section is indicated by the dashed line on the brain drawing shown in the inset. P, principal sulcus; AI, arcuate sulcus inferior ramus; AS, arcuate sulcus superior ramus; C, central sulcus; and IP, intraparietal sulcus. (Right) Higher magnification views of areas F5c, F5p, F5a, and 45B. F5c is taken from a more lateral section shown in fig. S2.

In experiment 1, observation of hand action, compared with static or scrambled controls, produced significant magnetic resonance (MR) signal increases in F5p, F5a, and 45B, both in the group and single subjects (Fig. 3 left; figs. S3 and S4, table S1). Observation of an acting person (Fig. 3, right; fig. S1) revealed significant signal increases in the same regions. In addition, the latter stimuli activated area F5c, the F5 sector where mirror neurons have been described (4). The interaction between type of action (hand or person) and conditions (action, scramble, or static) was significant in F5c (table S1).

Fig. 3.

Activity profiles of the four anatomically defined ROIs in experiment 1. The MR signal change (as a percentage) compared with fixation baseline (group data, n = 2) is plotted for the observation of the action and the observation of the three control stimuli: static hand selected from the middle of the action sequence (static middle), static hand at the end of sequence (static end), and scrambled video (scramble), for hand action videos in the left column and acting person videos in the right column (indicated by frame inset). Vertical bars indicate standard errors of the mean (SEMs) across functional volumes. Double red asterisks indicate that the action condition differed at the P < 0.001 corrected level from all three control conditions in both temporal and spatial statistics (see methods). In F5c, the acting person condition differed significantly from its three control conditions, but at different levels: P < 0.001 corrected from scrambled, P < 0.05 corrected from static middle, and P < 0.05 uncorrected from static end in the temporal statistic and P < 0.001 corrected for all three controls in the spatial statistics.
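The quantity plotted in these activity profiles, percent MR signal change relative to fixation baseline with SEM across functional volumes, can be illustrated with a minimal sketch. This is not the authors' analysis pipeline (their methods are in the Supporting Online Material); the function name and all numbers below are hypothetical and serve only to show the arithmetic.

```python
import math

def percent_signal_change(condition_volumes, baseline_volumes):
    """Return (mean percent change, SEM) of condition volumes
    relative to the mean of the baseline (fixation) volumes."""
    baseline_mean = sum(baseline_volumes) / len(baseline_volumes)
    # Percent change of each functional volume against the baseline mean.
    changes = [100.0 * (v - baseline_mean) / baseline_mean
               for v in condition_volumes]
    n = len(changes)
    mean_change = sum(changes) / n
    # Sample variance (n - 1 denominator), then standard error of the mean.
    variance = sum((c - mean_change) ** 2 for c in changes) / (n - 1)
    sem = math.sqrt(variance / n)
    return mean_change, sem

# Hypothetical ROI signal values (arbitrary units), not data from the study.
fixation = [100.0, 100.0, 100.0, 100.0]
action = [101.0, 102.0, 103.0]
mean_change, sem = percent_signal_change(action, fixation)
print(f"mean change = {mean_change:.2f}%, SEM = {sem:.2f}")
```

In this sketch the SEM is taken across volumes within one condition, which matches the caption's description of the error bars but ignores temporal autocorrelation between volumes.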

In experiment 2, we compared sensitivity for action observation with that for shape and motion. The observation of hand actions replicated the findings of experiment 1 (Fig. 4, left column; table S2). Viewing static intact shapes (8, 15) versus scrambled shapes activated area 45B, but none of the subdivisions of area F5 (Fig. 4, middle column; table S2). The interaction between ROIs and conditions (intact and scrambled shapes) was significant (14), indicating that, indeed, 45B differed from all three premotor ROIs with respect to shape sensitivity. These static shapes also activated putative area 45A on the cortical convexity (Fig. 1B). Similar results were obtained in the control test with the static and moving shapes (14). Comparison of hand actions and motion of the grasped object with their static counterparts (Fig. 4, right column) (14) revealed a significantly larger difference in activation in both F5a and area 45B for observation of actions than for viewing of moving objects, relative to their respective static controls. Finally, a further control test for motion sensitivity in general showed that F5a and 45B hardly respond to optic flow stimuli, which are known to drive area MT/V5 and its satellites in the superior temporal sulcus (14).

Fig. 4.

Activity profiles of the four ROIs in experiment 2. (Left) The percent MR signal change compared with fixation baseline is plotted for the observation of the hand action and the observation of static hands selected from the middle of the action sequences and scrambled video. (Middle) The percent change in MR activity compared with fixation baseline is plotted for observation of intact and scrambled images of objects (average of gray scale images and drawings). (Right) The MR activity change compared with fixation baseline is plotted for the action observation, observation of graspable object motion, and their static controls. Same conventions as in Fig. 3; n indicates number of monkeys in the different tests (group analysis). Double and single red asterisks, respectively, indicate significance at P < 0.001 corrected (both statistics) and at P < 0.05 corrected (both statistics). It is worth noting that the activation by the static object was significantly larger than that by the static hand in area 45B (P < 0.001 corrected for temporal and P < 0.05 corrected for spatial statistics).

In experiment 3, we attempted to clarify the nature of the action observation signals in the four ROIs. Observation of grasping performed by the robot hand activated both F5a and 45B (fig. S5, left column). The interaction between human and robot actions was significant in F5a, with stronger activation for human than for robot actions compared with their controls. No interaction was found in 45B (fig. S5, left column). Observation of mimicked human actions also activated both F5a and 45B (fig. S5, right column). The interaction between mimicked and goal-directed actions was significant in 45B, but not in F5a. In 45B, the signal was significantly stronger, compared with static controls, during goal-directed action than during mimicked action (fig. S5, right column).
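The interactions reported for experiment 3 follow the usual logic of a two-by-two factorial design: the action-minus-static difference for one stimulus type is compared with the same difference for the other. A minimal sketch of that contrast is given below, assuming the standard difference-of-differences definition; the function name and the signal values are hypothetical and are not taken from the study.

```python
def interaction_contrast(a_action, a_static, b_action, b_static):
    """Difference of differences for a 2 x 2 design:
    (A action - A static) - (B action - B static).
    A positive value means the action effect is larger for stimulus type A."""
    return (a_action - a_static) - (b_action - b_static)

# Hypothetical percent signal changes in one ROI (not data from the study).
human_action, human_static = 1.0, 0.25
robot_action, robot_static = 0.5, 0.25

effect = interaction_contrast(human_action, human_static,
                              robot_action, robot_static)
print(effect)  # action effect for human minus action effect for robot
```

A contrast near zero corresponds to the "no interaction" pattern described for 45B with human versus robot hands, whereas a clearly positive value corresponds to the pattern described for F5a. Whether a given value is significant depends on the variance estimate, which this sketch does not model.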

Single-neuron studies (4, 5) have shown that the observation of others' actions activates neurons in F5c. In the present study, the fMRI technique provided a more complete description of the frontal areas activated by action observation. They include premotor area F5, which appears to house at least two distinct representations (F5c and F5a), and prefrontal areas 45B, 45A, and 46 [(12), for taxonomy (14)]. Also, human fMRI studies have shown that action observation evokes widespread frontal activation, including that of premotor area 6 and of prefrontal areas 44 and 45 (10).

The two premotor representations differ in their properties. F5c is active only when the observer sees an action that includes a view of its agent. The observation of a grasping hand alone is insufficient. Note that the mirror neurons were discovered and subsequently studied by testing them with the experimenter in full view (4, 5, 16). The second F5 action representation (F5a), located in the depth of the arcuate sulcus, appears to code actions in a less context-dependent way. The observation of an isolated arm action is already an effective stimulus for this representation. Similarly, the observation of a mimicked action, which is not effective in activating F5c (4), is as effective in F5a as is the observation of a goal-directed action. Finally, the observation of a grasping action performed by a robot hand, although less effective than the observation of a similar action performed by a human hand, also evokes a significant signal increase in F5a. Thus, the basic essence of grasping appears to be coded here, regardless of three facts: (i) that the coded action is without object, (ii) that there is no view of the action's agent, and even (iii) that the grasping is done by an artificial device.

There are no single-neuron data available on the motor properties of F5a. Yet, considering its architectonic organization and the general properties of premotor areas, it is likely that F5a neurons have motor properties (17). Thus, the F5a action representation could be considered part of the mirror neuron system (18), like the F5c representation, but with more abstract, less context-related or less detailed properties.

Area 45B was among the frontal regions activated by action observation. Unlike all subdivisions of F5, it is activated not only by observation of an action, but also by observation of images of objects. Neurons, named canonical neurons, have previously been described in F5 that respond to real, graspable 3-D objects (19). These neurons are known to play an important role in the visuomotor transformation for grasping, but they do not appear to have any role in objects' identification. This view is consistent with our finding that none of the subdivisions of F5 was activated by the shape stimuli used in the present study (Fig. 4, middle column), with the exception of images of small graspable objects in Fig. 4 (right column, top panel). The small object activation in F5c might represent the signal of the canonical neurons and may indicate how to perform an action with the object. The difference between 45B and F5 in response to object images is consistent with their connection pattern with the posterior cortical areas. Area F5 receives its main visual input from the anterior intraparietal (AIP) area and PF-PFG areas in the inferior parietal lobule (20, 21), where objects are described for pragmatic purposes (22, 23), whereas area 45B is strongly connected with the inferotemporal cortex (24), where objects are described for the purpose of identification or recognition (25). Furthermore, shapes such as those used in the present study were shown to strongly activate inferotemporal cortex, with little or no activation in AIP or PF-PFG (8).

In conclusion, the frontal lobe of the monkey hosts multiple representations of others' actions. The representation located in the caudal part of F5 (14) is context dependent and is activated only when the agent is seen, whereas the representations located in rostral F5 and in the prefrontal lobe code the action as such. Furthermore, the activation of 45B is little influenced by which agent (human hand or robot hand) performs the action, whereas in rostral F5, the observation of human hand action is more effective. Finally, vision of shapes in general, whether images of graspable or not graspable objects, activated 45B, but not area F5.

In humans, areas 44 and 45, the areas considered the homologs of F5a and 45, respectively (13, 14, 26), play a fundamental role in speech. Language typically describes actions in abstract terms. For example, the sentence “a hand grasping an apple” does not specify any characteristics of the hand, of the apple, or of the movements bringing the hand to the object. In contrast, all these aspects are inherent to the visual and visuomotor representations of the same action. Thus, it is plausible that the transition, in the monkey frontal lobe, from context-dependent descriptions in F5c to more abstract descriptions in F5a and 45B represents the ancient prelinguistic basis from which the abstract description of an action, necessary for language, evolved. In turn, the action plus the shape description found in area 45B could be seen as the prelinguistic link between the verb and the object. As Lieberman (27) wrote: “it is there (where action is represented), not in the remote recesses of cognitive machinery, that the specifically linguistic constituents make their first appearance.” Our findings support this contention.

Supporting Online Material

Materials and Methods

SOM Text

Figs. S1 to S7

Tables S1 and S2

Movies S1 to S8

References and Notes
