Report

Neural Mechanisms of Object-Based Attention


Science  25 Apr 2014:
Vol. 344, Issue 6182, pp. 424-427
DOI: 10.1126/science.1247003

House or Face?

The neural mechanisms of spatial attention are well known, unlike those of nonspatial attention. Baldauf and Desimone (p. 424, published online 10 April) combined several technologies to identify a fronto-temporal network in humans that mediates nonspatial object-based attention. The oscillatory interactions within this network show a clear top-down directionality, establishing the inferior-frontal cortex as a key source of nonspatial attentional inputs to the inferior-temporal cortex. Surprisingly, the mechanisms for nonspatial attention are strikingly parallel to the mechanisms of spatial attention.

Abstract

How we attend to objects and their features that cannot be separated by location is not understood. We presented two temporally and spatially overlapping streams of objects, faces versus houses, and used magnetoencephalography and functional magnetic resonance imaging to separate neuronal responses to attended and unattended objects. Attention to faces versus houses enhanced the sensory responses in the fusiform face area (FFA) and parahippocampal place area (PPA), respectively. The increases in sensory responses were accompanied by induced gamma synchrony between the inferior frontal junction, IFJ, and either FFA or PPA, depending on which object was attended. The IFJ appeared to be the driver of the synchrony, as gamma phases were advanced by 20 ms in IFJ compared to FFA or PPA. Thus, the IFJ may direct the flow of visual processing during object-based attention, at least in part through coupled oscillations with specialized areas such as FFA and PPA.

When covertly attending to a location in the periphery, visual processing is biased toward the attended location, and the sources of top-down signals include the frontal eye fields (FEF) (1, 2) and parietal cortex (PC). FEF may modulate visual processing through a combination of firing rates and gamma frequency synchrony with visual cortex (2). For nonspatial attention, the mechanisms of top-down attention are much less clear. When people attend to a feature, such as a particular color (3–5), or to one of several objects at the same location (6–8), activity in the extrastriate areas representing properties of the attended object is enhanced. But where do the attentional biases (9) come from, and how do they enhance object processing when the distractors are not spatially separate?

We used magnetoencephalography (MEG), supplemented by functional magnetic resonance imaging (fMRI) and diffusion tensor imaging, to optimize both spatial and temporal resolution. In the MEG experiment, two spatially overlapping streams of objects (faces and houses) were tagged at different presentation frequencies (1.5 and 2.0 Hz) (Fig. 1, A and B) (5, 10–12). The stimuli went in and out of “phase coherence,” so that they were modulated in visibility over time but did not change in luminance or flash on and off. When subjects were cued to attend to one of the streams and to detect occasional targets within the cued stream, frequency analyses allowed us to identify brain regions that followed the stimulus oscillations.
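The frequency-tagging idea can be illustrated with a toy calculation. The sketch below is not the authors' analysis pipeline: the signal is synthetic, the sampling rate, duration, and amplitudes are made-up values, and `amplitude_at` is a hypothetical helper introduced only for illustration. It shows how the Fourier component at each tag frequency (1.5 and 2.0 Hz) separates the contributions of two overlapping streams.

```python
import cmath
import math

def amplitude_at(signal, fs, freq):
    """Amplitude of the single-frequency Fourier component at `freq` (Hz)."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(signal))
    return 2.0 * abs(acc) / n

fs = 100.0       # hypothetical sampling rate (Hz)
duration = 20.0  # seconds; holds a whole number of cycles at 1.5 and 2.0 Hz
n = int(fs * duration)
t = [k / fs for k in range(n)]

# Toy "attend faces" response: strong 2.0 Hz component (face tag),
# weak 1.5 Hz component (house tag). Amplitudes are illustrative only.
signal = [1.0 * math.sin(2 * math.pi * 2.0 * tk)
          + 0.3 * math.sin(2 * math.pi * 1.5 * tk) for tk in t]

face_amp = amplitude_at(signal, fs, 2.0)   # close to 1.0
house_amp = amplitude_at(signal, fs, 1.5)  # close to 0.3
```

Choosing a window that contains a whole number of cycles of both tags keeps the two components from leaking into each other, which is why the toy duration is 20 s.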

Fig. 1 Stimuli and attention.

(A) Stimuli used in the MEG experiment (see online methods). (B) Sequence of stimuli consisting of an overlay of two streams of objects (faces and houses), fading in and out of a phase-scrambled noise mask at different frequencies (1.5 and 2.0 Hz). Subjects had to attend the cued stream and report occasional 1-back repeats. (C) Fourier-transform of the minimum norm estimate when attending to faces (2.0 Hz) or houses (1.5 Hz, P < 0.05, FDR-corrected).

Using MEG data only (13), the strongest activity evoked by the face tag was in the right fusiform gyrus, whereas the activity evoked by the house tag was more medially in the inferior-temporal cortex (IT) (Fig. 1C; figs. S1 and S2 for individual subjects and alternative source reconstruction approaches). These areas were roughly consistent with the locations of fusiform face area (FFA) and parahippocampal place area (PPA) determined previously in fMRI (14–16). To increase the accuracy of localization in each subject, we added high-resolution fMRI localizers for FFA and PPA (Fig. 2, B and D, and fig. S3A), which were focused at the expected spots (Fig. 2F).

Fig. 2 Measures of attention localization.

(A) Average of the fMRI localizers for attention-related and (B) object-related ROIs (blue: FFA, red: PPA, P < 0.001, FW error-corrected). (C and D) All subjects' individual ROIs superimposed on a standardized brain surface (red: middle frontal gyrus; blue: inferior frontal gyrus; dashed line: BA44) and (E) on an inflated brain. (F) Comparison of the average MEG-based (filled) and fMRI-based (outlines) localization of face- (blue) and house-related (red) activity. (G) Spectral power in the participants' individual ROIs when attending houses (1.5 Hz, red) or faces (2.0 Hz, blue). (H) Lateralization of the attentional effects. (I) Phase lags of the neural activity to the physical stimuli on screen. (J) Systematic phase advancement from FFA/PPA to IFJ.

To identify other areas important for nonspatial attention, we contrasted the brain state when attending to one of the two superimposed object classes with a similarly demanding state that did not require attending to either object class. The attention-related fMRI localizers revealed consistent activation in the inferior frontal junction (IFJ) at the intersection of the inferior-frontal and precentral sulcus (17–19) (Fig. 2, A, C, and E), with weaker and less-consistent signals in posterior-parietal and in inferior-temporal cortex (fig. S3C). A control experiment confirmed that IFJ's activation was indeed related to nonspatial attention, rather than simply memory (fig. S4).

Each subject's individual fMRI localizers were then used as regions of interest (ROIs) to guide the analysis of the MEG signals (see supplementary material for a description of the coregistration of fMRI and MEG). The modulation of sensory responses by attention in the tagging-frequency range is shown in Fig. 2G (fig. S5B for individual subjects). FFA and PPA had the strongest responses, with FFA more responsive to the attended face tag (t test, P < 0.001) and PPA more responsive to the attended house tag (t test, P < 0.01). Thus, object-specific attention modulates the sensory responses in FFA and PPA. Weaker sensory responses were found in region V1.

Although weaker in amplitude, sensory responses were also found in IFJ, and the attention effects were much stronger—there was a tagging frequency response only to the attended object (both t test, P < 0.001). Control regions in the FEF (localized in separate fMRI runs, fig. S3D), PC (localized in the attention-related fMRI experiment in some participants) and the frontal pole (anatomically defined) showed only minor and less consistent responses. The general pattern of results did not depend on the specific tagging frequency assignment to faces or houses (fig. S5).

In temporal cortex, both MEG and fMRI results showed a moderate tendency of lateralization: FFA to the right and PPA to the left hemisphere (Fig. 2H). The attentional effects in IFJ were slightly lateralized to the right.

We used Fourier transformations to extract the phase relation between the frequency-tag response and the stimulus on the screen, i.e., the latency of the sensory responses (Fig. 2I). The phase lag of IFJ (corresponding to 208 ms) was shifted by about 25 ms in comparison to FFA and PPA (188 and 171 ms) (Fig. 2J and fig. S5) (20), which likely accounts for transmission time and synaptic delays between areas.
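The conversion between a phase lag at a tagging frequency and a latency is simple arithmetic: latency = phase / (2π · f). The sketch below illustrates it with hypothetical helper functions. Pairing the reported latencies with the 2.0 Hz tag is an assumption made only for the example, and a phase is defined only modulo one cycle, so latencies are recoverable only within one period (500 ms at 2.0 Hz).

```python
import math

def phase_to_latency_ms(phase_rad, freq_hz):
    """Latency (ms) corresponding to a phase lag at a given frequency."""
    return 1000.0 * phase_rad / (2.0 * math.pi * freq_hz)

def latency_to_phase_deg(latency_ms, freq_hz):
    """Phase lag (degrees) corresponding to a latency at a given frequency."""
    return 360.0 * freq_hz * latency_ms / 1000.0

tag_hz = 2.0  # assumption: both responses measured at the 2.0 Hz tag

ifj_phase_deg = latency_to_phase_deg(208.0, tag_hz)  # IFJ latency from the text
ffa_phase_deg = latency_to_phase_deg(188.0, tag_hz)  # FFA latency from the text

# The phase difference between the two areas maps back onto a latency difference.
lag_ms = phase_to_latency_ms(math.radians(ifj_phase_deg - ffa_phase_deg), tag_hz)
```

At 2.0 Hz one cycle lasts 500 ms, so a 20 ms latency difference corresponds to only about 14 degrees of phase.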

To test for functional interactions among the areas, we analyzed coherence between the frontal and temporal ROIs (Fig. 3C) across a wide frequency spectrum, including frequency bands that were not time-locked to the stimuli (see time-frequency power spectra and an analysis of frequency nesting in fig. S6). The baseline-corrected coherence between IFJ-FFA (top) and IFJ-PPA (bottom) in the tagging-frequency range is shown in Fig. 3A. When subjects attended to an area's preferred stimulus domain, that area became functionally connected with IFJ at the respective tagging frequency (both t test, P < 0.001), as responses in both areas were phase-locked to the attended stimulus but with different phase lags.
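Coherence between two areas reflects how consistent their phase relation is across trials, regardless of the size of the lag itself. The following is a minimal sketch under stated assumptions, not the authors' method: the trials are simulated, all parameters (trial count, sampling rate, noise level, the 20 ms lag) are illustrative, and `fourier_coeff`, `coherence`, and `simulate` are hypothetical names. It contrasts two signals that share a trial-by-trial phase with two signals whose phases are independent.

```python
import cmath
import math
import random

def fourier_coeff(signal, fs, freq):
    """Complex Fourier coefficient of `signal` at `freq` (Hz)."""
    return sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
               for k, x in enumerate(signal))

def coherence(xs, ys):
    """Coherence magnitude (0 to 1) from per-trial Fourier coefficients."""
    cross = sum(x * y.conjugate() for x, y in zip(xs, ys))
    pxx = sum(abs(x) ** 2 for x in xs)
    pyy = sum(abs(y) ** 2 for y in ys)
    return abs(cross) / math.sqrt(pxx * pyy)

def simulate(locked, n_trials=50, fs=200.0, freq=2.0, lag_s=0.020, seed=0):
    """Simulate two noisy areas; `locked` controls whether they share a phase."""
    rng = random.Random(seed)
    n = int(fs)  # one-second trials
    xs, ys = [], []
    for _ in range(n_trials):
        phi_x = rng.uniform(0.0, 2.0 * math.pi)
        phi_y = phi_x if locked else rng.uniform(0.0, 2.0 * math.pi)
        x = [math.sin(2 * math.pi * freq * k / fs + phi_x) + rng.gauss(0, 0.5)
             for k in range(n)]
        # The second area follows with a fixed time lag when locked.
        y = [math.sin(2 * math.pi * freq * (k / fs - lag_s) + phi_y)
             + rng.gauss(0, 0.5) for k in range(n)]
        xs.append(fourier_coeff(x, fs, freq))
        ys.append(fourier_coeff(y, fs, freq))
    return coherence(xs, ys)

coh_locked = simulate(locked=True)     # high: constant phase relation
coh_unlocked = simulate(locked=False)  # low: phase relation varies randomly
```

A fixed lag between the areas does not reduce coherence; only trial-to-trial variability in the phase relation does.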

Fig. 3 Coherence measures of attention.

(A) Cross-area coherence spectra. (B) Attention indices, converted into changes of coherence. When attending to the preferred stimulus (faces for FFA, houses for PPA), coherence between IFJ and the respective temporal area increased at the respective tagging frequency and in a high-frequency band (70 to 100 Hz). Dots represent subjects' peaks of attentional modulations. (C) Schematic of the fronto-temporal connectivity. (D) Directionality measure of gamma phase-lags between IFJ and FFA/PPA in polar (right) and Cartesian (left) coordinates. In 9 of 12 subjects the phase-lag of FFA/PPA to IFJ increased linearly with increasing frequencies around the subject's peak of gamma coherence, consistent with IFJ cycles leading over FFA/PPA cycles. (E) Parcellation-based probability maps of frontal connectivity to FFA/PPA.

Coherence at frequencies higher than the tagging frequency was dominated by shared background coherence, as typical in MEG. To reduce the influence of background coherence, we analyzed patterns of domain-specific coherence by computing an attention index, AIC = (attend preferred − attend unpreferred)/(attend preferred + attend unpreferred), which directly contrasts both attentional conditions and, therefore, is more sensitive to subtle attentional effects on coherence (Fig. 3B). When attending to faces (top, blue) coherence between IFJ and FFA increased not only at the tagging frequency (2.0 Hz) but also in a high-frequency band (70 to 100 Hz, both t test, P < 0.05). Similarly, when attending to the house stimuli (red), IFJ and PPA exhibited increased coherence, both at the tagging frequency (1.5 Hz) and in a high-frequency band (60 to 90 Hz, both t test, P < 0.01). In this high-frequency gamma range, the individual subjects varied considerably in their respective peak modulation frequency. As a check for whether the coherence in the gamma range resulted from common stimulus-locked onsets, we reran the analysis in a control data set, with shuffled trial order within each ROI (fig. S7C), which completely eliminated gamma coherence. Attentional modulations of coherence between IT and PC were weaker and nonsignificant (fig. S7D).
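The attention index is a normalized contrast, so background coherence that is common to both conditions cancels in the numerator. A minimal sketch of the formula follows; the function name and the coherence values are hypothetical and chosen only to show the arithmetic.

```python
def attention_index(attend_preferred, attend_unpreferred):
    """(preferred - unpreferred) / (preferred + unpreferred), in [-1, 1]
    for non-negative inputs."""
    total = attend_preferred + attend_unpreferred
    if total == 0:
        raise ValueError("attention index is undefined when both values are zero")
    return (attend_preferred - attend_unpreferred) / total

# Hypothetical coherence values for one frequency bin (not data from the study).
coh_attend_faces = 0.30   # IFJ-FFA coherence while attending faces
coh_attend_houses = 0.20  # IFJ-FFA coherence while attending houses

ai = attention_index(coh_attend_faces, coh_attend_houses)  # about 0.2
```

An index of 0 means no attentional modulation, and positive values mean stronger coherence when the area's preferred stimulus is attended.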

To test the directionality of the gamma-band coherence between IFJ and FFA/PPA, we analyzed the instantaneous phase lags between the two areas. Because portions of the signal in both sites are shared background coherence (due to electromagnetic field spread) or random noise, we first baseline-corrected the phase lag distributions to dissociate shared background coherence (which is simultaneous) and noise (which is uniformly distributed) from phase coherence that results from axonally transmitted synchronization (see supplementary methods and figs. S8 and S9). We then compared the residual phase lag distribution across a range of frequency bands around the subject's frequency of maximal coherence (peak ±10 Hz). In most subjects (9 out of 12), the baseline-corrected phase lags systematically increased as a function of frequency, consistent with IFJ leading FFA/PPA with a constant time lag of about 20 ms (SE = 6 ms) (Fig. 3D and figs. S10 and S11). The three other subjects seemed to have stronger bottom-up or balanced coherence (see supplementary materials).
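The link between a frequency-dependent phase lag and a constant time lag can be made explicit: if one area leads the other by a fixed delay τ, the phase lag at frequency f is φ(f) = 2π · f · τ, so the slope of φ against f equals 2π · τ. The sketch below recovers τ from that slope with an ordinary least-squares fit. It is an illustration under assumptions, not the procedure described in the supplementary methods: the phases are noise-free and already unwrapped, the frequency grid (peak of 80 Hz, ±10 Hz) is made up, and `time_lag_from_phase_slope` is a hypothetical helper.

```python
import math

def time_lag_from_phase_slope(freqs_hz, phase_lags_rad):
    """Estimate a constant time lag (s) from the least-squares slope of
    unwrapped phase lag (rad) against frequency (Hz)."""
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_p = sum(phase_lags_rad) / n
    num = sum((f - mean_f) * (p - mean_p)
              for f, p in zip(freqs_hz, phase_lags_rad))
    den = sum((f - mean_f) ** 2 for f in freqs_hz)
    slope = num / den                # radians per Hz
    return slope / (2.0 * math.pi)   # seconds

true_lag_s = 0.020                       # 20 ms, the lag reported in the text
freqs = [70.0, 75.0, 80.0, 85.0, 90.0]   # hypothetical peak of 80 Hz, +/- 10 Hz
phases = [2.0 * math.pi * f * true_lag_s for f in freqs]  # ideal, unwrapped

estimated_lag_s = time_lag_from_phase_slope(freqs, phases)
```

Using the slope rather than a single phase value avoids the cycle ambiguity of any one frequency; a positive slope indicates that the first area leads, and a negative slope that it follows.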

To determine whether IFJ is anatomically connected with FFA or PPA, we computed maps of probabilistic connectivities (21) to the seed regions in FFA and PPA. When normalizing to the site of maximal activity within frontal cortex, both FFA- and PPA-connectomes revealed areas around IFJ to have the highest connection probabilities (see Fig. 3E and fig. S12).

The neural mechanism that enables attention to an object or feature seems intuitively more complex than spatial attention, which may only require a spatial-biasing signal that targets a relevant location. Yet the present study reveals some striking parallels in neural mechanisms: Prefrontal cortex seems to be a common source of top-down biasing signals, with FEF supplying signals for spatial attention and IFJ supplying signals for object or feature attention. With spatial attention, cells in FEF and visual cortex begin to oscillate together in the gamma frequency range, with FEF the “driver” in these oscillations (2). Here, we find that IFJ, although it has delayed sensory responses, is also the “driver” in coupled gamma oscillations with FFA/PPA. In primates, coherent gamma oscillations in FEF are phase-shifted by about 10 ms compared with oscillations in area V4, which has been argued to account for the axonal conductance time and synaptic delays between the two areas (2). With the phase shift, spikes of FEF cells presumably affect cells in V4 at a time of maximum depolarization, which increases their impact. Here, a phase shift of 25 ms may allow for longer transmission times from IFJ to FFA and PPA in humans. Thus, spikes originating from IFJ may arrive in FFA and PPA respectively, and vice versa, at a time of maximum depolarization in the receiving area, magnifying their impact. The directing of IFJ signals to the FFA versus PPA may not be inherently more complex than shifting FEF signals between different locations in the visual field.

IFJ may include areas that function as general executive modules (22, 23). Also, IFJ is close to areas BA45 and BA46, homologs of which have been described in nonhuman primate recordings to encode information about object categories in delayed match-to-sample tasks (23, 24). Indeed, the “attentional template” that specifies the relevant location or object in spatial or feature attention is hardly distinguishable from working memory for these qualities (9), which is known to involve prefrontal cortex (24). Coupled interactions between prefrontal areas and visual areas (25–31) could underlie many cognitive phenomena in vision, with shared neural mechanisms but variations in the site of origin and the site of termination.

Supplementary Materials

www.sciencemag.org/content/344/6182/424/suppl/DC1

Materials and Methods

Supplementary Text

Figs. S1 to S13

References (32–78)

References and Notes

  1. Acknowledgments: We thank J. Liang, D. Pantazis, M. Hämäläinen, D. Dilks, D. Osher, Y. Zhang, C. Triantafyllou, S. Shannon, S. Arnold, C. Jennings. Supported by NIH (P30EY2621) and NSF (CCF-1231216), both to R.D.