Functional Neuroimaging of Speech Perception in Infants


Science  06 Dec 2002:
Vol. 298, Issue 5600, pp. 2013-2015
DOI: 10.1126/science.1077066


Human infants begin to acquire their native language in the first months of life. To determine which brain regions support language processing at this young age, we measured with functional magnetic resonance imaging the brain activity evoked by normal and reversed speech in awake and sleeping 3-month-old infants. Left-lateralized brain regions similar to those of adults, including the superior temporal and angular gyri, were already active in infants. Additional activation in right prefrontal cortex was seen only in awake infants processing normal speech. Thus, precursors of adult cortical language areas are already active in infants, well before the onset of speech production.

The adult human brain exhibits anatomical and functional specialization for speech processing (1–5). To understand this adult organization, one must ultimately clarify how it emerges in the course of development through a combination of brain maturation constraints and environmental influences. Behavioral studies in infants indicate that considerable language learning is already taking place in the first year of life in the domains of phonology, prosody, and word segmentation [reviewed in (6)]. However, little is known about the brain mechanisms underlying those abilities. At present, the only evidence comes from recordings of event-related potentials (ERPs), which indicate that the temporal lobes contain neural circuits for phoneme discrimination that become attuned to the native language in the first year of life (7–9). ERPs, however, do not provide spatially accurate information on the active brain areas. Here, we show that functional magnetic resonance imaging (fMRI) can be used to study the functional organization of the infant brain. Provided that precautions are taken to avoid introducing metallic objects in the magnetic field, magnetic resonance is a safe method that has been used in neuropediatric practice and research for the last 20 years with a variety of age ranges, including healthy infants (10–13) and even fetuses (14).

We collected fMRI images from 20 healthy nonsedated infants (2 to 3 months old) while they listened to 20 s of speech stimuli alternating with 20 s of silence (15). On alternate blocks, the same excerpts from a highly intonated female voice reading a children's book were presented in the infant's native language (French), with the recording playing either forward or backward. Backward speech violates several segmental and suprasegmental phonological properties that are universally observed in human speech (4, 16). Behavioral studies indicate that infants are sensitive to these properties. For instance, 4-day-old neonates and 2-month-old infants discriminate sentences in their native language from sentences in a foreign language, but this performance vanishes when the stimuli are played backward (17–19). We therefore expected that forward speech would elicit stronger activation than backward speech in brain areas engaged in the recognition of segmental and suprasegmental properties of the native language. Nonetheless, forward and backward speech both contain fast temporal auditory transitions and phonetic information conveyed by temporally symmetrical phonemes. Thus, brain areas sensitive to those properties were expected to be jointly activated by those two conditions.

To obtain reliable fMRI images in nonsedated infants, we took several precautions to ensure their comfort, to minimize head motion and noise exposure, and to permit constant experimenter monitoring (15). Data processing included rejection of images with severe motion artifacts, realignment of the remaining images, and incorporation of the movement parameters as regressors in a linear model of the blood oxygen level–dependent (BOLD) response appropriate for temporal sequences with occasional missing data. An important issue was the determination of the hemodynamic response function (HRF) in infants, which may differ in the immature brain. The small number of previous fMRI studies on this topic, all performed with sedated or sleeping infants and using various stimuli, are contradictory. Some report a normal adult response (13, 14); others suggest that the response can reverse at an age that varies from region to region (10–12). To characterize the latency and sign of the HRF in our data set, we devised a sinusoidal model that allowed us to detect any periodic response at the frequency imposed by the periodic alternation of sound and silence in the stimuli, and to measure its activation delay relative to sound onset (15). Within the temporal resolution of the present block design, most activated voxels showed a delay of about 5 s, a time course compatible with the normal adult HRF (Fig. 1). Our use of ecologically natural stimuli in nonsedated infants may have contributed to the observation of adult-like hemodynamics.
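The sinusoidal analysis described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 40-s stimulation period (20 s of sound plus 20 s of silence), a hypothetical repetition time of 2 s, a run that spans a whole number of stimulation cycles, and a run that starts at sound onset. It also leaves out the head-motion regressors. Under those assumptions, projecting a voxel time course onto a sine and a cosine at the stimulation frequency gives the amplitude of the periodic response and its delay relative to sound onset.

```python
import math

def sinusoid_amplitude_and_delay(signal, tr, period):
    """Estimate amplitude and delay of a periodic response at the stimulation frequency.

    Assumes uniform sampling over a whole number of periods, so that the sine,
    cosine, and constant terms are orthogonal and simple projections suffice.
    """
    n = len(signal)
    omega = 2.0 * math.pi / period
    mean = sum(signal) / n
    # Projections onto sin and cos at the stimulation frequency.
    a = 2.0 / n * sum((y - mean) * math.sin(omega * i * tr) for i, y in enumerate(signal))
    b = 2.0 / n * sum((y - mean) * math.cos(omega * i * tr) for i, y in enumerate(signal))
    amplitude = math.hypot(a, b)
    # A response A*sin(omega*(t - d)) has a = A*cos(omega*d), b = -A*sin(omega*d).
    delay = (math.atan2(-b, a) / omega) % period
    return amplitude, delay

# Synthetic voxel: response of amplitude 1.5 lagging sound onset by 5 s.
tr, period = 2.0, 40.0          # hypothetical TR; period = 20 s sound + 20 s silence
signal = [1.5 * math.sin(2.0 * math.pi * (i * tr - 5.0) / period) + 100.0
          for i in range(120)]  # 240 s = 6 full cycles
amplitude, delay = sinusoid_amplitude_and_delay(signal, tr, period)
print(round(amplitude, 3), round(delay, 3))
```

In this convention, a delay of roughly half a period (about 20 s later than the main peak) would correspond to the inverse responses mentioned in the caption of Fig. 1B.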

Figure 1

Characteristics of the hemodynamic response to sound (forward and backward speech) in 2- to 3-month-old infants. (A) Sample nonaveraged data recorded from a left temporal voxel during a single run, with forward (Fw) and backward (Bw) speech periods alternating with silent periods. fMRI measurements (blue) were modeled by a sinusoidal function with additional regressors accounting for head motion (red curve). (B) Distribution of phase lags for all voxels with a significant sinusoidal activation (P < 0.001), cumulated across all 20 infants. The most frequent response occurred with a delay of 3 to 7 s between activation onset and sound onset, similar to the adult HRF. Another smaller peak approximately a half-period away indicated the presence of occasional inverse BOLD responses, which could be due either to deactivation during sound presentation or to a normal activation by sound offset. However, deactivation responses were never significant in the random-effect group analyses. (C) Activation induced by sound, observed in an individual infant (P < 0.001) when fitting the data using an idealized adult HRF. The figure shows a transparent brain view, an axial slice at the level of the superior temporal cortices, and the response curve at the indicated location, averaged across all sound presentations (mean ± SE). The color scale indicates the value of the t test assessing the significance of the correlation of the observed data with the model (d.f., degrees of freedom). The infant was asleep and activation was largely confined to bilateral superior temporal cortices.

We then identified active brain areas by convolving the standard adult HRF with the time course of the forward or backward speech signal. A random-effect analysis of the 20 infants revealed stimulus-induced activation in a large extent of the left temporal lobe (Fig. 2A). Activation ranged from the superior temporal gyrus, encompassing Heschl's gyrus, to surrounding areas of the superior temporal sulcus and the temporal pole. Symmetrical areas of the right temporal lobe also showed a small activation, which did not remain significant after correction for multiple comparisons (see table S1). Activation was significantly greater in the left than in the right temporal lobe at the level of the planum temporale (Fig. 2B).
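As a rough illustration of how such regressors are built, the sketch below convolves a block time course with a double-gamma HRF. The HRF parameters (peak term with shape 6, undershoot term with shape 16, ratio 6), the 2-s repetition time, and the block onsets are assumptions chosen for the example; the text specifies only that a standard adult HRF was used and that 20-s sound blocks alternated with 20 s of silence.

```python
import math

def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=6.0):
    """Double-gamma HRF value at time t (seconds); parameters are assumed, not from the paper."""
    if t <= 0:
        return 0.0
    g1 = t ** (peak - 1) * math.exp(-t) / math.gamma(peak)
    g2 = t ** (undershoot - 1) * math.exp(-t) / math.gamma(undershoot)
    return g1 - g2 / ratio

def block_regressor(n_scans, tr, onsets, duration, hrf_length=32.0):
    """Convolve a boxcar (1 during stimulation blocks, 0 otherwise) with the HRF."""
    boxcar = [1.0 if any(on <= i * tr < on + duration for on in onsets) else 0.0
              for i in range(n_scans)]
    kernel = [double_gamma_hrf(k * tr) for k in range(int(hrf_length / tr) + 1)]
    regressor = []
    for i in range(n_scans):
        acc = 0.0
        for k, h in enumerate(kernel):
            if i - k >= 0:
                acc += boxcar[i - k] * h
        regressor.append(acc * tr)  # scale by TR to approximate the integral
    return regressor

# Hypothetical run: sound blocks every 40 s, alternating forward and backward speech.
tr, n_scans = 2.0, 120
forward = block_regressor(n_scans, tr, onsets=[0, 80, 160], duration=20.0)
backward = block_regressor(n_scans, tr, onsets=[40, 120, 200], duration=20.0)
sound = [f + b for f, b in zip(forward, backward)]  # regressor for sound versus silence
```

The forward and backward regressors can then be entered into a linear model together with the motion parameters; their sum models the response to sound regardless of direction, and their difference models the forward-versus-backward comparison.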

Figure 2

Localization of activation in random-effect group analyses (voxel P < 0.01, cluster P < 0.05, corrected). (A) Activation evoked by sound presentation (forward and backward speech), relative to silent periods, at three different axial levels in the left temporal lobe (see table S1 for coordinates of activation peaks). (B) Statistical map of asymmetry of activation by sound, showing that the planum temporale was significantly more activated in the left hemisphere than in the right. (C) Statistical map of the comparison between forward and backward speech, showing greater activation by forward speech in the left angular gyrus.

We then studied the differences between forward and backward speech. The left angular gyrus and the left mesial parietal lobe (precuneus) were significantly more activated by forward speech than by backward speech (Fig. 2C). Conversely, no region showed greater activation by backward speech than by forward speech. To investigate the effect of wakefulness on these responses, we compared the activation patterns of six infants who stayed awake during the entire session with those of five infants who were deeply asleep (in the remaining infants, the state of wakefulness was too variable to permit unambiguous classification). Although the main effect of wakefulness on sound-induced activation was not significant, a significant interaction between the nature of the stimulus and the infants' wakefulness was observed in the right dorsolateral prefrontal cortex (DLPFC). This region showed a greater activation by forward speech than by backward speech in awake infants but not in sleeping infants (Fig. 3). The right lateralization of this activation was significant. The converse interaction revealed a greater activation by backward speech than by forward speech bilaterally in the posterior part of the superior temporal sulci, again only in awake infants.

Figure 3

Interaction between wakefulness and the linguistic nature of the stimuli (voxel P < 0.01, cluster P < 0.05, corrected). This comparison isolated a right dorsolateral prefrontal region that showed greater activation by forward speech than by backward speech in awake infants, but not in sleeping infants.

Our results show that the infant cortex is already structured into several functional regions. As in adults (1–4), listening to speech activates a large subset of the temporal lobe, with a significant left-hemispheric dominance. This is consistent with the left lateralization observed in ERPs (7, 8) and in dichotic listening during syllable discrimination tasks in infants (20). An anatomical asymmetry is detectable in the planum temporale as early as 31 weeks of gestation (21). Our fMRI results indicate that this anatomical difference supports an early functional asymmetry in the processing capacities of the two hemispheres. It is not yet known, however, whether this asymmetry reflects an early specialization for speech perception or a greater responsivity of the left temporal cortex to any auditory stimulus (or perhaps to any stimulus with fast temporal changes).

In adults, listening to the native language (versus listening to backward speech or a foreign language) induces greater activation all along the left superior temporal sulcus, extending posteriorly into the left angular gyrus (1–3). We did not observe any difference between forward and backward speech in the infant temporal lobe. This suggests that this area is undergoing changes during infancy and has not yet acquired its full competence for the native language by 3 months. However, we found a significant advantage for the native language in the left angular gyrus, the left precuneus, and, in awake babies only, the right prefrontal cortex. In adults, the left angular gyrus shows greater activation when subjects hear words than when they hear nonwords (4, 5) or when they hear sentences in a known language relative to hearing sentences in an unknown language or backward speech (1–3). Moreover, the precuneus and DLPFC are activated, often with a right lateralization, when adults retrieve verbal information from memory (22–24). Activation of both regions in 3-month-olds may indicate the early engagement of active memory retrieval mechanisms. This would fit with behavioral evidence that infants of that age have already memorized the prosodic contours of their native language (17, 19), although they may not remember single words until the age of 7 months (6).

Two mechanisms of language acquisition are classically opposed. According to one view, the human brain is equipped with genetically determined mechanisms of language processing that endow the infant with an early linguistic competence without which language acquisition would be impossible (25). For others, the infant brain is initially immature and plastic, and exposure to speech inputs progressively shapes its organization through domain-general mechanisms of learning and plasticity (26). Without resolving this debate, our results provide evidence that should be accommodated by any model of language development. First, the areas activated by the native language are not confined to primary auditory cortices, even during the first few months of life. Second, neither do they extend widely into other cortical territories such as the visual areas; rather, they remain confined to regions that are similar to those observed in adults in both their localization and their lateralization. Third, frontal cortex already exhibits functional specificity in infants and thus can no longer be assumed to be silent in the first months of life. The delayed synaptogenesis and myelination of this area, and its immature level of metabolic activity, need not imply that it does not contribute to early cognitive processes (27). Overall, the prefrontal activation, in coordination with a left-lateralized temporoparietal activation partially similar to that found in adults, favors a description of language acquisition as a progressive differentiation of a preconstrained network of left-hemispheric regions under the influence of active mechanisms of attention and effort (28).

Supporting Online Material

Materials and Methods

Table S1

  • * To whom correspondence should be addressed. E-mail: ghis{at}

