Cross-Language Analysis of Phonetic Units in Language Addressed to Infants


Science  01 Aug 1997:
Vol. 277, Issue 5326, pp. 684-686
DOI: 10.1126/science.277.5326.684


In the early months of life, infants acquire information about the phonetic properties of their native language simply by listening to adults speak. The acoustic properties of phonetic units in language input to young infants in the United States, Russia, and Sweden were examined. In all three countries, mothers addressing their infants produced acoustically more extreme vowels than they did when addressing adults, resulting in a “stretching” of vowel space. The findings show that language input to infants provides exceptionally well-specified information about the linguistic units that form the building blocks for words.

The emergence of language in a child depends on linguistic input. Socially isolated children (1), and profoundly deaf children who experience neither oral nor manual language (2), do not acquire language. Recent findings highlight the impact of natural language input on normally developing infants. For example, cross-cultural studies of speech perception show that simply listening to ambient language results in infants' acquisition of information about the phonetic, phonotactic, and prosodic regularities of their native language (3, 4). This learning alters infants' perceptual systems, tuning them to the properties of their native language before word learning (5). Moreover, in recent studies of speech production, 5-month-old infants were shown to produce specific speech sounds after short-term exposure to them in a laboratory setting, which suggests that language listening also affects speech production at an early age (6).

Because early linguistic experience alters speech perception, theorists' attention has focused on language input to infants. Research has established that speech directed to infants (often termed “parentese”) is syntactically and semantically simpler than speech directed to adults (7). Moreover, cross-cultural studies have shown that infant-directed speech has a unique acoustic signature: It is produced with a higher fundamental frequency (pitch), exaggerated intonation contours, and a slower cadence (8). Laboratory tests show that, when given a choice, young infants prefer infant-directed over adult-directed utterances, and this preference is governed by the intonational features of infant-directed speech (9).

Thus, linguistic input to infants is modified syntactically, semantically, and prosodically. A remaining question is whether the phonetic units themselves are modified in infant-directed speech in a way that might enhance learning (10). Phonetic units in adult-directed speech are often poorly specified. Vowel and consonant articulations undershoot their intended targets (11), resulting in an overlap in the acoustic cues specifying distinct categories (12, 13). The phonetic units of adult-directed speech may thus provide a poor signal from which to learn, contributing to the argument that language input to the child underspecifies the information needed for language acquisition (14).

We examined natural language input to infants in the United States, Russia, and Sweden. The results show that across all languages, there is an alteration of the phonetic units in infant-directed speech. Parents addressing their infants produce vowels that are acoustically more extreme, resulting in an expanded vowel space, one that is acoustically “stretched.”

Ten native-speaking women were audiotaped in two experimental conditions in each of the three countries. In one condition, women were speaking with their 2- to 5-month-old infants (15). In the other, the same women spoke to an adult native speaker. Native-language words containing the vowels /i/, /a/, and /u/ were preselected for analysis in the three languages (16). Acoustically, the space encompassing vowels forms a “vowel triangle” whose points are determined by the vowels /i/, /a/, and /u/. These three vowels (termed “point” vowels) occur in all the world's languages (17).

The hypothesis was that the formant frequencies (18) of infant-directed (I) vowels would differ significantly from those of adult-directed (A) vowels. Target words were isolated from each tape-recorded conversation by means of computer-editing techniques. All target words except those obscured by noise or overlapping conversation were digitally sampled for spectrographic analysis (19). For each word, 13 acoustic measures were taken: Vowel formant frequencies (F1, F2, and F3) and fundamental frequency (pitch) measures were made at three locations (onset, center, and offset of the vowel) (20); vowel duration was also measured. Across all languages, 30,719 measurements were made on 2363 words (1330 I words and 1033 A words). The total number of words included 188 (I) and 141 (A) in English, 175 (I) and 135 (A) in Russian, and 967 (I) and 757 (A) in Swedish.
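The word counts and the measurement total reported above are internally consistent, as a short tally confirms. The sketch below is only an illustrative check; the dictionary layout and variable names are assumptions and are not part of the study's analysis.

```python
# Word counts per language as reported in the text:
# (infant-directed words, adult-directed words).
word_counts = {
    "English": (188, 141),
    "Russian": (175, 135),
    "Swedish": (967, 757),
}

# Three formants plus fundamental frequency, each measured at three
# locations (onset, center, offset), plus one vowel-duration measure.
measures_per_word = (3 + 1) * 3 + 1

total_infant = sum(i for i, _ in word_counts.values())
total_adult = sum(a for _, a in word_counts.values())
total_words = total_infant + total_adult
total_measurements = total_words * measures_per_word

print(total_infant, total_adult, total_words, total_measurements)
```

The totals reproduce the figures given in the text: 1330 I words, 1033 A words, 2363 words overall, and 30,719 measurements.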

The results confirm the hypothesis that infant-directed speech exhibits a change in the phonetic units of language when compared with adult-directed speech. Across all three languages, mothers produced acoustically more extreme vowels when addressing their infants, resulting in an expansion of the vowel triangle during infant-directed speech (Fig. 1). Mothers did not simply raise all formant frequencies when speaking to their infants, as they might have done if they were mimicking child speech. Rather, formant frequencies were selectively increased or decreased to achieve an expansion of the acoustic space encompassing the vowel triangle.

Figure 1

Vowel triangles formed by the “point” vowels, /i/ (green), /a/ (red), and /u/ (blue), in infant-directed (solid circles) and adult-directed (open circles) speech in three languages—English, Russian, and Swedish. Each data point represents the coordinate of the first two formant frequencies of a vowel. A universal stretching of the vowel triangle is observed in infant-directed (solid line) relative to adult-directed (dashed line) speech.

Vowel triangle areas in the infant- and adult-directed conditions were compared for each subject. The results were highly consistent. For each of the 30 mothers, the area of the vowel triangle was greater in the I condition than in the A condition (P < 0.0001, by binomial test). A Friedman two-way analysis of variance (ANOVA) by ranks (21) on the effect of addressee (I versus A), with language (English, Russian, or Swedish) as the blocking factor, confirmed that vowel triangle areas were significantly larger in the I condition (χr² = 39.9, P < 0.0001). The degree of vowel triangle expansion was substantial. On average, mothers addressing their infants expanded the vowel triangle by 92% (English, 91%; Russian, 94%; Swedish, 90%). The ratios of mothers' area measures (I/A) across the three languages did not differ (Kruskal-Wallis = 0.38, P = 0.83), suggesting that across languages, mothers stretch the vowel triangle to a similar degree.
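The area comparison and the binomial result can be illustrated with a minimal sketch. It assumes that each triangle vertex is a speaker's mean (F1, F2) coordinate for one point vowel and that the area is obtained with the standard shoelace formula; the text does not specify the exact computation. The formant values are hypothetical placeholders, not data from the study, and the binomial probability is computed one-sided under a chance level of 0.5, which is also an assumption.

```python
from math import comb

def triangle_area(p_i, p_a, p_u):
    """Area of the triangle whose vertices are the (F1, F2) points
    for /i/, /a/, and /u/, computed with the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = p_i, p_a, p_u
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

def sign_test_p(successes, n):
    """One-sided binomial probability of observing at least `successes`
    out of `n` cases when each outcome has chance probability 0.5."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n

# Hypothetical mean (F1, F2) values in Hz for one speaker.
# These are illustrative numbers only, NOT measurements from the study.
adult = {"i": (350, 2300), "a": (750, 1400), "u": (380, 1100)}
infant = {"i": (320, 2700), "a": (900, 1500), "u": (350, 900)}

area_a = triangle_area(adult["i"], adult["a"], adult["u"])
area_i = triangle_area(infant["i"], infant["a"], infant["u"])
print("I/A area ratio:", area_i / area_a)

# All 30 of 30 mothers showed a larger triangle in the I condition.
print("binomial P:", sign_test_p(30, 30))
```

With 30 of 30 mothers showing the effect, the chance probability is 0.5 raised to the 30th power, which is far below the reported threshold of 0.0001. An average expansion of 92% corresponds to an I/A area ratio of about 1.92.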

Analysis of the change in individual formant frequencies showed that they were increased or decreased as necessary to achieve a stretching of the vowel triangle (22). The results for American English mothers showed increased F2 in /i/, decreased F2 in /u/, and increased F1 and F2 in /a/ (Fig. 1A). Russian mothers showed increased F2 in /i/, decreased F2 in /u/, and increased F1 in /a/ (Fig. 1B). Swedish mothers showed increased F2 and decreased F1 in /i/, decreased F1 in /u/, and increased F1 and F2 in /a/ (Fig. 1C). The range of formant values was greater in I speech in all languages. As expected, significant increases in fundamental frequency and vowel duration were observed in I speech in all languages.

Does a stretched vowel triangle benefit infants? We hypothesize three ways in which it could do so. First, an expanded vowel triangle increases the acoustic distance between vowels, making them more distinct from one another. In recent studies, language-delayed children showed improvements when listening to speech in which between-category phonetic differences were increased (23). Normally developing infants have been shown to discriminate smaller differences than those provided by an expanded vowel triangle (24, 25), but may nonetheless benefit similarly from the enhanced acoustic differences provided in infant-directed speech.

Second, to achieve the stretching, mothers produce vowels that go beyond those produced in typical adult conversation. From both an acoustic and articulatory perspective, these vowels are “hyperarticulated” (26). Hyperarticulated vowels are perceived by adults as “better instances” of vowel categories (27, 28), and laboratory tests show that when listening to good instances of phonetic categories, infants show greater phonetic categorization ability (29). Our study shows that hyperarticulated vowels are a part of infants' linguistic experience and raises the possibility that they may play an important role in the development of infants' vowel categories.

Third, expanding the vowel triangle allows mothers to produce a greater variety of instances representing each vowel category without creating acoustic overlap between vowel categories. Greater variety may cause infants to attend to non–frequency-specific spectral features that characterize a vowel category, rather than to any particular set of frequencies the mother uses to produce a vowel (30). As shown in Fig. 2, converting the formant values to spectral features (31) in mels (32) shows that infant-directed speech maximizes the featural contrast between vowels. This is especially critical for infants because they cannot duplicate the absolute frequencies of adult speech—their vocal tracts are too small (33). To speak, infants must reproduce the appropriate spectral features in their own frequency range (6). We posit that early in development, representations of speech stored in memory encode such abstract spectral dimensions. According to this view, linguistic input induces infants to attend to features that (i) allow phonetic units spoken by different talkers to be categorized and (ii) provide a non–frequency-specific metric that reveals how equivalent speech units can be produced by the infant's vocal tract.

Figure 2

Formant measures converted to spectral features (in mels) for infant-directed and adult-directed speech. Spectral features describe the acoustic components of vowels in a non–frequency-specific metric (31), and mels take into account the fact that at higher frequencies, larger differences are necessary to detect change (32). The vowel /i/ has component frequencies that are broadly distributed across the spectrum (“diffuse”) and relatively high (“acute”), whereas component frequencies in the vowel /a/ are acute but more concentrated (“compact”) and components of /u/ are maximally low (“grave”). The formula for calculating the compact-diffuse feature is F2 − F1; for the grave-acute feature, (F1 + F2)/2.
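The two feature formulas in the caption can be sketched in a few lines. The mel conversion used here, 2595 · log10(1 + f/700), is a widely used approximation and is an assumption; the scale defined in (32) may differ. The sketch also assumes that each formant is converted to mels before the formulas are applied, and the formant values are hypothetical placeholders rather than data from the study.

```python
import math

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels using a common approximation.
    Assumed formula; the study's exact mel scale may differ."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def spectral_features(f1_hz, f2_hz):
    """Return (compact_diffuse, grave_acute) in mels for one vowel."""
    m1, m2 = hz_to_mel(f1_hz), hz_to_mel(f2_hz)
    compact_diffuse = m2 - m1        # F2 - F1: larger values are more diffuse
    grave_acute = (m1 + m2) / 2.0    # (F1 + F2)/2: larger values are more acute
    return compact_diffuse, grave_acute

# Hypothetical (F1, F2) values in Hz; illustrative only, NOT study data.
vowels = {"i": (300, 2500), "a": (800, 1400), "u": (350, 900)}

features = {v: spectral_features(f1, f2) for v, (f1, f2) in vowels.items()}
for vowel, (cd, ga) in features.items():
    print(f"/{vowel}/ compact-diffuse = {cd:.0f} mel, grave-acute = {ga:.0f} mel")
```

For these placeholder values, /i/ comes out as the most diffuse vowel, /a/ as the most compact, and /u/ as the most grave, which matches the qualitative description in the caption.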

Language development includes not only the acquisition of a complex grammar, but also the acquisition of a phonological system that allows differences in meaning to be conveyed. The acoustic forms of speech are highly variable, changing with factors that include speaker gender and identity, speaking rate, and the phonological context of the sound (34), which makes sorting ambient language sounds into phonetic categories a complex task. Our results suggest that infant-directed speech assists in this process by delivering information about the sound system of the infant's native language in an exaggerated form. The exaggerated form serves two functions: It more effectively separates sounds into contrasting categories, and it highlights the parameters on which speech categories are distinguished and by which speech can be imitated by the child.

Our results contribute to an emerging view of the role of linguistic input in language development in the child and the type of learning it induces (5). According to this emerging view, language input is not a trigger for innately stored information. Moreover, the developmental change that ensues, given language input, is not a process that depends on Skinnerian reinforcement; infants' learning of linguistic regularities shown in recent studies (3, 4, 35) cannot be explained on the basis of reinforcement. Language input provides a rich and detailed source of information that instigates, before word learning, a process of species-specific mapping of information by the brain, a process that alters the infant's perceptual and perceptual-motor system to conform to a specific language.

Natural language input is a reliable feature of every typically developing child's experience. Our findings demonstrate that language input to infants has culturally universal characteristics designed to promote language learning. These characteristics are likely to be exploited by infants' developing neural systems.

