Sequential Processing of Lexical, Grammatical, and Phonological Information Within Broca’s Area

Science  16 Oct 2009:
Vol. 326, Issue 5951, pp. 445-449
DOI: 10.1126/science.1174481

Seeing the Brain's One, Two, Three

Taking advantage of the rare opportunity to record neuronal activity in the human brain using intracranial electrodes, Sahin et al. (p. 445; see the Perspective by Hagoort and Levelt) document the spatial and temporal pattern of neuronal populations within Broca's area as patients thought of a single word, changed its tense (for verbs) or number (for nouns), and articulated the word silently. For these three stages, they detected activity at 200, 320, and 450 milliseconds, moving in a caudal to rostral direction. These data fit neatly within the roughly 600 milliseconds required for the onset of speech and map the distinct neural computations within an area of the brain that has been known for almost a century and a half to be important for the production of language.


Words, grammar, and phonology are linguistically distinct, yet their neural substrates are difficult to distinguish in macroscopic brain regions. We investigated whether they can be separated in time and space at the circuit level using intracranial electrophysiology (ICE), namely by recording local field potentials from populations of neurons using electrodes implanted in language-related brain regions while people read words verbatim or grammatically inflected them (present/past or singular/plural). Neighboring probes within Broca’s area revealed distinct neuronal activity for lexical (~200 milliseconds), grammatical (~320 milliseconds), and phonological (~450 milliseconds) processing, identically for nouns and verbs, in a region activated in the same patients and task in functional magnetic resonance imaging. This suggests that a linguistic processing sequence predicted on computational grounds is implemented in the brain in fine-grained spatiotemporally patterned activity.

Within cognitive neuroscience, language is understood far less well than sensation, memory, or motor control, because language has no animal homologs, and methods appropriate to humans [functional magnetic resonance imaging (fMRI), studies of brain-damaged patients, and scalp-recorded potentials] are far coarser in space or time than the underlying causal events in neural circuitry. Moreover, language involves several kinds of abstract information (lexical, grammatical, and phonological) that are difficult to manipulate independently. This has left a gap in understanding between the computational structure of language suggested by linguistics and the neural circuitry that implements language processing. We narrow this gap using a technique with high spatial, temporal, and physiological resolution and a task that distinguishes three components of linguistic computation.

According to linguistic analyses, the ability to identify words, combine them grammatically, and articulate their sounds involves several kinds of representations, with logical dependencies among them (1, 2). For example, to pronounce a verb in a sentence, one must determine the appropriate tense given the intended meaning and syntactic context (e.g., “walk,” “walks,” “walked,” or “walking”). One must identify the particular verb, which specifies whether to use a regular (e.g., “walked”) or irregular (e.g., “went”) form. In addition, one must unpack the phonological content of the verb and suffix to implement three more computations: phonological adjustments in the sequence of phonemes (e.g., inserting a vowel between verb and suffix in “patted” but not in “walked”), phonetic adjustments in the pronunciation of the phonemes (such as the difference between the “d” in “walked” and “jogged”), and conversion of the phoneme sequence into articulatory motor commands.

This logical decomposition does not entail that each kind of representation corresponds to a distinct stage or circuit in the brain. In many neural-network models, the selection of tense, discrimination of regular from irregular inflection, and formulation of the phonetic output are computed in parallel and in one time-step within a single distributed network (3, 4). Others contain loops and feedback connections, propagate probabilistic constraints, and iteratively settle into a globally stable state, with no fixed sequence of operations (5). Even stage models may incorporate cascades where partial information from one stage begins to feed the next before its computation is complete (6). Nonetheless, the most comprehensive model of speech production, developed by Levelt, Roelofs, and Meyer (LRM), maximizes parsimony and falsifiability by implementing linguistic operations as discrete ordered stages, eschewing feedback, loops, parallelism, or cascades (7). They posit stages for lexical retrieval (which they associate with the left middle temporal gyrus at 150 to 225 ms after stimulus presentation), grammatical encoding (locus and duration unknown), phonological retrieval (posterior temporal lobe, 200 to 400 ms), phonological and phonetic processing (Broca’s area, 400 to 600 ms), self-monitoring (superior temporal lobe, beginning at 275 to 400 ms but highly variable in duration), and articulation (motor cortex) (8, 9).

Current evidence, however, leaves considerable uncertainty about the localization and timing of these components, especially grammatical processing. Although clinical studies report double dissociations in which a patient is more impaired in grammar than phonology or vice versa (10), in most studies both abilities are linked to similar regions in the left inferior prefrontal cortex, particularly Broca’s area (11). Although Broca’s area itself has been identified as the seat of phonology, grammar, and even specific grammatical operations (12–14), lesion and neuroimaging studies have tied it to a broad variety of linguistic and nonlinguistic processes (15). This uncertainty may be a consequence of the coarseness of current measurements. It remains possible that grammatical and other linguistic processes are processed distinctly, even sequentially, in the microcircuitry of the brain, but techniques that sum over seconds and centimeters necessarily blur them.

In a rare procedure, electrodes are implanted in the brains of patients with epilepsy for clinical evaluation. Recordings of intracranial electrophysiology (ICE) from unaffected brain tissue during periods of normal activity can provide millisecond resolution in time with millimeter resolution in space. We recorded local field potentials (LFP) from multicontact depth electrodes in three right-handed patients (ages 38 to 51, with above-average language and cognitive skills) whose electrodes were located in and around Broca’s area while they read words verbatim or converted them to an inflected form (past/present or singular/plural) (Figs. 1 and 2) (16). The task engages inflectional morphology, which is like syntax in combining meaningful elements according to grammatical rules, but the units are shorter and semantically simpler, making fewer demands on working memory and conceptual integration, and thus allowing greater experimental control. We applied the high resolution of ICE to a task that distinguishes three linguistic processes to investigate the spatiotemporal patterning of word production in the brain.

Fig. 1

Experimental design. (A) Structure of trials. (B) Experimental conditions, example trials, and required psycholinguistic processes. (C) Hypothesized patterns of neural activity by condition, for inflectional and phonological processing.

Fig. 2

(A) Main results: sequential processing of lexical, grammatical, and phonological information in overlapping circuits. (Top) Neural activity recorded from several channels in Broca’s area (patient A, Brodmann area 45) shows three LFP components that were consistently evoked by the task (~200, ~320, and ~450 ms). (Bottom) The ~200-ms component is sensitive to word frequency but not word length, suggesting that it indexes a cognitive process such as lexical identification, not simply perception. Stacked waveforms (top and bottom) adopt the axes noted on the first waveform. (B) At ~320 ms, the LFP pattern suggests inflectional processing. (C) At ~450 ms, in a channel 5 mm distant, the complementary pattern suggests phonological processing. (Inset) MRI slices from this patient, annotated with the anatomical location of A4, the contact in common to the two channels reported here. Statistical significance: **** (P < .0001), *** (P < .001), ** (P < .01) (t test, one tail, two-sample, equal variance). Box arrows (bottom) indicate linguistic processing stages, which may be interposed among other stages not addressed here.

In each trial, participants saw either the instruction “Repeat word” (the “Read” condition) or a cue that dictated an inflected form (“Every day they ____”; “Yesterday they ____”; “That is a ____”; “Those are the ____”). Next, they saw a target word and produced the appropriate form silently (Fig. 1A) (16). The 240 target words were presented in uninflected form in the phrase “a [noun]” or “to [verb]” (17) (Fig. 1B). Half the targets were regular (e.g., “link”/“linked”) and half irregular (e.g., “think”/“thought”), to ensure that participants had to access the word rather than automatically appending the regular suffix (18).

The Null-Inflect (N) condition requires an inflected form of the verb (present tense) or noun (singular), yet these forms are not overtly marked and thus require the same output to be pronounced as in the Read (R) condition. The difference between these conditions thus implicates the process of inflection. In contrast, the Overt-Inflect (O) condition (past-tense verb or plural noun) requires that a suffix be added (regular) or the form changed (irregular). It thus differs from the Null-Inflect condition in requiring computation of a different phonological output (Fig. 1B). (The label “phonological” subsumes phonological, phonetic, and articulatory processes.) The design was fully crossed, with trials presented in pseudorandom order.
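The subtraction logic of the three conditions can be sketched in a few lines of Python. This is only an illustration of the design described above: the dictionary, the process labels, and the function names are assumptions introduced here, not part of the study's materials. The label "new_phonological_output" stands for the need to compute an output that differs from the presented word.

```python
# Illustrative sketch; all names are hypothetical, not from the study.
# Each condition is tagged with the processes the text says it requires.
REQUIRES = {
    "Read": {"lexical"},
    "Null-Inflect": {"lexical", "inflection"},
    "Overt-Inflect": {"lexical", "inflection", "new_phonological_output"},
}


def differing_processes(cond_a, cond_b):
    """Return the processes required by exactly one of the two conditions."""
    return REQUIRES[cond_a] ^ REQUIRES[cond_b]


def predicted_pattern(process):
    """For a component indexing `process`, say which conditions should engage it."""
    return {cond: process in needs for cond, needs in REQUIRES.items()}


if __name__ == "__main__":
    # Null-Inflect vs. Read isolates inflection.
    print(differing_processes("Null-Inflect", "Read"))
    # Overt-Inflect vs. Null-Inflect isolates the changed phonological output.
    print(differing_processes("Overt-Inflect", "Null-Inflect"))
    # A purely lexical component should be engaged in all three conditions.
    print(predicted_pattern("lexical"))
```

Under these assumptions, a component that separates Read from both inflecting conditions behaves like an inflectional process, whereas one that separates Overt-Inflect from the other two behaves like a phonological process.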

To assess whether these patients’ language systems were organized normally, and to correlate LFP with fMRI, we performed fMRI in two of the patients before their electrodes were placed. Their activation patterns were indeed similar to those of 18 healthy controls (Fig. 3, A to C) [for other fMRI results, see (19)]. Most of the 168 bipolar channels from which we recorded (across patients) were in fMRI-active regions (Fig. 3, A to G). LFP that was significantly correlated with the task (P < .001, corrected) [see (16)] was recorded in about half (86 of 168) of the channels (19 channels in patient A, 37 in B, and 30 in C). Of these channels, 49 (57%) were within Broca’s area or the anterior temporal lobes (16 in patient A, 19 in B, 14 in C). Of the 49 channels, 26 were within Broca’s area, and the majority (20 of 26) yielded a strong triphasic (three-component) LFP waveform (9 in patient A, 8 in B, 3 in C). The mean peaks occurred ~200, ~320, and ~450 ms after the target word onset (Fig. 2A), and this timing was consistent across patients (Fig. 4, A and B, and figs. S1, S4, and S5).
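As a quick consistency check, the per-patient channel counts quoted above can be summed and converted to percentages. The variable and function names are assumptions chosen for this sketch; the numbers are taken directly from the text.

```python
# Per-patient channel counts as reported in the text (patients A, B, C).
task_active = {"A": 19, "B": 37, "C": 30}
broca_or_anterior_temporal = {"A": 16, "B": 19, "C": 14}
triphasic_in_broca = {"A": 9, "B": 8, "C": 3}

TOTAL_CHANNELS = 168   # all bipolar channels recorded
BROCA_CHANNELS = 26    # task-active channels within Broca's area


def total(counts):
    """Sum a per-patient count dictionary."""
    return sum(counts.values())


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


if __name__ == "__main__":
    print(total(task_active), percent(total(task_active), TOTAL_CHANNELS))
    print(total(broca_or_anterior_temporal),
          percent(total(broca_or_anterior_temporal), total(task_active)))
    print(total(triphasic_in_broca),
          percent(total(triphasic_in_broca), BROCA_CHANNELS))
```

The sums reproduce the reported totals (86, 49, and 20), and 49 of 86 rounds to the stated 57%.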

Fig. 3

Localization of fMRI responses, depth electrodes, and neural generators. (A) fMRI in 18 controls, contrasting activity for all task conditions with visual-fixation baseline periods. The task engages classic language areas (Broca’s, speech-related motor cortex, medial supplementary motor area, anterior cingulate, and superior temporal lobe) and visual-reading areas (visual word form area and primary and ventral visual cortex). Classic Broca’s area is circled. Thresholding and correction at a 0.01 false discovery rate (16). Scale as in (B). (B and C) Single-patient fMRI (identical contrast) reveals similar activations in both patients and controls. Surfaces are inflated to reveal activation within sulci. (D) Coregistered MRI and computerized tomography scan of patient C showing depth probes inserted through the skull. (E) Intra-operative photo showing left perisylvian language areas. Letters, insertion points of the probes; dashed lines, surface projections of their intracortical trajectories. Putative Brodmann areas are labeled. (F) Postimplantation MRI reveals that probe B traverses Broca’s area in the posteromedial process of IFG pars opercularis facing the insula, and preimplantation fMRI (G) demonstrates that the region was activated by the task in this patient. (H) Location of probe A, in Broca’s area traversing IFG pars triangularis within the inferior frontal sulcus. (I and J) Schematic of neural dipoles near probe A that generated the LFP components, hypothesized from their polarities, amplitudes, and locations (see fig. S3). Schematic gyral outline corresponds to the gyral trace superimposed on the MRI in (H).

Fig. 4

Additional features of the triphasic waveform support the lexical-inflectional-phonological progression. (A) Triphasic activity is specific to Broca’s area and is consistent across patients. All-condition average waveforms from task-active channels in each patient are superimposed (scaled in amplitude to a single channel in each region and standardized in polarity). (B) Noun (black) and verb (red) inflection (Null and Overt combined) involved nearly identical neural activity, across sites and patients. Standardized across channels in polarity. (C) The ~450-ms component, which is sensitive to phonological differences among inflectional conditions, is also sensitive to phonological complexity (syllable count) of the target word (P < 0.01, corrected). (D) Neural activity in Broca’s area is evoked primarily when processing the target word (when the linguistic processing of interest should occur), not the cue (35).

The three LFP components showed signatures of distinct linguistic processing stages (Fig. 2, A to C). The ~200-ms component appears to reflect lexical identification. The timing converges with when word-specific activity has previously been recorded in the visual word form area (VWFA) [(20, 21), but see (22)] and when the VWFA has been shown to become phase-locked with Broca’s area (23). Furthermore, the magnitude of the component varied with word frequency, which indexes lexical access (24). Specifically, rare words (frequency 1 to 4) yielded a significantly higher amplitude [t(204) = 3.32, P < 0.001] than common words (frequency 9 to 12) (Fig. 2A) (25). Word frequency is inversely correlated with word length, but the present effect is not a consequence of length: We found no difference at ~200 ms between short (2 to 4 characters) and long (6 to 11 characters) words (Fig. 2A), nor a difference between one-morpheme and two-morpheme responses (26). Later components were not affected by frequency. Finally, consistent with the fact that lexical identification is required by all three inflectional conditions, the ~200-ms component did not vary across them. Primary lexical access is generally associated with temporal cortex rather than Broca’s area (8), so this component may index delivery of word identity information into Broca’s area for subsequent processing, consistent with anatomic and physiological evidence that the two areas are integrated (23, 27). Although word-evoked activity in this latency range has previously been localized to Broca’s area with LFP (28) and magnetoencephalography (29), it has not been demonstrated to be modulated by lexical frequency.
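The legend of Fig. 2 describes the significance test as a one-tailed, two-sample t test with equal variance. A minimal sketch of that statistic is given below, assuming it is also the test behind the t(204) value reported for the frequency effect. The function name and the toy amplitudes are made up for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled (equal) variance.

    Returns (t, degrees_of_freedom), where df = n_a + n_b - 2.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: per-sample variances weighted by their own df.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


if __name__ == "__main__":
    # Toy amplitudes (arbitrary units), invented purely for illustration.
    rare_words = [5.1, 6.3, 5.8, 6.0, 5.5]
    common_words = [4.2, 4.9, 5.0, 4.4, 4.6]
    t_value, df = pooled_t(rare_words, common_words)
    print(f"t({df}) = {t_value:.2f}")
```

Because the degrees of freedom equal the combined sample size minus two, a reported t(204) would correspond to 206 observations across the two groups under this test.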

The subsequent two LFP components showed activity patterns predicted for grammatical and phonological processing, respectively (Fig. 2, B and C). In the ~320-ms component (Fig. 2B), the Overt-Inflect and Null-Inflect conditions significantly differed from the Read condition but not from each other. Thus, the ~320-ms component is modulated by the demands of inflection (required by Overt-Inflect and Null-Inflect but not Read), but not by the demands of phonological programming (required in Overt-Inflect but not in Null-Inflect or Read; see Fig. 1C). In contrast, in a component appearing at ~450 ms, Overt-Inflect did differ from the Null-Inflect and Read conditions, which did not differ from each other (Fig. 2C). This contrasting pattern indicates that the ~450-ms component reflects phonological, phonetic, and articulatory programming, independently confirmed by its sensitivity to the number of syllables (Fig. 4C). Both components were recorded from Broca’s area in all patients (fig. S1), and specifically in patient A (Fig. 2) from the inferior frontal gyrus (IFG) pars triangularis deep in the inferior frontal sulcus. The ~320-ms component was recorded near the fundus; the ~450-ms component was recorded 5 mm more lateral along the sulcus within a subgyral fold that faced the fundus (Fig. 3I and fig. S1A). This region is often considered part of area 45 [but see (30)].

The pattern of sign inversions across neighboring bipolar channels in space (Fig. 2A, top) indicates that the generators of the LFP components were local (fig. S3), and the differences in inversions across components in time indicate that their generators were not identical (Fig. 3, I and J). Thus, the overall LFP pattern suggests a fine-grain spatiotemporal progression of lexical, grammatical, and phonological processing within Broca’s area during word production.
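The reasoning from sign inversions to local generators can be illustrated with a toy calculation. The sketch assumes that a bipolar channel is the difference between two adjacent contacts on a probe, and that a generator close to the probe produces a potential that peaks at the nearest contact; the contact values and function names are invented for illustration.

```python
def bipolar(contact_potentials):
    """Differences between each pair of adjacent contacts along a probe."""
    return [later - earlier
            for earlier, later in zip(contact_potentials, contact_potentials[1:])]


def inversion_indices(channels):
    """Indices i where channel i and channel i + 1 have opposite signs."""
    return [i for i, (x, y) in enumerate(zip(channels, channels[1:]))
            if x * y < 0]


if __name__ == "__main__":
    # Toy potentials (arbitrary units), not recorded data.
    local_source = [1, 3, 8, 3, 1]         # peaks at the middle contact
    distant_source = [20, 21, 22, 23, 24]  # shallow gradient, no local peak

    print(bipolar(local_source), inversion_indices(bipolar(local_source)))
    print(bipolar(distant_source), inversion_indices(bipolar(distant_source)))
```

In this toy case, the bipolar channels flip polarity on either side of the contact nearest the source, whereas a distant source yields same-signed differences with no inversion.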

The triphasic pattern in all patients was found exclusively in Broca’s area (Fig. 4A). Outside Broca’s area, other patterns prevailed; for example, temporal lobe sites showed a slow and late monophasic component at 500 to 600 ms (Fig. 4A, bottom, and fig. S4, F and G) (31), possibly reflecting self-monitoring (7, 8). The condition differences for each component were also consistent across patients, replicating the temporal isolation of grammatical (~320 ms) from phonological (~450 ms) processing (fig. S1). The word-frequency effect on the ~200-ms component was significant in patients A and B and marginal (P = 0.06) in patient C (fig. S2). The ~200-, ~320-, and ~450-ms components were consistent in their timing across patients, although the keypress reaction times, which require the self-monitoring process, varied among patients and conditions (fig. S6).

Although nouns and verbs differ linguistically and neurobiologically (32, 33), the neuronal activity they evoked was similar (Fig. 4B). Furthermore, the patterning across inflectional conditions was the same for nouns and verbs (34). These parallels suggest that words from different lexical classes feed a common process for inflection.

Additional evidence that the LFP patterns reflect inflectional computation is that they are triggered by presentation of the target word, not the cue, even though the cues contain more visual and linguistic elements (Fig. 4D) (35). Furthermore, activity evoked by the cue showed little sensitivity to the inflectional conditions.

The LFP patterns are consistent with the computational nature of the task and with independent estimates of the timing of its subprocesses. Inflectional processing cannot occur before the word is identified (especially as to whether it is regular or irregular), and phonological, phonetic, and articulatory processing cannot be computed before the phonemes of the inflected form have been determined. Word identification has been shown to occur at 170 to 250 ms (8, 29, 36), consistent with the ~200-ms component, and syllabification and other phonological processes at 400 to 600 ms, consistent with the phonological component at 400 to 500 ms (8). In naming tasks, speech onset occurs at around 600 ms (8), which is consistent with the self-monitoring behavioral responses we recorded (fig. S6). Self-monitoring has been localized to the temporal lobe (8), where we recorded LFPs in the post-response latency range that may correspond to previously described scalp event-related potentials (37). Working backward from 600 ms, we note that motor neuron commands occur 50 to 100 ms before speech, placing them just after the phonological component we found to peak at 400 to 500 ms (38). In sum, the location, behavioral correlates, and timing of the components of neuronal activity in Broca’s area suggest that they embody, respectively, lexical identification (~200 ms), grammatical inflection (~320 ms), and phonological processing (~450 ms) in the production of nouns and verbs alike.
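The timing argument in this paragraph amounts to simple interval arithmetic, sketched below. The constants are the approximate values quoted in the text; the names and helper functions are assumptions made for this illustration.

```python
# Approximate values quoted in the text (milliseconds after target onset).
COMPONENT_PEAKS_MS = {"lexical": 200, "grammatical": 320, "phonological": 450}
INDEPENDENT_WINDOWS_MS = {"lexical": (170, 250), "phonological": (400, 600)}
SPEECH_ONSET_MS = 600
MOTOR_LEAD_MS = (50, 100)  # motor commands precede speech by this much


def motor_command_window(onset, lead):
    """Window in which motor commands occur, working backward from onset."""
    shortest, longest = lead
    return (onset - longest, onset - shortest)


def within(value, window):
    """True if `value` lies inside the closed interval `window`."""
    low, high = window
    return low <= value <= high


def stages_in_order(peaks):
    """True if lexical, grammatical, and phonological peaks occur in that order."""
    ordered = [peaks["lexical"], peaks["grammatical"], peaks["phonological"]]
    return ordered == sorted(ordered)


if __name__ == "__main__":
    print(motor_command_window(SPEECH_ONSET_MS, MOTOR_LEAD_MS))  # (500, 550)
    for stage, window in INDEPENDENT_WINDOWS_MS.items():
        print(stage, within(COMPONENT_PEAKS_MS[stage], window))
    print(stages_in_order(COMPONENT_PEAKS_MS))
```

With these values, motor commands fall at 500 to 550 ms, after the ~450-ms phonological peak, and each component peak lies inside its independently estimated window.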

Although the language processing stream as a whole surely exhibits parallelism, feedback, and interactivity, the current results support parsimony-based models such as LRM (7), in which one portion of this stream consists of spatiotemporally distinct processes corresponding to levels of linguistic computation. Among the processes identified by these higher-resolution data is grammatical computation, which has been elusive in previous, coarser-grained investigations. As such, the results are also consistent with recent proposals that Broca’s area is not dedicated to a single kind of linguistic representation but is differentiated into adjacent but distinct circuits that process phonological, grammatical, and lexical information (37, 39–41).

Supporting Online Material

Materials and Methods

Figs. S1 to S6

Tables S1 and S2


References and Notes

  1. We use “Broca’s area” to denote the left IFG pars opercularis and pars triangularis [classically, Brodmann areas 44 and 45, but see (30)].
  2. Materials and methods are available as supporting material on Science Online.
  3. The context words (“a” and “to”) prevented participants from simply concatenating the cue and target (a strategy that would succeed in two-thirds of the trials) and helped equalize difficulty across conditions.
  4. Differences in the signals between regular and irregular verbs are not analyzed here [for discussion, see (19)].
  5. Frequency score was the rounded natural log of the combined frequencies of all inflectional forms of a word, plus one.
  6. These factors were largely independent. Word length correlated little with morpheme count (0.267) or frequency (–0.347).
  7. This component may approximate the P600 component often recorded from the scalp (42), but comparisons are difficult because the P600 is generally elicited by errors, in comprehension rather than production experiments.
  8. The exception was that, for nouns, the Overt-Read comparison at ~320 and the Overt-Null comparison at ~450 ms only approached significance (P = 0.08 and 0.06, respectively; one-tailed t test).
  9. We measured the average amplitude of the rectified all-conditions LFP in Broca’s area channels in all patients, in the 150- to 650-ms interval, embracing our components of interest. The response epoch had a higher amplitude than the cue epoch in most (20 of 26) channels, and across all channels was 99% greater. [Patient A yielded a higher amplitude in the response epoch in 7 of 10 channels, on average 71.7% higher; patient B in 7 of 10 channels (+33.6% on average); and patient C in 6 of 6 channels (+191.6% on average)].
  10. LFP components reported here vary by amplitude but not latency or duration; evidently, the processes they index are consistently timed, and other processes [e.g., assembly and enactment of the articulatory plan (8)] produce the differences in response latency.
  11. However, the fine-grained, within-gyrus localization reported here cannot easily be mapped onto the more macroscopic divisions suggested by these authors.
  12. Supported by NIH grants NS18741 (E.H.), NS44623 (E.H.), HD18381 (S.P.), T32-MH070328 (N.T.S.), NCRR P41-RR14075; and the Mental Illness and Neuroscience Discovery (MIND) Institute (N.T.S.), Sackler Scholars Programme in Psychobiology (N.T.S.), and Harvard Mind/Brain/Behavior Initiative (N.T.S.). We heartily thank the patients. We also thank E. Papavassiliou and J. Wu for access to their patients; S. Narayanan, N. Dehghani, M. T. Wheeler, F. Kampmann, and L. Gruber for assistance with intracranial electrophysiological data; R. Raizada for manuscript suggestions; N. M. Sahin; and two anonymous reviewers whose suggestions and encouragement greatly improved this paper.