Integration of Word Meaning and World Knowledge in Language Comprehension

Science  16 Apr 2004:
Vol. 304, Issue 5669, pp. 438-441
DOI: 10.1126/science.1095455


Although the sentences that we hear or read have meaning, this does not necessarily mean that they are also true. Relatively little is known about the critical brain structures for, and the relative time course of, establishing the meaning and truth of linguistic expressions. We present electroencephalogram data that show the rapid parallel integration of both semantic and world knowledge during the interpretation of a sentence. Data from functional magnetic resonance imaging revealed that the left inferior prefrontal cortex is involved in the integration of both meaning and world knowledge. Finally, oscillatory brain responses indicate that the brain keeps a record of what makes a sentence hard to interpret.

Language is used, among other things, to exchange information about the world. This entails that, during online comprehension, the meaning of a phrase or sentence is derived and, in many cases, its truth is verified. For this to be possible, information about the words of a language and about the facts of the world usually needs to be retrieved from memory.

At least since Frege (1, 2), theories of meaning have made a distinction between the semantics of an expression and its truth value in relation to our mental representation of the state of affairs in the world (3, 4). For instance, the sentence “the present queen of England is divorced” has a coherent semantic interpretation, but it contains a proposition that is false in the light of our knowledge in memory that she is married to Prince Philip. The situation is different for the sentence “the favorite palace of the present queen of England is divorced.” Under default interpretation conditions, this sentence has no coherent semantic interpretation, because the predicate is-divorced requires an animate argument. This sentence mismatches with our representation of the world in memory, because the descriptive features of the purported state of affairs are inherently in conflict. The difference between these two sentences suggests the distinction that can be made between facts of the world and facts of the words of our language, including their meaning. Although theories of semantic memory usually do not make this distinction (5), accounts of online language processing often do, and they distinguish between the retrieval and usage of world knowledge and of knowledge of word meaning.

Relative to the distinction between facts of the world and facts of the words of one's language, some aspects of word meaning might be characterized as linguistic in nature, whereas other aspects relate to world knowledge. In linguistic theory, the latter is referred to as the domain of pragmatics, and the former as the domain of semantics. Based on this distinction between semantics and pragmatics (6–8), the semantic interpretation of a sentence is thought to be separate from and to precede the integration of pragmatic or world knowledge information (9). However, a number of researchers (3, 10) have pointed out that the distinction between linguistic meaning and world knowledge is problematic because many words are polysemous, and their meaning can only be fully established by invoking world knowledge (11).

We decided to contribute to settling this issue by providing neurophysiological evidence on the integration of semantic and world knowledge information. The underlying idea is that if a principled distinction can be made between linguistic meaning and world knowledge, concomitant processing differences should be obtained during the interpretation of a sentence.

This study presents electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) data that speak to this issue. While participants' brain activity was recorded, they read three versions of sentences such as “the Dutch trains are yellow/white/sour and very crowded” (the critical words are in italics). It is a well-known fact among Dutch people that Dutch trains are yellow, and therefore the first version of this sentence is correctly understood as true. However, the linguistic meaning of the alternative color term white applies equally well to trains as the predicate yellow. It is world knowledge about trains in Holland that makes the second version of this sentence false. This is different for the third version. This version contains a violation of semantic constraints. The core meaning of sour is related to taste and food. Under standard interpretation conditions, a predicate requires an argument whose semantic features match those of the predicate. For our third sentence, this is clearly not the case, because semantic features related to taste and food do not apply to trains. One could thus argue that for semantic internal reasons, the third sentence is false or incoherent (3, 12). It is our knowledge about the words of our language and their linguistic meaning that poses a problem for the interpretation of the third version of this sentence.

The increased interpretation load of semantic and world knowledge violations is assumed to have an effect on electrophysiological brain activity and on the hemodynamic response. If semantic interpretation precedes verification against world knowledge, the effects of the semantic violations should be earlier and might invoke other brain areas than the effects of the world knowledge violations.

Based on EEG recordings from 29 electrode sites (13, 14), event-related brain potentials (ERPs) were computed and time-locked to the onset of the critical words that embodied the semantic violation, the world knowledge violation, and their correct counterpart. ERPs reflect the summation of the postsynaptic potentials of a large ensemble of synchronously active neurons. They provide a sampling of the brain's electrical activity with a very high temporal resolution. We focused on one particular ERP effect, referred to as the N400. The amplitude of this negative-going ERP between roughly 250 and 550 ms, with a maximum at ∼400 ms, is influenced by the processing of semantic information (15). The easier the match between the lexical semantics of a particular content word and the semantic specification of the context, the more reduced the N400 amplitude will be. The N400 is known to be very sensitive to semantic integration processes (16, 17).
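
The time-locked averaging described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 200-Hz sampling rate, the 150-ms prestimulus baseline, and the synthetic signal (noise plus a negative deflection near 400 ms) are all assumptions made for illustration. Only the 300-to-550-ms measurement window is taken from the text.

```python
import math
import random

def average_erp(signal, onsets, sfreq, tmin=-0.15, tmax=0.6):
    """Average baseline-corrected epochs time-locked to event onsets.

    signal: samples (microvolts) from one electrode
    onsets: sample indices of the critical-word onsets
    sfreq:  sampling rate in Hz (assumed value in the demo below)
    """
    start = round(tmin * sfreq)      # negative: samples before onset
    stop = round(tmax * sfreq)
    n_base = -start                  # number of prestimulus samples
    epochs = []
    for onset in onsets:
        if onset + start < 0 or onset + stop > len(signal):
            continue                 # skip epochs that run off the recording
        epoch = signal[onset + start:onset + stop]
        baseline = sum(epoch[:n_base]) / n_base
        epochs.append([v - baseline for v in epoch])
    n = len(epochs)
    erp = [sum(ep[i] for ep in epochs) / n for i in range(stop - start)]
    times = [(start + i) / sfreq for i in range(stop - start)]
    return times, erp

def mean_amplitude(times, erp, t0=0.3, t1=0.55):
    """Mean ERP amplitude in a latency window (seconds)."""
    window = [v for t, v in zip(times, erp) if t0 <= t <= t1]
    return sum(window) / len(window)

# Synthetic data: Gaussian noise plus a -5 microvolt bump peaking 400 ms
# after each hypothetical critical-word onset.
random.seed(0)
sfreq = 200
signal = [random.gauss(0.0, 2.0) for _ in range(60 * sfreq)]
onsets = list(range(200, len(signal) - 200, 400))
for onset in onsets:
    for i in range(int(0.8 * sfreq)):
        t = i / sfreq
        signal[onset + i] += -5.0 * math.exp(-((t - 0.4) ** 2) / (2 * 0.05 ** 2))

times, erp = average_erp(signal, onsets, sfreq)
n400_amplitude = mean_amplitude(times, erp)
print(f"mean amplitude, 300-550 ms: {n400_amplitude:.2f} microvolts")
```

Averaging across epochs cancels activity that is not time-locked to the critical word, which is why the negative deflection survives while the noise shrinks. An N400 effect would then be the difference in this window measure between a violation condition and the correct condition.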

As expected, the classic N400 effect was obtained for the semantic violations. For the world knowledge violations, we also observed a clear N400 effect. Crucially, this effect was identical in onset and peak latency and was very similar in amplitude and topographic distribution to the semantic N400 effect (Fig. 1). This finding is strong empirical evidence that lexical semantic knowledge and general world knowledge are both integrated in the same time frame during sentence interpretation, starting at ∼300 ms after word onset.

Fig. 1.

Grand average (n = 30 subjects) ERPs for a representative electrode site (Cz) for the correct condition (black line), world knowledge violation (blue dotted line), and semantic violation (red dashed line). ERPs were time-locked to the presentation of the critical words (underlined). Semantic violations resulted in a larger N400 amplitude between 300 and 550 ms than the control condition [F(1,29) = 73.0, P < 0.0001]. N400 amplitudes to world knowledge violations were also larger than for correct controls [F(1,29) = 27.4, P < 0.0001]. The size of the effect was slightly larger for semantic violations than for world knowledge violations (P < 0.05). The onset of the effects for semantic and world knowledge violations was not significantly different (14). Spline-interpolated isovoltage maps display the topographic distributions of the mean differences from 300 to 550 ms between semantic violation and control (left) and between world knowledge violation and control (right). Topographic distributions of the N400 effect were not significantly different between semantic and world knowledge violations (P = 0.9).
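
The degrees of freedom reported above, F(1,29) with 30 subjects, are what a within-subject comparison of two conditions yields: for two paired conditions, the repeated-measures F with (1, n - 1) degrees of freedom equals the squared paired t statistic. The sketch below shows that computation on toy numbers. The function name and the data are hypothetical, and the authors' actual analysis (see their supporting material) may have included additional factors.

```python
def paired_f(cond_a, cond_b):
    """F statistic for a two-condition within-subject contrast.

    cond_a, cond_b: one mean amplitude per subject, in the same subject order.
    Returns (F, (df1, df2)); F equals the squared paired t statistic.
    """
    n = len(cond_a)
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    f_value = mean_d ** 2 / (var_d / n)
    return f_value, (1, n - 1)

# Toy data for four hypothetical subjects (not values from the study).
f_value, df = paired_f([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 1.0, 1.0])
print(f"F({df[0]},{df[1]}) = {f_value:.1f}")  # F(1,3) = 24.0
```

With 30 subjects the same function returns degrees of freedom (1, 29), matching the form of the statistics in the caption.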

In addition, we used the EEG data to investigate oscillatory brain activity in a wide frequency range (1 to 70 Hz) in relation to the semantic and world knowledge violations. Amplitude increases of EEG oscillations in specific frequency bands, such as theta (4 to 7 Hz) and gamma (∼30 to 70 Hz), that are induced by a cognitive event are thought to reflect the dynamic recruitment of the relevant neuronal networks engaged in cognitive processing (18). A wavelet-based time-frequency representation (TFR) of EEG power changes (Fig. 2) revealed a clear gamma peak for the world knowledge violation that was not seen for the semantic violation (19). In contrast, relative to the other conditions, the semantic violation resulted in an increase in power in the theta frequency range. Both effects are visible within the latency range of the N400. Especially at lower frequencies (e.g., at theta frequencies), the temporal resolution of the wavelet transform is relatively poor. This implies that the relative onset difference between theta and gamma activity cannot be taken as a reliable indicator of onset differences in the underlying neurophysiological events. In particular, the latency of the gamma peak seems to be later than the onset and the peak of the N400 ERP response. This suggests that the oscillatory brain responses reflect post-integration processes.
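
A minimal sketch of a Morlet wavelet power estimate may help make the point about temporal resolution concrete. It is an illustration under assumptions, not the authors' analysis: the function name, the fixed width of seven cycles, the 250-Hz sampling rate, and the synthetic 40-Hz burst are all hypothetical. Induced power is commonly estimated by computing power on single trials and then averaging; the sketch handles one trial at one frequency.

```python
import cmath
import math

def morlet_power(signal, sfreq, freq, n_cycles=7.0):
    """Power over time at one frequency via complex Morlet convolution.

    Returns (power, sigma_t), where sigma_t is the temporal standard
    deviation of the wavelet in seconds.
    """
    sigma_t = n_cycles / (2.0 * math.pi * freq)
    half = int(round(3 * sigma_t * sfreq))        # support: +/- 3 s.d.
    wavelet = []
    for k in range(-half, half + 1):
        t = k / sfreq
        envelope = math.exp(-t * t / (2.0 * sigma_t ** 2))
        wavelet.append(envelope * cmath.exp(2j * math.pi * freq * t))
    norm = math.sqrt(sum(abs(w) ** 2 for w in wavelet))
    wavelet = [w / norm for w in wavelet]         # unit-energy wavelet
    n = len(signal)
    power = []
    for i in range(n):
        acc = 0j
        for k, w in enumerate(wavelet):
            j = i + k - half
            if 0 <= j < n:                        # zero-pad at the edges
                acc += signal[j] * w.conjugate()
        power.append(abs(acc) ** 2)
    return power, sigma_t

# Synthetic trial: 1 s at 250 Hz with a 40-Hz burst from 400 to 600 ms.
sfreq = 250
signal = [math.sin(2 * math.pi * 40 * (i / sfreq)) if 0.4 <= i / sfreq < 0.6 else 0.0
          for i in range(sfreq)]

gamma_power, gamma_sigma = morlet_power(signal, sfreq, freq=40.0)
theta_power, theta_sigma = morlet_power(signal, sfreq, freq=6.0)
gamma_peak_time = gamma_power.index(max(gamma_power)) / sfreq
print(f"gamma peak at {gamma_peak_time:.3f} s")
print(f"wavelet s.d.: {gamma_sigma * 1000:.0f} ms at 40 Hz, {theta_sigma * 1000:.0f} ms at 6 Hz")
```

With a fixed number of cycles, the wavelet at 6 Hz spans several times more of the signal than the wavelet at 40 Hz, so theta power is smeared over a much longer interval. This is why onset differences between theta and gamma responses are hard to interpret.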

Fig. 2.

Morlet wavelet–based TFRs of the induced activity for the three experimental conditions. The x axis represents time; the y axis, frequency. Boxes within the TFRs for a representative electrode (Fz) indicate the areas in which clear gamma and theta responses were observed in a latency range overlapping with the N400. Left of the TFRs are the topographic maps of the gamma activity within the indicated area. On the right, scalp topographies are shown for the theta response in the indicated area. The world knowledge violation showed a peak in the gamma frequency range with a frontocentral distribution, with significant increase in gamma power for eight electrodes (t test, two-tailed, P < 0.05). No increases in gamma power were seen in the other conditions. All conditions showed a significant theta response. The theta response was significantly stronger for the semantic violation than for both other conditions [F(1,29) = 4.2, P < 0.05], based on an analysis with left and right temporoparietal and frontal midline electrode sites as the regions of interest. In addition, only the semantic violation showed a frontal midline theta increase, as was found in a peak-surround analysis for midline electrode Fz [F(1,29) = 11.2, P < 0.005].

Gamma oscillations have been associated with feature binding within and across modalities (20, 21). These oscillations are suggested to play an important role in the integration of activity in both local and distributed neural networks (18). It has been suggested that activity in the theta band might involve contributions from the hippocampal complex with, presumably, additional cortical contributions (22). Theta activity has been observed in relation to both episodic and working memory tasks (23). Although the integration of word meaning and world knowledge occurs rapidly and simultaneously, the different oscillatory responses for semantic and world knowledge violations indicate that the brain seems to keep a record of the nature of the integration problem.

Finally, we ran an event-related fMRI experiment using the same materials (24). The fMRI data, time-locked to the onset of the critical words, revealed an increase in activation in the left inferior prefrontal cortex (LIPC), as compared to correct sentences, that was common to both semantic and world knowledge violations (Fig. 3). The activation was observed in, or in the vicinity of, Brodmann's areas (BAs) 45 and 47. This region has previously been reported in connection to semantic processing (25, 26). This region is also known to contain one of the N400 generators, based on evidence from intracranial electrical and magnetoencephalogram recordings (27, 28). Our study provides the first evidence for a role of the LIPC in the integration of world knowledge that is represented in long-term memory, next to the integration of lexical semantic knowledge.

Fig. 3.

The common activation for semantic and world knowledge violations compared to the correct condition, based on the results of a minimum T-field conjunction analysis (supporting online text). Both violations resulted in a single common activation (P = 0.043, corrected) in the left inferior frontal gyrus, in or in the vicinity of BA 45 (with coordinates [x, y, z] = [–44, 30, 8] and Z = 4.87) and BA 47 ([x, y, z] = [–48, 28, –12], Z = 4.15). The cross-hair indicates the voxel of maximal activation and has coordinates [x, y, z] = [–44, 30, 8] (left BA 45). L, left hemisphere; R, right hemisphere.
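
The logic of a minimum-statistic conjunction can be sketched as follows: a voxel counts as commonly activated only if the smaller of its two contrast statistics exceeds the threshold. The function name, voxel labels, t values, and threshold below are hypothetical toy values, and the sketch omits the correction for multiple comparisons that the reported analysis applied.

```python
def min_t_conjunction(t_map_a, t_map_b, threshold):
    """Voxels whose minimum t value across two contrasts exceeds a threshold.

    t_map_a, t_map_b: dicts mapping a voxel label to a t value.
    """
    conjunction = {}
    for voxel in t_map_a.keys() & t_map_b.keys():
        t_min = min(t_map_a[voxel], t_map_b[voxel])
        if t_min > threshold:
            conjunction[voxel] = t_min
    return conjunction

# Toy maps for three hypothetical voxels (not values from the study).
semantic_vs_correct = {"voxel_1": 5.1, "voxel_2": 4.2, "voxel_3": 0.8}
world_vs_correct = {"voxel_1": 4.6, "voxel_2": 1.1, "voxel_3": 3.9}

common = min_t_conjunction(semantic_vs_correct, world_vs_correct, threshold=3.0)
print(common)  # {'voxel_1': 4.6}
```

Taking the minimum ensures that a strong effect in one contrast cannot carry a voxel in which the other contrast shows little or no effect.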

Both word meaning and world knowledge are recruited and integrated very rapidly, within some 400 ms, during online sentence comprehension. The LIPC seems to be critical both in the computation of meaning and in the verification of linguistic expressions. Although Frege (1) made an important distinction between the sense of a proposition and its reference, their processing consequences appear to be immediate and parallel. Our results provide evidence against a nonoverlapping two-step interpretation procedure in which first the meaning of a sentence is determined, and only then is its meaning verified in relation to our knowledge of the world. This is compatible with findings of the immediate influence of visual information (29) and of the preceding discourse on the interpretation of phrases and sentences (30). Semantic interpretation is not separate from its integration with nonlinguistic elements of meaning.

In conclusion, while reading a sentence, the brain retrieves and integrates word meanings and world knowledge at the same time. The LIPC is a crucial area for this integration process. Moreover, it does not take longer to discover that a sentence is untrue than to detect that it is semantically anomalous. However, the oscillatory brain responses suggest that the brain keeps a record of what makes a sentence hard to interpret, whether this is word meaning or world knowledge.

Supporting Online Material

Materials and Methods

Fig. S1

References and Notes
