Grainger et al. (Reports, 13 April 2012, p. 245) suggest that baboons can discriminate words from nonwords on the basis of two-letter (bigram) frequencies. This ability can also be attributed to baboons being able to recognize specific letters (i.e., shapes) in specific positions in their four-letter words, without reference to letter or bigram frequencies.
Grainger et al. (1) present remarkable data on the ability of baboons to learn characteristic patterns in abstract symbols. They and the accompanying commentary by Platt and Adams (2) conclude that the animals are learning to discriminate words from nonwords based on the frequency of two-letter combinations (bigrams), which are highly nonrandom in real English text, as opposed to random letter strings. They propose that this is a skill underlying human reading and that the same ability is present in baboons.
There are two minor methodological problems with Grainger et al.’s study which suggest that, while the learning abilities demonstrated by baboons are impressive, they may be less closely related to the ability of Homo sapiens to read than is suggested by Grainger et al.
First, a small percentage of the words classified in (1) as “nonwords” were actually English words. For example, the “nonwords” list included words in common use (like “blog”), words used recently but no longer common (like “blub”), archaic but still valid words (like “bawd”), and obscure or technical words (like “bosc”). Overall, the list of 7830 “nonwords” contains between 149 and 279 words (about 1.9% to 3.6% of the list, depending on whether obscure words are counted or not). The behavioral data showed that the animals could not distinguish genuine nonwords from actual words that were included in the nonword list (70.0% accuracy in classifying genuine nonwords as nonwords; 69.8% of words in the nonword list classified as nonwords).
Second, and more important, the same learning performance can be obtained without any reference to letter or bigram frequency. Grainger et al. state, “Bigram frequency was minimized in the list of nonwords and maximized in the list of words, so that the word versus nonword discrimination could be made implicitly on the basis of statistical dependencies between letters.” Discrimination can be made in this way, but there are other ways to discriminate between Grainger et al.’s words and nonwords. To illustrate this, I implemented a simple learning algorithm based on letter positions in a tetragram. A scoring function $S = \sum_{i=1}^{4} \sum_{l} w_{li}\, l_i$ was calculated, where $l_i$ is the count of the letter $l$ present at position $i$ in a tetragram, and $w_{li}$ is the weight given to letter $l$ at position $i$ in the tetragram. $w_{li}$ could be $-1$, $0$, or $+1$. A score of $S > 0$ means that the tetragram is classified as a word, a score of $S < 0$ means that it is classified as a nonword, and tetragrams with a score of 0 were randomly assigned to word or nonword. At each learning step, 5% of the weights were chosen at random and changed to $-1$, $0$, or $+1$, again randomly. A “fitness score” $F$ was calculated for the new weight set applied to the training set [equation not reproduced here], where $m(X)$ is the modulus of $X$, $S_{\text{words}}$ and $S_{\text{nonwords}}$ are the scores for words and nonwords as above, and $C_w$ is the number of weights that are not zero. The algorithm iteratively selects new weight sets that have a higher $F$ than any previous set. Implementing this in Excel resulted in rapid convergence on a weight set that could predict whether a tetragram was part of the “word” or “nonword” list (3). Between 40 and 60 letter+position combinations are needed. Figure 1 shows a typical learning curve for the system, and a typical rule.
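The letter-position learner described above can be sketched in Python as follows. This is a minimal sketch, not the Excel implementation: the function names, the number of steps, and the seed are illustrative assumptions. In particular, the exact formula for F is not available in the text, so the `fitness` function below is a stand-in (classification accuracy minus a small penalty per nonzero weight) that only mimics the stated ingredients, namely the scores of words and nonwords and the count of nonzero weights.

```python
import random
import string

POSITIONS = range(4)              # a tetragram has four letter slots
LETTERS = string.ascii_lowercase  # assumes lowercase a-z input


def score(weights, tetragram):
    """S: sum of the weight of the letter found at each position."""
    return sum(weights[(ch, i)] for i, ch in enumerate(tetragram))


def fitness(weights, words, nonwords, penalty=0.001):
    """ASSUMED stand-in for F, not the source's formula.

    Accuracy on the training set (a zero score counts as half correct,
    matching random assignment) minus a penalty per nonzero weight.
    """
    correct = 0.0
    for w in words:
        s = score(weights, w)
        correct += 1.0 if s > 0 else 0.5 if s == 0 else 0.0
    for w in nonwords:
        s = score(weights, w)
        correct += 1.0 if s < 0 else 0.5 if s == 0 else 0.0
    nonzero = sum(1 for v in weights.values() if v != 0)  # C_w
    return correct / (len(words) + len(nonwords)) - penalty * nonzero


def mutate(weights, rng, fraction=0.05):
    """Reset a random 5% of the weights to -1, 0, or +1."""
    new = dict(weights)
    k = max(1, round(fraction * len(new)))
    for key in rng.sample(sorted(new), k):
        new[key] = rng.choice((-1, 0, 1))
    return new


def train(words, nonwords, steps=2000, seed=0):
    """Hill climbing: keep a candidate only if it beats the best F so far."""
    rng = random.Random(seed)
    best = {(ch, i): 0 for ch in LETTERS for i in POSITIONS}
    best_f = fitness(best, words, nonwords)
    for _ in range(steps):
        candidate = mutate(best, rng)
        f = fitness(candidate, words, nonwords)
        if f > best_f:
            best, best_f = candidate, f
    return best, best_f
```

Calling `train(words, nonwords)` with two lists of four-letter lowercase strings returns the final weight table and its fitness. Nothing in the sketch refers to bigrams or to letter frequencies; each weight concerns a single letter at a single position.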
The appearance of learning bigram frequencies was generated by the system because of the structured nature of English spelling. Following Grainger et al.’s supplementary materials, I calculated the number of occurrences of each bigram in these data and summed these frequencies over the bigrams in each word. The summed bigram frequencies were averaged according to whether they belonged to correctly or incorrectly classified words or nonwords. The averages for words and nonwords were highly significantly different (Fig. 2). The words that the system got wrong had bigram frequencies intermediate between words and nonwords.
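The bigram summary described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: bigram occurrences are tallied over whatever list of strings is passed in, regardless of position within the string, and the function names and the `(tetragram, is_word, classified_correctly)` record layout are hypothetical rather than taken from the supplementary materials.

```python
from collections import Counter, defaultdict


def bigram_counts(corpus):
    """Count every adjacent letter pair across all strings in corpus."""
    counts = Counter()
    for item in corpus:
        for a, b in zip(item, item[1:]):
            counts[a + b] += 1
    return counts


def summed_bigram_frequency(item, counts):
    """Add up the corpus counts of the bigrams contained in one string."""
    return sum(counts[a + b] for a, b in zip(item, item[1:]))


def mean_by_outcome(records, counts):
    """Average summed bigram frequency per (is_word, correct) group.

    records: iterable of (tetragram, is_word, classified_correctly).
    """
    groups = defaultdict(list)
    for tetragram, is_word, correct in records:
        groups[(is_word, correct)].append(
            summed_bigram_frequency(tetragram, counts)
        )
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

The classifier's output enters only through the `classified_correctly` flag, so the grouping is computed after classification and feeds nothing back into it.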
This could be interpreted as clear evidence that the system was classifying on the basis of bigram frequencies and therefore found the words with intermediate bigram frequencies confusing. In fact, as described above, no bigram frequency information was used in the classification algorithm.
None of this detracts from the value of Grainger et al.’s work in showing that baboons can classify groups of arbitrary abstract characters, which is an important result in these nonverbal animals. This visually based capacity may well be the same as that underlying elements of human orthographic processing. However, it is not necessarily based on letter frequency or letter pattern recognition, and so may be less closely related to our ability to discriminate likely real words from nonwords than Grainger et al. and Platt and Adams imply.