Genetic Correlates of Musical Pitch Recognition in Humans


Science  09 Mar 2001:
Vol. 291, Issue 5510, pp. 1969-1972
DOI: 10.1126/science.291.5510.1969


We used a twin study to investigate the genetic and environmental contributions to differences in musical pitch perception abilities in humans. We administered a Distorted Tunes Test (DTT), which requires subjects to judge whether simple popular melodies contain notes with incorrect pitch, to 136 monozygotic twin pairs and 148 dizygotic twin pairs. The correlation of DTT scores between twins was estimated at 0.67 for monozygotic pairs and 0.44 for dizygotic pairs. Genetic model-fitting techniques supported an additive genetic model, with heritability estimated at 0.71 to 0.80, depending on how subjects were categorized, and with no effect of shared environment. DTT scores were only weakly correlated with measures of peripheral hearing. This suggests that variation in musical pitch recognition is primarily due to highly heritable differences in auditory functions not tested by conventional audiologic methods.

The perception of pitch requires both the ear, which receives auditory signals, and the brain, which performs substantial processing of auditory signals to produce a perceived pitch (1–3). Although the general features of human pitch processing have been well described, the precise cellular and molecular mechanisms involved remain largely obscure. One approach to understanding the mechanisms of pitch perception is to use genetic methods that exploit naturally occurring variation in pitch perception ability (4). If such variability is due to genetic factors, linkage and positional cloning studies could identify genes that encode the components of the pitch perception apparatus (5). To examine the genetic contributions to musical pitch recognition ability in humans, we performed a twin study (6) using the Distorted Tunes Test (DTT), which requires subjects to recognize notes with incorrect pitch in simple popular melodies (7).

The original DTT was developed in the 1940s (8) and used in large studies in the British population. These studies suggested that cultural biases and the effects of musical experience could be minimized by the appropriate choice of melodies, and that test scores in the same individual were stable across decades. They also revealed that a small but significant portion of the population (about 5%) scored no better than chance in their ability to distinguish correct from incorrect melodies. These individuals were classified as “tune deaf.”

For our study, we created a similar, updated DTT and validated it for use in the current U.S. and British populations (9). The updated DTT was recorded on a compact disk and presented to all subjects in the same setting. Briefly, subjects were presented with 26 short popular melodies, ranging in length from 12 to 26 notes. Tunes were presented once, and after each presentation, subjects were asked to score whether the melody was correct or incorrect, and whether they were familiar or unfamiliar with that melody. We first measured the performance of 50 unrelated males and 50 unrelated females on the updated DTT. The distribution of scores in males and females did not differ (Kolmogorov-Smirnov = 0.94, Mann-Whitney test = 0.78). Test-retest scores in the same subject were highly correlated (n = 40, r = 0.77), confirming that like the original DTT, the updated DTT is reproducible in individuals. In contrast to results obtained by Kalmus and Fry with the original DTT, we did not observe a clear distinction between tune deaf and normal individuals.

Because our goal was to determine the extent to which genes and/or environment influence musical pitch-recognition ability, we chose a twin study, which can discriminate between the effect of the shared environment and that of shared genes. The study was approved by the St. Thomas' Hospital Research Ethics Committee (EC95/041, modification approved 29 September 1999), and informed consent was obtained from all subjects. A total of 284 female Caucasian twin pairs [136 identical (monozygotic, MZ) and 148 nonidentical (dizygotic, DZ)] aged 18 to 74 years from the St. Thomas' UK Adult Twin Registry (10) participated in the study. Subjects in this registry were ascertained from the general population through national media campaigns in the United Kingdom. Participating twins were part of an ongoing study into the genetics of common complex diseases (10–13). Twins were unaware of the specific hypotheses tested and were not screened for IQ, musical training, or musical experience.

The median ages of the MZ group and the DZ group were 50.7 and 47.9 years, respectively. Zygosity was determined by standardized questionnaire, with DNA fingerprinting used for confirmation (10). Each subject was administered the DTT and the 5 Minute Hearing Test (FMHT) to help identify subjects with potentially confounding hearing loss. The FMHT, promulgated by the American Academy of Otolaryngology, has been widely used for initial screening for hearing loss, and high correlations have been reported with a wide range of hearing measures, including pure-tone audiometry (14).

Using the DTT, we measured musical pitch recognition ability on an ordinal scale, scored as the number of correctly classified tunes. The scores of the subjects ranged from 26 (a perfect score) to 9. The distribution of scores in the MZ and DZ pairs is shown in Fig. 1, along with the overall score distribution of the entire sample. Although there was a trend for the MZ twins to have lower scores, this difference was not statistically significant (chi-square, P = 0.12; Kolmogorov-Smirnov, P = 0.26).

Figure 1

Distribution of twin scores on the Distorted Tunes Test. Bars indicate the number of subjects attaining each score on the DTT. MZ, scores of monozygotic twins; DZ, scores of dizygotic twins.

The statistical package STATA (15) was used to analyze the data. Spearman correlations were used to describe the associations between the scores on the DTT and the FMHT. A slight negative correlation between the DTT and the FMHT was observed (Spearman's r = −0.10, P = 0.01). Although statistically significant, a correlation this small may or may not have functional significance. To test whether it affected scores on the DTT, we divided subjects into a hearing and a hearing-impaired group on the basis of their FMHT scores (hearing impaired = FMHT score >15). No difference in DTT score was found between the groups as determined with statistical methods that account for the nonindependence of the twin pairs (generalized estimating equation, P = 0.45). There was no correlation between the DTT and age (Spearman's r = −0.01, P = 0.87). Probandwise concordance rates for the data divided into the two original categories described by Kalmus and Fry were 0.75 for MZ and 0.57 for DZ twin pairs, and a test for the difference was significant (P = 0.01) (16), indicating a genetic influence.
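The probandwise concordance rate quoted above is conventionally computed as 2C / (2C + D), where C is the number of pairs in which both twins are affected and D is the number of pairs in which only one is. The following is a minimal sketch of that arithmetic; the function name and the pair counts are hypothetical and are not taken from the study, which does not report the underlying counts here.

```python
def probandwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Probandwise concordance: 2C / (2C + D).

    concordant_pairs: pairs in which both twins are classified as affected.
    discordant_pairs: pairs in which exactly one twin is classified as affected.
    """
    affected_in_concordant = 2 * concordant_pairs
    total_affected = affected_in_concordant + discordant_pairs
    if total_affected == 0:
        raise ValueError("no affected twins in the sample")
    return affected_in_concordant / total_affected


# Hypothetical counts for illustration only (not data from the study).
rate = probandwise_concordance(concordant_pairs=10, discordant_pairs=10)
print(f"probandwise concordance = {rate:.2f}")
```

A higher rate in MZ than in DZ pairs, as reported in the text, is what one would expect if genetic factors contribute to the trait.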

We applied genetic model-fitting techniques using the structural equation modeling package Mx (17) to obtain estimates of the genetic and environmental factors. Model fitting allows separation of the observed phenotypic variance into additive (A) or dominant (D) genetic components and common (C) and unique (E) environment. E also contains measurement error. The heritability, which estimates the extent to which variation in liability to disease in a population can be explained by genetic variation, can be defined as the ratio of genetic variance (A + D) to total phenotypic variance (A + C + D + E). We tested the significance of A, C, and D by removing them sequentially in specific submodels and testing the deterioration in model fit after each component was dropped from the full model. This leads to a model explaining the data with as few parameters as possible. Standard hierarchical chi-squared tests were used to select the best-fitting model (18).
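The sequential dropping of components described above amounts to a chi-squared difference test between nested models. The sketch below is not the Mx procedure itself; it assumes each submodel drops exactly one parameter (1 degree of freedom) and that fit is summarized by a statistic such as −2 log-likelihood, for which smaller means better fit. The function names and the fit values are hypothetical.

```python
import math


def chi2_upper_tail_1df(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    if x < 0:
        raise ValueError("chi-squared statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


def compare_nested(fit_full: float, fit_reduced: float):
    """Return (difference in fit, P value) when one parameter is dropped.

    Both arguments are fit statistics on the -2 log-likelihood scale
    (an assumption of this sketch); the reduced model cannot fit better
    than the full model, so the difference is floored at zero.
    """
    delta = max(fit_reduced - fit_full, 0.0)
    return delta, chi2_upper_tail_1df(delta)


# Hypothetical fit statistics, for illustration only.
delta, p = compare_nested(fit_full=1500.0, fit_reduced=1500.8)
keep_dropped = p > 0.05  # a non-significant loss of fit favors the simpler model
print(f"delta chi2 = {delta:.2f}, P = {p:.3f}, prefer reduced model: {keep_dropped}")
```

If dropping a component does not significantly worsen the fit, the more parsimonious model is retained, which is how a model with only A and E can emerge as the best fit.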

The genetic and environmental contributions for categorical traits can be quantified by assuming a continuous underlying normal liability distribution with multiple thresholds that discriminate between the categories (18). We estimated the correlation in liability within the twin pairs by deriving polychoric correlations from the pairwise categorical distribution. The model-fitting approach compares the size of the polychoric correlations in MZ and DZ twins and provides estimates of the relative contribution of genetic and environmental factors to the liability distribution underlying musical pitch perception ability (18, 20). Using the original definition of Kalmus and Fry (tune deaf = a score ≤23) on our data resulted in a large percentage of tune deaf individuals (39.6%). Using the updated DTT, we also did not observe a clear distinction between tune deaf and normal individuals. We therefore also fitted the data without imposing any arbitrary cut-off points and assigned categories as the raw scores themselves, with the exception of the few lowest scorers (scores between 9 and 15) that were assigned to one category, resulting in 12 categories. Contingency tables for the 12 categories were produced for the MZ and DZ twins and used as input data for Mx.
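The two categorizations described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the example pairs are invented, and scores below the lowest observed value (9) are simply pooled with the lowest category.

```python
N_CATEGORIES = 12       # pooled low scorers plus one category per score from 16 to 26
POOLED_MAX = 15         # scores up to 15 (observed range 9 to 15) share one category
TUNE_DEAF_CUTOFF = 23   # Kalmus and Fry definition: tune deaf = score <= 23
MAX_SCORE = 26


def score_to_category(score: int) -> int:
    """Map a raw DTT score to one of 12 ordered categories (0 to 11)."""
    if not 0 <= score <= MAX_SCORE:
        raise ValueError(f"score out of range: {score}")
    return 0 if score <= POOLED_MAX else score - POOLED_MAX


def is_tune_deaf(score: int) -> bool:
    """Two-category classification using the original cut-off."""
    return score <= TUNE_DEAF_CUTOFF


def contingency_table(pairs):
    """Cross-tabulate twin 1 category (rows) against twin 2 category (columns)."""
    table = [[0] * N_CATEGORIES for _ in range(N_CATEGORIES)]
    for score_1, score_2 in pairs:
        table[score_to_category(score_1)][score_to_category(score_2)] += 1
    return table


# Invented twin-pair scores, for illustration only.
example_pairs = [(26, 25), (12, 16), (23, 24), (9, 15)]
table = contingency_table(example_pairs)
```

One such table would be built for the MZ pairs and another for the DZ pairs; polychoric correlations are then estimated from each table.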

Analysis of the complete data (with no arbitrary cut-off point) showed a correlation in liability within the twin pairs of 0.67 and 0.44 for MZ and DZ twins, respectively (see Table 1, Model I, for twin correlations in liability on the basis of two-category data).

Table 1

Twin correlations and heritabilities of best-fitting models for the DTT data with two categories (Model I) or the actual scores (Model II). rMZ, monozygotic twin correlation; rDZ, dizygotic twin correlation; h², heritability; 95% CI, 95% confidence interval.


Across the analyses, the model providing the best fit to the data included an additive genetic and a unique environmental component. The heritability as estimated by genetic model-fitting with all of the available data was 71% [95% confidence interval (CI): 61 to 78%]. Using the original cut-off value of Kalmus and Fry (≤23) to define two classes corresponds to a simplified model that contains only two groups: those with normal pitch recognition and those with some deficit in pitch recognition, regardless of severity. Using this model, we estimated heritability at 80% (95% CI: 65 to 90%). In both analyses, no dominant genetic effects and no significant effect of shared environment were detected (Table 1).
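As a worked illustration of the ratio definition of heritability given earlier, (A + D) / (A + C + D + E), the sketch below plugs in the reported point estimates under the best-fitting model with only A and E. It assumes standardized components that sum to 1, so E is taken as 1 minus A. The expected twin correlations use the standard twin-model relations (rMZ = A + D + C, rDZ = 0.5A + 0.25D + C), which are textbook expectations rather than anything stated in this report; the function names are hypothetical.

```python
def heritability(a: float, c: float = 0.0, d: float = 0.0, e: float = 0.0) -> float:
    """Heritability as (A + D) / (A + C + D + E)."""
    total = a + c + d + e
    if total <= 0:
        raise ValueError("total variance must be positive")
    return (a + d) / total


def expected_twin_correlations(a: float, c: float = 0.0, d: float = 0.0):
    """Model-implied MZ and DZ correlations for standardized components."""
    r_mz = a + d + c
    r_dz = 0.5 * a + 0.25 * d + c
    return r_mz, r_dz


# Reported heritability point estimates; E = 1 - A is an assumption of this sketch.
for label, a in (("actual scores", 0.71), ("two categories", 0.80)):
    e = 1.0 - a
    h2 = heritability(a=a, e=e)
    r_mz, r_dz = expected_twin_correlations(a)
    print(f"{label}: h2 = {h2:.2f}, E = {e:.2f}, "
          f"implied rMZ = {r_mz:.2f}, implied rDZ = {r_dz:.3f}")
```

The implied correlations are those of the fitted model, not the observed polychoric correlations, so they need not match the observed values exactly.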

Despite the major role of genetic factors underlying DTT scores, a certain amount of musical experience is nevertheless required to perform well on the DTT, and the original results of Kalmus and Fry provided evidence for the effects of such experience (7, 19). Any effects of culture or musical experience are likely to be the same for both MZ and DZ twins and thus should have no effect on the heritability estimates that we have found. Our data indicate that individual differences in musical experience may be at least partly responsible for the fraction of variance in DTT scores attributable to unique environment (E), 20 to 29%. Because the DTT requires musical experience, it may give a conservative estimate of the heritability of variation in pitch perception in isolation, that is, measured in a way that does not require such experience.

Because the FMHT serves as a rough measure of peripheral hearing, its poor correlation with the DTT suggests that musical pitch perception is largely independent of peripheral hearing and that variation in pitch perception originates in portions of the auditory system that are independent of this function.

Melodies consist of a series of tones presented in a specific order and rhythm, in which successive tones differ by specific intervals. In the DTT, the note order and rhythm remain unchanged, and only the interval between successive tones is altered. Because variation in long-term tonal memory (9) and musical experience appear to have modest effects, the DTT primarily determines a subject's ability to measure successive pitch intervals. Although the DTT has a number of characteristics that make it ideal for a large-scale twin study, it will be important to use other measures of pitch recognition to confirm the results obtained with the DTT. Similarly, indistinguishable distributions of DTT scores in males and females suggest that results from female subjects can be generalized to the whole population, but it will be important to confirm this with male subjects.

The original DTT was designed to identify individuals with severe deficits in pitch recognition. Our results with the modified DTT indicate a high heritability for performance across the full spectrum of pitch recognition abilities in the general population. Studies elsewhere have demonstrated a significant genetic contribution to absolute pitch (AP), a relatively rare phenomenon in which individuals are capable of identifying a particular tone without the use of a reference tone. However, AP individuals do not necessarily have superior pitch acuity compared to those without AP (20). In addition, the full development of AP apparently has a strong environmental component, with specific musical training (perhaps during a critical developmental time period) being required for expression (21). Thus, AP stands in contrast to the ability measured by the DTT, and it is not clear if the genetic factors in AP have any relation to those that underlie DTT scores.

The heritability estimates we observe for this measure of deficits in pitch recognition are very substantial and are as high as or higher than those for many common complex traits in humans (22). However, these high heritability estimates do not address several important issues, including the number of genes involved and their relative effects. Our results indicate that genetic approaches, which are ideal for studying traits with unknown biochemical or cellular mechanisms, are likely to be fruitful in efforts to understand this neural function.

  • Present address: Georgia Prevention Institute, Medical College of Georgia, Building HS-1640, Augusta, GA 30912, USA.

