Interactions of Multisensory Components Perceptually Rescue Túngara Frog Mating Signals

Science  19 Jul 2013:
Vol. 341, Issue 6143, pp. 273-274
DOI: 10.1126/science.1237113

Romancing the Frog

In túngara frogs, the auditory and visual components of mate calling do not naturally occur apart. Taylor and Ryan (p. 273, published online 6 June) now show that two signals that are unattractive to female frogs when presented alone become highly attractive when presented together. In a kind of “perceptual rescue,” the unique combination of two signals increased the receiver's interest in the previously uninteresting signals.


Sexual signals are often complex and perceived by multiple senses. How animals integrate signal components across sensory modalities can influence signal evolution. Here we show that two relatively unattractive signals that are perceived acoustically and visually can be combined in a pattern to form a signal that is attractive to female túngara frogs. Such unanticipated perceptual effects suggest that the evolution of complex signals can occur by alteration of the relationships among already-existing traits.

Human perception of stimuli in multiple sensory modalities can positively influence signal detection, selective attention, learning, and memory (1). One example is “hearing lips and seeing voices” in the McGurk effect (2), which provided the foundation for speech auditory-visual research (3). Studies of multimodal communication in animals have often asked whether individual signal components in different sensory modalities are redundant or carry different information (4), but few studies have investigated how specific interactions influence signal perception (5).

Female túngara frogs base their mate choices on male mating calls. Specifically, males produce calls consisting of a whine alone or they may add up to seven chucks; they do not produce only chucks (6). Females exhibit phonotaxis (movement toward a call, a bioassay of call recognition and preference) to a whine only, but exhibit a fivefold preference for calls with a whine-chuck over a whine only [N = 3662 (11); see also Fig. 1A]. We tested female mate preferences in a series of two-choice tests. Synthetic male vocalizations were broadcast from two speakers, one of which was paired with a robotic frog that provided the visual stimulus of a calling male. Females were released equidistant from the two speakers (with a 60° separation relative to the female release point) and allowed to choose a stimulus. Because our experimental configuration differed from those of previous experiments, we replicated some studies and obtained similar results (Fig. 1, A, B, and D). Females were tested only once.

Fig. 1

Preference responses. Each portion of the figure illustrates the acoustic components of the túngara frog mating call: a whine only [(A, B, and J), right gray], a chuck only [(J), left black], or a whine-chuck (all other calls). The natural whine-chuck is depicted in (A), left black; (C), right gray; (D to G), all acoustic signals; and (I), right gray. The rectangle represents the inflation-deflation cycle of the robofrog’s vocal sac and its temporal relationship to the call [(D) to (J), left black]. The x axis represents 1000 ms, green indicates the significantly preferred stimulus, and red indicates the unpreferred stimulus. In each of the 10 experiments [(A) to (J)], 20 females were given a choice between the signal in black versus the signal in gray. The vertical black and gray bars represent the number of females that chose the respective signal, and the blue dashed horizontal lines represent the null hypothesis of equal preference. Experiments highlighted in the solid blue box are tests of the perceptual rescue versus template-matching hypotheses, and those in the dashed blue box are the test of the component substitution hypothesis. The results of binomial tests are noted as *** = P < 0.001, ** = P < 0.01, * = P < 0.05, ns (not significant) = P > 0.05. The exact P values for each experiment are as follows: (A) P = 0.0003, (B) P = 0.744, (C) P = 0.019, (D) P = 0.034, (E) P = 0.323, (F) P = 0.0049, (G) P = 0.019, (H) P = 0.039, (I) P = 0.583, (J) P = 0.0001.
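The significance notation in the caption can be applied mechanically to the exact P values it lists. The following is a minimal Python sketch of that mapping, not part of the original analysis. The names `P_VALUES` and `significance_label` are hypothetical. The caption does not say how a P value of exactly 0.05 is labeled, so treating it as not significant is an assumption.

```python
# Exact P values as listed in the Fig. 1 caption, keyed by panel letter.
P_VALUES = {
    "A": 0.0003, "B": 0.744, "C": 0.019, "D": 0.034, "E": 0.323,
    "F": 0.0049, "G": 0.019, "H": 0.039, "I": 0.583, "J": 0.0001,
}


def significance_label(p: float) -> str:
    """Map a P value to the caption's star notation.

    Assumption: P == 0.05 is treated as 'ns'; the caption leaves this case open.
    """
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


labels = {panel: significance_label(p) for panel, p in P_VALUES.items()}
significant_panels = [panel for panel, label in labels.items() if label != "ns"]

for panel, label in labels.items():
    print(panel, P_VALUES[panel], label)
```

Under these thresholds, panels B, E, and I are the only comparisons labeled ns.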

The acoustic component of a frog’s mating call is its most distinguishing feature, but visual cues are also associated with the sexual display. Male frogs have inflatable vocal sacs that shuttle air to and from the lungs while calling. Similar to the movement of lips during human speech (2), they are a biomechanical consequence of the sound production system (7), but, as with lips and speech, they can also influence the perception of the call (8, 9). We have shown previously that female túngara frogs prefer a multimodal signal (a call associated with a robotic frog) to a call by itself (10), a result reconfirmed here (Fig. 1D).

In túngara frogs, the temporal relationship between acoustic components influences the signal’s attractiveness (11). When the chuck in a whine-chuck call is displaced by 500 ms, the call becomes merely as attractive as a whine only (Fig. 1B) and less attractive than a normal whine-chuck (Fig. 1C). The temporal relationship between the acoustic and visual components of the signal also influences the signal’s attractiveness [fig. 1, E to G, from (12)]. When the acoustic and visual cues are offset by 100 ms, the visual cue no longer adds to the attractiveness of the acoustic cue (Fig. 1E). When the visual cue is displaced further in time, 200 ms after the call’s onset (Fig. 1F) or immediately after the call’s offset (Fig. 1G), the manipulated multimodal signals are both significantly less attractive than the call alone (12). Thus, displacement of the visual cue can reverse the valence of the multimodal signal (Fig. 1, F and G).

Two hypotheses may explain why the acoustic (chuck) and visual (vocal sac) cues lose salience when temporally displaced from the whine. The template-matching hypothesis predicts that females have an internal neural template of the species’ call that facilitates recognition (13); disrupting components of mating signals will disrupt their recognition by females. An alternative hypothesis, which we term perceptual rescue, posits that stimulus saliency is influenced by the relative and not the absolute relationships of signal components to one another. If so, a temporally disrupted and less attractive stimulus could be rescued by strategic association with another less attractive stimulus, causing females to bind these components into the percept of a more attractive mating signal. Perceptual rescue predicts more flexibility in signal recognition than template matching.

We tested the perceptual rescue hypothesis against the mutually exclusive template-matching hypothesis [in the sense of “strong inference” (14)]. Specifically, we asked whether placing a visual cue between two separated acoustic cues would cause the components to be bound into one coherent signal. To do this, we placed the vocal sac inflation from the relatively unattractive whine-chuck-sac (Fig. 1G, left black) into the gap between the whine and the chuck (Fig. 1B, left black). This generated a whine followed by the vocal sac inflation-deflation, which in turn was followed by a chuck 500 ms after the whine’s offset (Fig. 1H, left black). This multimodal signal was competed against the unimodal whine-gap-chuck, the same acoustic stimulus but lacking a visual stimulus between the whine and the chuck.

Fourteen of 20 females preferred the multimodal signal to the unimodal signal [mid-value P reported throughout (15, 16), P = 0.039, Fig. 1H]. Thus, adding the visual cue to the gap between the whine and the chuck, which is equivalent to adding the chuck to the end of the temporally displaced visual cue, rescued the perceptual effects of these stimuli; it made this stimulus complex more attractive. Further, the addition of the visual cue restored the signal’s attractiveness to that of a normal whine-chuck, also predicted by the perceptual rescue hypothesis (9 responses to the multimodal signal, 11 to the unimodal, P = 0.41, Fig. 1I).
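The mid-value P for the 14-of-20 result can be reproduced with a short calculation. The sketch below is illustrative Python, not the authors' code, and assumes a one-tailed binomial test against a null of equal preference (probability 0.5). The text does not state the tail, but this assumption yields the reported value for Fig. 1H. The function name `binomial_mid_p` is hypothetical.

```python
from math import comb


def binomial_mid_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-tailed mid-P value: P(X > k) + 0.5 * P(X = k) for X ~ Binomial(trials, p_null).

    Assumption: the test is one-tailed in the direction of the observed preference.
    """
    def pmf(k: int) -> float:
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    upper_tail = sum(pmf(k) for k in range(successes + 1, trials + 1))
    return upper_tail + 0.5 * pmf(successes)


# Fig. 1H: 14 of 20 females chose the multimodal signal.
mid_p_h = binomial_mid_p(14, 20)
print(f"{mid_p_h:.3f}")  # 0.039
```

The mid-P correction counts only half the probability of the observed outcome, which makes it less conservative than the exact binomial tail with small samples such as 20 females per experiment.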

These results reject the hypothesis that the negative influence of temporally displaced components on the signal’s attractiveness (Fig. 1, B and C, and E to G) was due to these stimuli not matching a conspecific call template. Rather, these data support the hypothesis that temporally disjunct stimuli can be linked into a common percept of a mating call by their strategic placement in time, even when the resulting stimulus complex has never been experienced by females in nature.

One explanation for our results is that the displaced vocal sac causes the whine and displaced chuck to be perceptually bound. An alternative, the component substitution hypothesis, posits that the vocal sac inflation substitutes for the whine, creating the context for perceiving the chuck as part of the mating signal (i.e., “whine-chuck”) and making the whine itself irrelevant. This is not the case. Females significantly preferred a whine to a vocal sac inflation followed by a chuck (sac-chuck) (P < 0.0002, Fig. 1J). These results further support our interpretation of perceptual binding.

What is the mechanistic basis of perceptual rescue? Humans can group streams of speech into perceptual units, and when sound is interrupted, a continuity illusion can be generated by introducing broadband noise into silent gaps (17). Although one study in frogs failed to find a continuity illusion in the auditory channel (18), our results are consistent with a continuity-type illusion that combines sensory modalities. Thus, one possible mechanism for perceptual rescue is that the presence of the visual cue generates a multisensory continuity illusion.

Although túngara frogs do not produce dissociated acoustic and visual signal components, these manipulations are not as ecologically irrelevant as they might seem. Females are challenged by an auditory world similar to the cocktail party problem in humans (19). At their breeding choruses, they need to perceptually bind the whine and chuck, and assignment of the two acoustic components to their sources is not always accurate (20). The cross-modal interactions we reveal here suggest that the problem might be even more challenging when auditory and visual scene analyses are combined.

These dynamic and context-dependent interactions among multimodal signal components could form a basis for signal recognition, but it would be radically different from the standard template-matching model (13). Our study suggests a need to reconsider the neural basis by which animals recognize signals, to account for these cognitive and perceptual biases that lead to the emergence of hidden preferences. Emergent properties arising from interactions among sensory modalities are not restricted to the recognition of communication signals. In chicks, odor and color can interact to generate an aversive response that does not occur with either component in isolation (21). In this domain as well, there is a psychological response that is hidden when only isolated stimuli are encountered.

Our findings also have implications for understanding receiver psychology (22) and specifically how perceptual processes can drive the evolution of complex signal phenotypes. The interactions we report could facilitate signal evolution with only a few key changes. Different components of a complex phenotype need not arise simultaneously and de novo but could result from temporal or spatial shifts in previously incoherent traits. Put another way, perceptual integration in a multisensory universe may yield emergent psychological percepts that provide the basis for positive selection of complex signals. This type of unanticipated perceptual bias could be responsible for the evolution of some of the extreme and elaborate signals that evolve under sexual selection by mate choice (23, 24).

Supplementary Materials

Materials and Methods

References and Notes

Acknowledgments: We are grateful to the NSF (grants IBN 0517328, IBN 0078150, and IOS 1120031) and the Clark Hubbs Regents Professorship (M.J.R.) for funding and the Smithsonian Tropical Research Institute for logistical support. We thank E. Balaban for helpful discussion and H. Farris, S. Partan, S. Pika, G. Rosenthal, and three anonymous reviewers for their comments on the manuscript. We thank B. Klein and Moey, Incorporated for design and fabrication of the robotic frogs. We are especially grateful to the interns who assisted in data collection. All research reported here complied with IACUC protocols from Salisbury University, the University of Texas, and the Smithsonian Tropical Research Institute. We obtained all required permits from the Government of Panama. Data will be archived in the Dryad Data Repository at