Dengue diversity across spatial and temporal scales: Local structure and the effect of host population size


Science  24 Mar 2017:
Vol. 355, Issue 6331, pp. 1302-1306
DOI: 10.1126/science.aaj9384

Estimating transmission chains for dengue

Dengue virus (DENV) causes a large number of asymptomatic infections, so surveillance captures only a fraction of cases. Salje et al. developed a method for identifying the number of transmission chains of DENV from sequence data and serology. They found that sequential transmission of DENV typically occurs between households in the same neighborhood. Within high-density urban localities, such as Bangkok, there are surprisingly few transmission chains. This results in epidemic spikes within a regional background of endemicity. Large urban settings may thus act as a source of diverse viruses that can be transported elsewhere.

Science, this issue p. 1302


A fundamental mystery for dengue and other infectious pathogens is how observed patterns of cases relate to actual chains of individual transmission events. These pathways are intimately tied to the mechanisms by which strains interact and compete across spatial scales. Phylogeographic methods have been used to characterize pathogen dispersal at global and regional scales but have yielded few insights into the local spatiotemporal structure of endemic transmission. Using geolocated genotype (800 cases) and serotype (17,291 cases) data, we show that in Bangkok, Thailand, 60% of dengue cases living <200 meters apart come from the same transmission chain, as opposed to 3% of cases separated by 1 to 5 kilometers. At distances <200 meters from a case (encompassing an average of 1300 people in Bangkok), the effective number of chains is 1.7. This number rises by a factor of 7 for each 10-fold increase in the population of the “enclosed” region. This trend is observed regardless of whether population density or area increases, though increases in density over 7000 people per square kilometer do not lead to additional chains. Within Thailand these chains quickly mix, and by the next dengue season viral lineages are no longer highly spatially structured within the country. In contrast, viral flow to neighboring countries is limited. These findings are consistent with local, density-dependent transmission and implicate densely populated communities as key sources of viral diversity, with home location the focal point of transmission. These findings have important implications for targeted vector control and active surveillance.

Microscale transmission dynamics and the resulting competitive interactions between strains drive the distribution of infectious diseases in populations. In the past, phylogeographic methods have been used to characterize pathogen dispersal at both regional and global scales, but these methods have provided few insights into the local spatiotemporal structure of endemic transmission (1-6). Dengue virus is a mosquito-transmitted flavivirus grouped into four serotypes (DENV1 to DENV4). Dengue viruses infect more than 300 million people annually, cause more than 20,000 deaths, and have circulated in Southeast Asia for decades (7, 8). Dengue’s main vector, Aedes aegypti, has a limited flight range and often remains within the same household for long periods (9). Dispersal is driven by the complex interplay of the abundance of both vectors and humans, their movement, and population immunity (10-12). The spatial scale of dispersal of dengue viruses may dictate the success of local control efforts. The introduction of novel variants to populations has been a prime determinant of burden for both dengue and other viruses (13-16); thus, it is critical to understand the processes that dictate viral dispersal. The mystery of how dispersal proceeds requires linking patterns of disease incidence to actual pathways of transmission. Pathways of transmission are, in part, characterized by the number of independent transmission chains circulating in an area. For example, the number of introduced versus locally acquired malaria cases has been used as a metric of endemicity (17).

Characterizing these pathways is a particular challenge in the study of endemic pathogens, for which overlapping transmission chains result in many unrelated cases appearing in the same communities at the same time, complicating efforts to understand how a pathogen is propagated and maintained. Surveillance systems typically capture a small fraction of infections (12, 18, 19): It has been estimated that only 12% of symptomatic dengue infections are captured in Thailand, and up to three-quarters of infections are asymptomatic (19, 20). In such settings, phylogenetic approaches can reveal information about the number of circulating chains and the relationship between chains at different spatiotemporal scales and may allow us to investigate key unanswered questions in the epidemiology of endemic pathogens.

We sequenced the viruses of 640 geolocated dengue infections that occurred from 1994 to 2010 in Bangkok and five other locations throughout Thailand (Fig. 1, A and B, figs. S1 to S4, tables S1 to S3) and then combined these with 160 GenBank sequences from elsewhere in Southeast Asia. In addition, we geolocated 17,291 hospitalized cases of dengue where the infecting serotype was known. Cases from Bangkok came from a children’s hospital; patients had a median age of 8 years [interquartile range (IQR): 5 to 11 years]. The cases from outside Bangkok came from tertiary care hospitals; patients had a median age of 10 years (IQR: 7 to 13 years). Approximately half of the patients were female in both settings (table S4). We developed two separate methods to estimate the number of circulating transmission chains (i.e., cases separated by a low number of intervening transmission events) at different spatial scales using sequence data and serotype data. Here we evaluate the effect of individual characteristics (such as age and sex) on the probability of observing cases from the same chain around the residence of a case (21). In addition, we demonstrate how local population density plays a critical role in dictating the number of locally circulating chains and, using microsimulation models, recreate the observed patterns. Finally, we describe dengue’s spread across spatial scales (neighborhood, city, national, and regional), both within a season and across seasons (we consider two cases to come from the same season if symptom onset for one case occurred within 6 months of the other).

Fig. 1 Distribution of cases.

(A) Map of Thailand showing locations of case data (P, Pathum Thani; R, Ratchaburi; H, Hat Yai; L, Lampang; N, Nakhon Ratchasima; B, Bangkok). S, serotype data available; G, genotype data available. MOPH, Ministry of Public Health. (B) Geolocated case data from Bangkok province. In total, there were 7511 DENV1, 4265 DENV2, 3371 DENV3, and 2144 DENV4 cases. (C to F) Maximum clade credibility trees for DENV1 (N = 306 cases), DENV2 (N = 210 cases), DENV3 (N = 157 cases), and DENV4 (N = 127 cases). The colors of the tips represent the source of the virus. BKK, Bangkok. (G) Median spatial distance between virus pairs from Bangkok separated by different total evolutionary times. The shaded area represents 95% CIs. The number of transmission generations separating virus pairs (top axis) is calculated by dividing the total evolutionary time by 20 days, the mean generation time for dengue. The dashed line indicates the median distance between all cases in Bangkok, irrespective of evolutionary relationship.

To determine the evolutionary time between each pair of viruses, we built serotype-specific time-resolved Bayesian phylogenetic trees (Fig. 1, C to F). We used a combination of bootstrapping observations and sampling trees from the posterior to capture sampling and tree uncertainty. Within Bangkok, we find a strong linear relationship between the evolutionary time between viruses and the spatial distance separating the homes of the cases from which they were isolated for up to 1.5 years (<27 transmission events) of evolutionary separation (Fig. 1G). The median spatial distance between pairs of cases separated by <6 months (<9 transmission events) was 670 m [95% confidence interval (CI): 560 to 1250 m].
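The conversion from evolutionary time to transmission generations used for the top axis of Fig. 1G can be sketched as follows. This is a minimal illustration, not the authors' code: the 20-day mean generation time comes from the Fig. 1 caption, while the 365.25-day year and the function name are assumptions made here.

```python
MEAN_GENERATION_DAYS = 20   # mean dengue generation time, from the Fig. 1 caption
DAYS_PER_YEAR = 365.25      # assumption: the source does not state a calendar convention


def transmission_generations(evolutionary_time_years: float) -> float:
    """Approximate number of transmission generations separating two viruses."""
    if evolutionary_time_years < 0:
        raise ValueError("evolutionary time must be non-negative")
    return evolutionary_time_years * DAYS_PER_YEAR / MEAN_GENERATION_DAYS


if __name__ == "__main__":
    for years in (0.5, 1.5):
        print(years, "years ->", round(transmission_generations(years)), "generations")
```

Under these assumptions, 6 months corresponds to roughly 9 generations and 1.5 years to roughly 27, matching the bounds quoted in the text.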

Lineages appear to persist in the local vicinity of a case for up to 6 months. Homotypic (i.e., caused by the same serotype) cases with symptom onset occurring within the same season and living within 200 m of each other have an 82% chance of having a most recent common ancestor (MRCA) in the prior 6 months (versus 46% in the prior 3 months and 7% in the prior 6 to 24 months) (fig. S5). Cases separated by more than 2 km have a 1% chance of having a MRCA in the prior 6 months (versus 0.4% in the prior 3 months and 6% in the prior 6 to 24 months). We therefore consider pairs of cases with symptom onset within the same season to be from the same transmission chain if their MRCA was within 6 months of the case with the earlier onset.
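The pairing rule just described can be written as a short sketch. The two 6-month thresholds come from the text; approximating 6 months as 183 days, the function names, and the use of a single point estimate for the MRCA date are assumptions for illustration (the analysis in the paper propagates tree and sampling uncertainty, which is not shown here).

```python
from datetime import date, timedelta

SIX_MONTHS = timedelta(days=183)  # assumption: "6 months" approximated as 183 days


def same_season(onset_a: date, onset_b: date) -> bool:
    """Two cases share a season if their symptom onsets are within 6 months."""
    return abs(onset_a - onset_b) <= SIX_MONTHS


def same_chain(onset_a: date, onset_b: date, mrca: date) -> bool:
    """Same-season pair whose MRCA falls within 6 months before the earlier onset."""
    if not same_season(onset_a, onset_b):
        return False
    earlier_onset = min(onset_a, onset_b)
    return earlier_onset - mrca <= SIX_MONTHS


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(same_chain(date(2005, 7, 1), date(2005, 9, 15), date(2005, 4, 1)))  # recent MRCA
    print(same_chain(date(2005, 7, 1), date(2005, 9, 15), date(2003, 7, 1)))  # old MRCA
```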

We find that 60% (95% CI: 33 to 73%) of case pairs separated by <200 m in Bangkok were from the same transmission chain, regardless of serotype. This decreases to 19% (95% CI: 11 to 26%) for those <1 km apart (Fig. 2A). These results are robust to broader definitions of what constitutes a transmission chain (i.e., using different MRCA cutoffs) (fig. S6). The rapid, distance-associated decrease in the probability of being part of the same chain provides evidence for focal transmission; that is, sequential transmissions typically occur between households in the same neighborhood (22-24). This is consistent with the limited flight range of the vector and with empirical measurements of human movement showing that people spend most of their time within a few kilometers of their homes (9, 25-27). Although some infection events certainly occur far from a person’s home, the tight relationship between genetic and spatial distances suggests that the majority of infection events occur near the home. This is further supported by a significant relation with age, with the youngest, and presumably least mobile, individuals (those aged <5 years) having a 30% greater probability than older children (>10 years old) of being from the same chain as cases <500 m from their home (95% CI: 16 to 41%) (Fig. 2D). Compared with males, females were also slightly more likely to share a transmission chain with those nearby (Fig. 2E). The apparent focal nature of transmission elucidates the mechanism by which increased susceptibility to severe disease after future infection with heterotypic serotypes might cluster spatially (23). If vaccination functions like a single dengue infection, the spatial scale of likely “priming” infections could tell us where individuals who would benefit from vaccination are most likely to be located (though operationalizing such a strategy may be impractical) (28).

Fig. 2 Spatial relationship between cases.

(A) Proportion of case pairs with patients falling sick within 6 months of each other and coming from the same transmission chain when separated by different spatial distances within Bangkok. The estimates are calculated using either serotype (solid squares) or genotype (open circles) data (21). Error bars represent 95% CIs. (B) Proportion of homotypic (i.e., caused by the same serotype) case pairs with patients becoming ill within 6 months of each other at different distance ranges, where both cases are in Bangkok (blue) or when one is in Bangkok and the other is in another province (purple). Error bars represent 95% CIs. (C) Proportion of homotypic case pairs where both come from the same province (upward-facing triangles) and where they come from different provinces (downward-facing triangles). Blue, Bangkok; red, provinces outside Bangkok. Letters on the x axis represent the provinces from Fig. 1. Error bars denote 95% CIs. (D) Difference in the probability that a case aged either <5 or 5 to 10 years shares the same chain as another case within different spatial distances of their home versus the probability that a case that is aged >10 years shares the same chain within that same distance. The shaded area represents 95% CIs. (E) Difference in the probability that a female case shares the same chain as another case within different spatial distances versus the probability that a male case shares the same chain within that same distance. The shaded area represents 95% CIs.

To extend our methods to settings where sequence data are unavailable, we developed an approach using only serotype data to independently estimate the probability of pairs of cases being from the same transmission chain. We calculate the probability of cases being from the same chain as the excess probability of two cases occurring within some distance of each other during the same season being homotypic compared with the probability of two unrelated cases being homotypic (Fig. 2, A and B). Cases within a season are assumed to be unrelated if they are separated by >10 km, the distance over which the probability of being homotypic remains constant (Fig. 2C and fig. S7). The results of the serotype-based analysis were nearly identical to those of the sequence-based analysis (Fig. 2, A and B).
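One way to make the "excess probability" idea concrete is the sketch below. The exact estimator is given in the supplementary methods (21) and is not reproduced in this text, so the mixture model used here is an assumption: pairs from the same chain are taken to be always homotypic, and all other pairs are homotypic at the baseline rate seen among unrelated cases (>10 km apart). The function name and the input values are hypothetical.

```python
def prob_same_chain_from_serotypes(p_homotypic_near: float,
                                   p_homotypic_unrelated: float) -> float:
    """Excess probability of being homotypic, rescaled to a same-chain probability.

    Assumed model (not taken from the source):
        P(homotypic | d) = c(d) + (1 - c(d)) * p0
    where c(d) is the same-chain probability at distance d and p0 is the
    homotypic probability for unrelated pairs. Solving for c(d) gives
        c(d) = (P(homotypic | d) - p0) / (1 - p0).
    """
    for p in (p_homotypic_near, p_homotypic_unrelated):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if p_homotypic_unrelated == 1.0:
        raise ValueError("baseline of 1 leaves no excess to estimate")
    excess = (p_homotypic_near - p_homotypic_unrelated) / (1.0 - p_homotypic_unrelated)
    return max(0.0, excess)  # sampling noise can push the raw estimate below zero


if __name__ == "__main__":
    # Made-up inputs for illustration; these are not values from the paper.
    print(prob_same_chain_from_serotypes(0.7, 0.4))
```

Under this assumed model, the estimate falls to zero once nearby pairs are no more likely to be homotypic than distant ones.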

We define the reciprocal of the probability that a pair of cases are from the same chain within a particular spatial distance as the effective number of transmission chains circulating within that distance. The effective number of chains represents a theoretical measure of the size of the pool of chains that any pair of cases within a given distance of each other are drawing from. For a sufficiently large population, this is a lower limit on the true number of chains within a particular distance of a case (21). In Bangkok, a mean of 1300 people live within 200 m of a case, and we find that, on average, 1.7 chains (95% CI: 1.4 to 3.0 chains) circulate in this population within a season. There is a linear relation between the logarithm of population size and the logarithm of the effective number of chains (Fig. 3A), with some deviation at small population sizes. In all of Bangkok, we estimate that 160 chains (95% CI: 120 to 230 chains) circulate within a season. Although a similar linear relation exists between the logarithm of population size and the logarithm of the effective number of chains, we find that provinces outside of Bangkok host fewer chains during a season. A subset of Bangkok with population size equal to an outlying province will host 5.6-fold more chains (Fig. 3A). This suggests that we will see an increase in the number of chains as rural communities become more connected.
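The definition above, together with the scaling reported in the abstract (a factor of 7 for each 10-fold increase in population), can be illustrated with a short sketch. The function names and the pure power-law form are assumptions made here; the text notes deviations at small population sizes and saturation at high densities, so the extrapolation is only indicative.

```python
import math


def effective_chains(p_same_chain: float) -> float:
    """Effective number of chains: reciprocal of the same-chain probability."""
    if not 0.0 < p_same_chain <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return 1.0 / p_same_chain


def scaled_chains(population: float,
                  ref_population: float = 1300.0,   # mean population within 200 m of a case
                  ref_chains: float = 1.7,          # effective chains at that scale
                  fold_per_decade: float = 7.0) -> float:
    """Power-law extrapolation (assumed form): chains multiply by
    fold_per_decade for every 10-fold increase in population."""
    exponent = math.log10(fold_per_decade)
    return ref_chains * (population / ref_population) ** exponent


if __name__ == "__main__":
    print(round(effective_chains(0.60), 1))   # 60% of pairs <200 m apart share a chain
    print(round(effective_chains(0.19), 1))   # 19% of pairs <1 km apart share a chain
    print(round(scaled_chains(13000), 1))     # ten times the reference population
```

A same-chain probability of 60% gives about 1.7 effective chains, in agreement with the estimate quoted for the 200 m scale.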

Fig. 3 Transmission chains.

(A) Number of discrete transmission chains circulating within a 6-month period for different mean population sizes within Bangkok (blue) and across other provinces (red), calculated using either serotype or genotype data. Each intra-Bangkok estimate is the mean number of chains for different distances between cases (top axis). The mean population surrounding a case at that distance is on the bottom axis. Letters represent the provinces from Fig. 1. Error bars denote 95% CIs. (B) Number of transmission chains for fixed areas [radius of 0.5 km (green), 1 km (pale blue), or 1.5 km (yellow)] with different population sizes. Blue squares represent the mean number of chains within a fixed area across all population sizes from (A). (C) Number of chains at different population densities (in number × 10^3 per square kilometer) relative to the expected number of chains, regardless of population density for different sized areas. The shaded area represents 95% CIs for an area with a 1.5-km radius. The dashed line indicates a relative risk of 1.0. (D) Number of transmission chains for different-sized areas from simulations of density-dependent endemic transmission in a spatially heterogeneous population of 500,000 individuals where transmission occurs at <50 m and is two times greater in the densest areas (population density of >20,000 individuals per square kilometer) versus elsewhere.

There is substantial heterogeneity in the population density across Bangkok (fig. S8). We hypothesized that the number of chains within any location depends solely on the size of the local population, such that for areas of equal size, increasing population density results in additional chains. This hypothesis appears to hold up to a point. In Bangkok, at densities less than 7000 people per square kilometer, the number of chains circulating in a population of a given size is the same regardless of the size of the area in which they live (Fig. 3, B and C). However, at population densities above 7000 individuals per square kilometer, the number of chains ceases to increase with population size. This is consistent with microsimulation models of disease transmission that include local density-dependent transmission (i.e., increased transmissibility in denser areas) but not simulations with spatially random or local density-independent transmission (Fig. 3D and fig. S9). On the basis of results from our simple modeling framework, we postulate that ecological interactions between virus strains in denser areas may limit the number of circulating strains. This could occur through competition for hosts mediated by immunity from previous infections or other mechanisms (e.g., vector avoidance after infection). In our density-dependent simulations, previously infected individuals were 10 times more likely to become reexposed in densely populated areas versus elsewhere (fig. S10). Competition between strains mediated through attempts to infect the same host or from strain-specific immunity from previous infections suggests that evolutionary pressures are likely to be strongest in these areas. High levels of asymptomatic disease and spatial heterogeneity in health care–seeking behavior indicate that we will only ever observe a small proportion of infections and that the proportion we observe may differ geographically (29). 
We use this modeling framework to demonstrate that our findings are robust to underreporting and spatially biased sampling (fig. S11). Further, the simulations no longer produce spatial clustering consistent with our observations once even a relatively small proportion (>10%) of infections occur away from home (fig. S12), strengthening the case for highly focal dengue transmission in and around homes.

To better understand the broader geospatial dynamics of dengue, we compare the relative risk of infecting strains sharing a MRCA at specific time intervals for increasing spatial scales, from within Bangkok to across Southeast Asia. By only considering virus pairs isolated within a short time (<6 months) of each other in specific locations, we minimize the impact of spatial and temporal sampling biases that can affect phylogeographic analyses (figs. S13 and S14) (30). Bangkok viruses isolated from individuals living <500 m apart were 99 times (95% CI: 41 to 293) more likely to share a MRCA within 6 months of the earlier case compared with two Bangkok viruses isolated from cases >10 km apart (distal Bangkok viruses) (Fig. 4A). The probability of having a recent MRCA drops sharply as the spatial distance between Bangkok viruses increases. Viral diversity is reduced outside the capital, with virus pairs sharing an outlying province being 19 times (95% CI: 2 to 73) more likely to have a recent MRCA than distal Bangkok viruses. Virus pairs with one case located in Bangkok and the other in another province are 0.3 times as likely to have a recent MRCA compared with Bangkok distal viruses (95% CI: 0.1 to 1.4). However, after just a single season (i.e., MRCA between 6 to 24 months) these ratios approach one, which suggests that viral lineages are well mixed across Thailand, both within Bangkok and between provinces (Fig. 4B). The flow of virus across Thailand’s borders appears to be much more limited. Our data set contains no virus pairs with a MRCA within 6 months of the earlier case when one virus comes from Thailand and another from elsewhere in Southeast Asia (i.e., Vietnam, Cambodia, Malaysia, Myanmar, or Singapore). Even the probability of having a MRCA 2 to 5 years earlier for Bangkok–Southeast Asia pairs is only 0.11 times that for distal Bangkok viruses (95% CI: 0.02 to 0.4). 
Overall, virus pairs were as likely to be from across countries (Bangkok–Southeast Asia pairs) as from distal parts of Bangkok only when separated by more than 8 years of evolutionary time (fig. S15) (findings are similar when using viruses from throughout Thailand; see fig. S16). These findings provide strong evidence that Thailand has endemic transmission with limited connection to the rest of Southeast Asia. Thai borders may not be sufficiently porous to facilitate easy viral movement. In addition, sick individuals may be less likely to travel internationally. Recent work has demonstrated high correlation in dengue incidence throughout the Southeast Asian region, with peaks during extreme climate years (31). Our findings support the view that ecological and environmental similarity, rather than viral flow, determines this synchrony.
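The relative risks quoted above compare the proportion of virus pairs with a MRCA in a given window against the same proportion among distal Bangkok pairs (>10 km apart). A minimal sketch of that ratio follows; the function name and the pair counts are hypothetical, and the bootstrap and posterior-tree resampling behind the reported 95% CIs is not shown.

```python
def relative_risk(n_recent_group: int, n_pairs_group: int,
                  n_recent_reference: int, n_pairs_reference: int) -> float:
    """Ratio of the proportion of pairs with a MRCA in the window of interest
    in a comparison group to that proportion in the reference group
    (here, distal Bangkok pairs)."""
    if n_pairs_group <= 0 or n_pairs_reference <= 0:
        raise ValueError("pair counts must be positive")
    reference_proportion = n_recent_reference / n_pairs_reference
    if reference_proportion == 0:
        raise ValueError("reference proportion is zero; ratio is undefined")
    return (n_recent_group / n_pairs_group) / reference_proportion


if __name__ == "__main__":
    # Hypothetical counts for illustration; these are not data from the paper.
    rr = relative_risk(n_recent_group=30, n_pairs_group=100,
                       n_recent_reference=3, n_pairs_reference=1000)
    print(round(rr, 1))
```

A ratio near 1 for a given window, as reported for MRCAs 6 to 24 months back, indicates that the comparison pairs are about as related as distal Bangkok pairs over that timescale.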

Fig. 4 Relative risk that a pair of viruses have a MRCA within a defined period.

Each point represents the risk that a pair of viruses isolated from particular cases (i.e., those with patients that fell ill within 6 months of each other and live a particular spatial distance apart) have a MRCA within a defined evolutionary timeframe (g1-g2) relative to the risk that a pair of distal Bangkok cases (defined as two cases from Bangkok separated by >10 km) have a MRCA in the same g1-g2 range. Each panel represents a different g1-g2 range: (A) MRCA <6 months (i.e., g1 = 0, g2 = 6 months), (B) MRCA 6 months to 2 years, (C) MRCA 2 to 5 years, and (D) MRCA 5 to 10 years. “Intra-prov” refers to pairs of viruses that both come from the same province outside Bangkok. “Inter-prov” refers to cases where one virus comes from Bangkok and the other from a different province. “SE Asia” indicates when one virus is from Bangkok and the other from another country in mainland SE Asia (Vietnam, Malaysia, Singapore, Cambodia, or Myanmar). Error bars represent 95% CIs. Solid squares, Bangkok viruses; solid circles, Thai viruses from provinces outside Bangkok; solid triangles, Southeast Asian viruses outside Thailand. The open triangle in (A) represents a value of 0.

By linking the distribution of case occurrence to the biological and ecological processes (transmission chains and competition) from which the cases arise, our work moves beyond previous findings that showed spatial clustering in dengue cases (23, 24, 32, 33). We also draw connections between small-scale patterns and larger trends in dengue dispersal across the region. Further, we illustrate two independent, robust methods for revealing the spatial structure of transmission for endemic disease. Our finding that viral diversity increases with host population density supports a role for large urban settings as sources of a diverse set of viruses that could be dispersed elsewhere. The saturation of diversity at high host densities suggests that these dense areas may also be regions of intense competition between viruses, possibly contributing disproportionately to viral evolutionary pressures. For pathogen systems with multiple strains, our approach to estimating the number of circulating transmission chains within a community may provide a key surveillance tool for detecting changes in diversity accompanying expansion of particular types, characterizing differences in fitness between lineages, or identifying populations that act as sources of viruses to other populations. Insights into microscale structure of endemic transmission are important for building policy-relevant models of pathogen spread to appropriately target interventions. Our results illustrate that the structure of transmission is consistent with only certain assumptions about dengue transmission and provide empirical evidence of the importance of home location in dengue risk, supporting a role for targeted vector control around the residences of detected cases.

Supplementary Materials

Materials and Methods

Figs. S1 to S16

Tables S1 to S4

References (34-43)

References and Notes

  1. Materials and methods are available as supplementary materials.
Acknowledgments: We would like to recognize funding from the National Institute of Allergy and Infectious Diseases (grants R01 AI102939-01A1 and R01AI114703-01); the National Science Foundation (grant BCS-1202983); and the Global Emerging Infections Surveillance and Response System (GEIS), a Division of the Armed Forces Health Surveillance Center. The funding bodies did not participate in the design of the study; the collection, analysis, and interpretation of the data; or the writing of the manuscript. This study was approved by the ethical review boards of Queen Sirikit National Institute of Child Health, Walter Reed Army Institute of Research, and Johns Hopkins Bloomberg School of Public Health. Dates of illness and infecting serotype data were obtained from the results of standard confirmatory testing for dengue and therefore did not require informed consent. Ethical approval was obtained for identifying the location of case homes. Because this is considered personally identifiable information, individuals interested in gaining access to these data should contact H.S. to organize obtaining ethical clearance. R code for the analyses is available on GitHub. GenBank accession numbers for the newly sequenced viruses are KY586306 to KY586946. Alignments and phylogenetic trees are available on TreeBase. Disclaimer: Material has been reviewed by the Walter Reed Army Institute of Research. There is no objection to its presentation and/or publication. The opinions or assertions contained herein are the private views of the authors and are not to be construed as official or as reflecting true views of the Department of the Army or the Department of Defense.
