Social genes are selection hotspots in kin groups of a soil microbe


Science  22 Mar 2019:
Vol. 363, Issue 6433, pp. 1342-1345
DOI: 10.1126/science.aar4416

Swarming in parallel toward sociality

The evolution of social behavior, and specifically of multicellularity, is poorly understood. An experimental model for multicellularity is the myxobacteria, which swarm and cooperate to form fruiting bodies in soil. Wielgoss et al. studied lineages of wild-caught myxobacteria. They found diversity, but also surprising genetic similarity, within fruiting bodies that was unlikely to be based on shared ancestry between them. Instead, mutations in the same genes seem to have arisen repeatedly and independently. These mutations have then been selected to confer similar phenotypes that converge on social behavior.

Science, this issue p. 1342


The composition of cooperative systems, including animal societies, organismal bodies, and microbial groups, reflects their past and shapes their future evolution. However, genomic diversity within many multiunit systems remains uncharacterized, limiting our ability to understand and compare their evolutionary character. We have analyzed genomic and social-phenotype variation among 120 natural isolates of the cooperative bacterium Myxococcus xanthus derived from six multicellular fruiting bodies. Each fruiting body was composed of multiple lineages radiating from a unique recent ancestor. Genomic evolution was concentrated in selection hotspots associated with evolutionary change in social phenotypes. Synonymous mutations indicated that kin lineages within the same fruiting body often first diverged from a common ancestor more than 100 generations ago. Thus, selection appears to promote endemic diversification of kin lineages that remain together over long histories of local interaction, thereby potentiating social coevolution.

The collective output of cooperative systems is often greatly affected by behavioral variation among lower-level units, such as among interconnected cells in organic tissues or among whole organisms within social groups (1). Such effects can be detrimental; for example, when organisms succumb to cancerous cell growth (2) or microbial populations collapse as a result of social cheating (3). Within-system variation can also have positive effects; for example, in generating mutual benefits from division of labor (4). To fully understand the evolution of cooperation and conflict in multiunit living systems, it is essential to characterize genetic determinants of lower-level variation, including genomic targets of selection (5), as well as the spatiotemporal potential for coevolution among lower-level lineages (6).

We have analyzed the genetic basis of diversity within distinct social groups of the bacterium Myxococcus xanthus. This organism is suitable for such analysis because distinct groups of cooperating individuals form multicellular, spore-bearing fruiting bodies in soil, which can be isolated with relative ease (7). Cells of this predatory species use two motility systems to cooperatively swarm across surfaces in search of microbial prey (3). In natural populations, genetic relatedness, a key parameter of kin-selection theory (8), and patterns of kin discrimination are highly structured, suggesting that most cooperative interactions occur between very closely related cells and that migration among genetically differentiated social groups is limited (7, 9–11). Nonetheless, substantial diversity of social phenotypes has been documented among closely related clones derived from the same fruiting body (7).

We examined genomic variation, as well as how this variation relates to social phenotypes, among 20 clones independently isolated from each of six fruiting bodies collected from natural soil samples (fig. S1). Soil from three undisturbed woodland sites near Bloomington, Indiana, United States [Greg’s House (GH), Kent Farm (KF), and Moores Creek (MC)] yielded M. xanthus fruiting bodies during laboratory incubation, from which individual fruiting bodies were isolated and heated to eliminate nonspores. Entire surviving spore populations were germinated and grown to high density before frozen storage, clone isolation (7), phenotypic characterization (7, 12), and genome sequencing of clonal samples. Because some growth occurred during the isolation procedure, we performed isolation controls on sterile soil beginning with single clones to estimate how much genetic variation in our sequencing dataset might have arisen during isolation (13).

Each set of fruiting-body isolates was found to be a recently diversified kin group composed of 3 to 10 genotypes on lineages radiating from a unique ancestral genotype (Fig. 1). Forty-six total mutations were identified among all sequenced descendants of the six group ancestors (tables S1 to S4), 24 of which were in nine lineages with multiple mutations (Fig. 1 and table S5). Ancestral genotypes were inferred by comparing polymorphisms within and among groups (tables S6 to S11) in the context of broader phylogenetic relationships (fig. S2). Notably, our 120 isolation-control genomes showed only a small fraction of the synonymous mutations [expected to generally be neutral (14)] present in the 120 natural-isolate genomes (1 versus 8, ~13%), as well as a similarly small fraction of total mutations (8 versus 46, ~17%) (Fig. 1, fig. S3, and tables S12 and S13). From this, we infer that a large majority of mutations in the original isolates were present in cells in soil before sampling, whereas a small minority are likely to be attributable to mutation during growth in the laboratory isolation procedure.

Fig. 1 Phylogenetic networks of six Myxococcus xanthus fruiting-body kin groups.

Distinct genotypes (circles) are separated by one or more mutational steps, with each connecting line between two adjacent circles representing one mutation. Ancestral genotypes are represented by central-hub circles with a thick border. Circle sizes reflect the number of clones of each genotype (labeled c01 to c48), and small black circles denote genotypes not present among the sampled genotypes (labeled G01 to G46). Green and orange circles, respectively, indicate significantly increased or decreased swarming speed relative to the group ancestor, pink circles indicate that swarming rate and spore production both decreased, and blue circles represent increased swarming associated with decreased spore production. Mutations in selection-hotspot loci are represented in large, bold font; synonymous changes are marked by asterisks. Brackets indicate that the temporal order of specific mutations on the respective lineages is unknown. Branches where selection hotspot mutations are associated with phenotypic changes are colored the same as the corresponding genotypes. Reference genomes sequenced to closure are underlined.

Nearly half of all mutations (20 of 46, 43%) in the natural isolates were found in six genes and a CRISPR locus that were each mutated distinctly in more than one lineage (Fig. 1, Table 1, and tables S14 to S16). Three protein-coding genes (Mxan_5852, rpsA, and rfbA) showed changes in multiple lineages derived from the same fruiting body; i.e., within kin groups KF3.2.8, KF4.3.9, and MC3.5.9, respectively (Fig. 1 and tables S2 and S16). Additionally, a type IC CRISPR-Cas system (15) evolved independently in six lineages of MC3.5.9 (Fig. 1 and fig. S4). CRISPR-Cas systems typically provide immunity against invasive genetic elements and evolve by consecutive additions of spacers (15). Whereas such spacers often derive from phage or plasmids (16), all eight newly acquired spacers in the MC3.5.9 lineages appear to derive from bacterial genera outside the myxobacteria (fig. S4), which suggests that M. xanthus is acquiring spacers from its prey. Finally, three genes (Mxan_6755, frzF, and rfbB) each incurred distinct mutations in two lineages sampled from different fruiting bodies (Fig. 1 and tables S2 and S16).

Table 1 Comparison of mutations at selection hotspots versus other loci.

SNPs, single-nucleotide polymorphisms.


Assuming, for simplicity, that mutations occurred only in the 6186 genes of the ~7.15-Mb core genome (fig. S5 and tables S17 and S18), the probability of 2 of the 40 non-CRISPR mutations randomly occurring in the same gene is very small (P = ~2.7 × 10⁻⁵, exact binomial test). The chance of randomly observing 14 of those 40 mutations in the six genes that are mutated distinctly in at least two isolates is infinitesimally small [P = ~5.9 × 10⁻³³, exact binomial test (13)]. This nonrandom distribution of mutations is compelling evidence that selection has strongly shaped the observed genetic variation (14), and we thus refer to loci that evolved multiple times as “selection hotspots” (17). The complete absence of synonymous changes at selection hotspots (0 of 14 mutations, excluding CRISPR insertions) stands in stark contrast to the observation that 31% of mutations at non-hotspot loci (8 of 26) were synonymous (P = 0.034 for difference, Fisher’s exact test). This observation further indicates that selection has targeted these hotspots. Notably, no gene was mutated more than once among the 120 isolation-control genomes (table S13), and none of the eight mutations in those controls (table S13) occurred in a locus mutated among the natural isolates (table S2).
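The comparison of synonymous changes at hotspot versus non-hotspot loci can be checked from the counts given in the text alone. The following is a minimal sketch, not the authors' analysis code: it assumes a two-sided Fisher's exact test on the 2×2 table of synonymous versus other mutations (0 and 14 at hotspots, 8 and 18 elsewhere), and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d      # row totals
    col1 = a + c                   # first-column total
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one
    # (small tolerance guards against floating-point ties)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Rows: hotspot loci, non-hotspot loci (assumed layout, counts from the text)
# Columns: synonymous mutations, all other mutations
p_value = fisher_exact_two_sided(0, 14, 8, 18)
print(f"P = {p_value:.3f}")
```

Under these assumptions the result agrees with the reported P = 0.034. The exact binomial P values quoted above depend on per-gene probabilities described in the supplementary materials (13) and are not reproduced here.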

Most mutations in selection hotspots (again excluding CRISPR) are associated with phenotypic changes relative to group ancestors (12 of 14, 86%) (Fig. 1), which further suggests adaptive evolution. More broadly, among the 34 sampled genotypes that evolved from the six group ancestors, 18 (53%) showed a significantly different rate of soft-agar swarming [predominantly driven by the type IV pili (TFP) motility system (18)] than their group’s ancestor [8 faster and 10 slower; P < 0.05 based on one-way analyses of variance (ANOVAs) and post-hoc Tukey tests] (Fig. 1 and fig. S6). Moreover, six evolved genotypes showed reduced sporulation relative to their group ancestor (Fig. 1 and fig. S7) (P < 0.05, one-way ANOVAs and post-hoc Tukey tests), which, notably, was in all cases associated with evolutionary changes in swarming rate, five of which were in the same direction (negative). Of 17 instances of significant phenotypic evolution (excluding genotype G39, which has the same phenotype as G38), 12 (71%) were associated with a non-CRISPR hotspot mutation. Further, all six non-CRISPR hotspots were associated with significant social-phenotype evolution in at least one lineage. The nonrandom association between social-trait evolution and selection hotspots among our natural isolates further strengthens the view that selection promoted much of the observed genomic diversity and strongly targets genes that contribute to social traits. No phenotypic changes were associated with any of the eight mutations found among the 120 isolation-control genomes (figs. S8 and S9) (P > 0.05, one-way ANOVAs and post-hoc Tukey tests), suggesting, together with additional controls from a previous study (7), that most or all of the phenotypic evolution within the original fruiting-body groups occurred before soil sampling.

Several hotspot loci were previously known to affect M. xanthus development and/or TFP-mediated social motility. These include the histidine kinase gene Mxan_5852 (19), the O-antigen-transport operon rfbABC (20), and the methyltransferase gene frzF, which contributes to control of cell-reversal frequency (21). Intriguingly, both frzF and Mxan_5852 also evolved in parallel in multiple experimental lineages of M. xanthus under selection while swarming (22).

We experimentally tested the functional significance of one genetic polymorphism associated with large phenotypic differences between two isolates that differ at rfbA. The distinct rfbA alleles in MC3.5.9c15 and MC3.5.9c29 are associated with faster and slower swarming, respectively, as well as with elevated and reduced spore production, respectively. When the native allele of each strain was altered to the other allele by clean genetic exchange, the relative swarming rates and sporulation levels of the resulting mutants strongly reversed rank in comparison with the parental strains (Fig. 2 and figs. S10 and S11).

Fig. 2 Phenotypic effects of exchanging distinct rfbA alleles between two divergent clones within the same fruiting body group (MC3.5.9).

Clones c15 (green) and c29 (pink) differ substantially in swarming rates (left wild-type panels), ability to form fruiting bodies (right wild-type panels), and sporulation. A clean genetic exchange of rfbA alleles reversed these phenotypes (see also figs. S10 and S11).

Inferring the significance of phenotypic variation among microbes observed in the laboratory for fitness in complex natural habitats is notoriously challenging. Nonetheless, the patterns of diversity we have documented at least suggest that the selective forces shaping this diversity were, in some cases, similar among different kin groups but, in other cases, were idiosyncratic. Three selection hotspots (Mxan_6755, frzF, and rfbB) were mutated in two different fruiting-body groups (Fig. 1), with one instance of combined genotypic-phenotypic convergence in which mutations in the rfbABC operon were seen to reduce the swarming rate in lineages of both MC3.5.9 and GH5.1.9. However, for the cross-group hotspot frzF, the two mutations in this locus were associated with directionally opposite changes in swarming rate (Fig. 1). Three of four within-group selection hotspots were specific to a single kin group (i.e., Mxan_5852 in KF3.2.8, rpsA in KF4.3.9, and the CRISPR locus in MC3.5.9), further indicating that idiosyncratic selection had occurred. Additionally, each group’s qualitative pattern of phenotypic evolution (or lack thereof in GH3.5.6) was distinct, and even qualitatively similar phenotypic changes (e.g., increases in swarming rate) often differed in their genetic basis among groups.

Owing to their general neutrality and approximately linear accumulation over time (14), synonymous mutations provide insight into the relative evolutionary ages of the natural kin groups. Only one synonymous mutation was found among the 120 clones from six isolation-control fruiting bodies, whereas eight were found among the 120 clones isolated from fruiting bodies on natural soil samples, indicating an approximately eightfold difference in the average coalescence age (i.e., generations since divergence from a common ancestor) of the original versus control kin groups. We estimate that the control lineages underwent ~33 generations of growth during the isolation protocol (13), indicating that the cell groups that founded our focal fruiting bodies from natural soil were, on average, ~230 generations removed from their respective group’s most recent common ancestor at the time the soil was sampled (although these ages should vary among fruiting-body groups, given their different network structures). This average coalescence-age estimate is approximately fivefold greater than the ~44 generations of cell doublings needed for an adult human body composed of ~4 × 10¹³ cells to develop from a zygote and resequester the germ line (23). Thus, it appears that functional multicellular cooperation in myxobacteria can be maintained over generational periods between purges of diversity from local lineage clusters that are substantially longer than periods between single-cell purges in metazoan lineages.
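The arithmetic behind the ~230-generation estimate is not spelled out in the text. One reading that lands near that figure is sketched below; it assumes that the natural isolates went through the same ~33 generations of isolation growth as the controls, so that this amount is subtracted from their total. That subtraction is an assumption made here for illustration, and the full method is in the supplementary materials (13). All variable names are hypothetical.

```python
# Counts and estimates taken from the text
synonymous_natural = 8     # synonymous mutations in 120 natural-isolate genomes
synonymous_control = 1     # synonymous mutations in 120 isolation-control genomes
control_generations = 33   # estimated growth during the isolation protocol
human_doublings = 44       # cell doublings quoted for human development (23)

# Synonymous mutations are treated as accumulating linearly with generations
fold_difference = synonymous_natural / synonymous_control
total_generations = fold_difference * control_generations

# ASSUMPTION: natural isolates also experienced the isolation growth,
# so only the remainder is attributed to divergence in soil
generations_in_soil = total_generations - control_generations

ratio_to_human = generations_in_soil / human_doublings

print(f"fold difference in coalescence age: {fold_difference:.0f}")
print(f"generations before sampling: {generations_in_soil:.0f}")
print(f"ratio to human doublings: {ratio_to_human:.2f}")
```

With these inputs the sketch gives 231 generations before sampling, close to the ~230 stated, and a ratio of about 5.25 to the ~44 doublings cited for human development, consistent with "approximately fivefold".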

Our findings, combined with those of previous studies (7, 9–11), suggest that myxobacteria do not neatly fit the classification of multicellular systems into “staying together” or “coming together” types (24). Rather, our data suggest that endemically diversified, yet closely related, lineages of myxobacteria stay together spatially long enough to potentially allow multiple rounds of coming together for development and intermittent motility without extensively intermixing among genetically differentiated lineage clusters (7). Such extended spatial association allows diversified lineages within groups to repeatedly interact and potentially coevolve. This model of the spatiotemporal dynamics of M. xanthus diversity is strengthened by several features of natural isolates that are expected to reduce migration among kin-group clusters (3). These features include cell-cell adhesion (25), pervasive intercolony kin discrimination that can reduce social mergers (9, 11), and strong positive frequency dependence of fitness between genotypes from different fruiting bodies (10).

Several modes of selection operating at the level of individual cells could promote diversity within lineage clusters. These include shifting fitness ranks among spatiotemporally variable environments (26) and frequency-dependent fitness relationships (27), such as social cheating (3) and nontransitivity (e.g., “rock-paper-scissors” fitness scenarios) (28). However, selection might also operate at a higher level among multicellular collectives, such as swarming colonies (4) or fruiting bodies (12). Such selection operating among internally diverse cell-lineage clusters that are differentiated at the group level (6) could favor some combinations of alleles that interact between same-group members to affect collective fitness (social epistasis) relative to other group-level allelic states. Consistent with this hypothesis, a recent study has shown that interactions between M. xanthus clones sampled from the same fruiting body in soil often increase total group spore productivity, whereas mixing clones from distinct fruiting bodies tends to reduce productivity (12). These observations point to selection disfavoring within-group allele combinations that decrease group-level productivity. Given the potential for combinations of diversified kin lineages to remain together over long periods, selection can operate on genotype-combination effects, whether negative or positive, in ways that it cannot when groups are regularly broken up. This may result in synergistic coevolution among within-group lineages (6). Finally, social epistasis may also contribute to the evolution of Dobzhansky-Muller–like social incompatibilities among lineage clusters (29) and the divergence of developmental programs (30).

Supplementary Materials

Materials and Methods

Figs. S1 to S11

Tables S1 to S21

References (32–62)

References and Notes

  1. See supplementary materials.
Acknowledgments: We thank H. Keller and C. Capelli for technical assistance; A. Patrignani and G. Russo (Functional Genomics Center Zürich) for PacBio sequencing and assembly; C. Beisel (Genomics Facility Basel) and S. Engledow (Oxford Genomics) for Illumina sequencing; S. Zoller, A. Minder Pfyl, and S. Kobel (Genetic Diversity Center, ETH Zürich) for bioinformatics and lab support; Y.-T. N. Yu for experimental advice; and S. O’Brien, S. Pande, J. Stapely, M. Vasse, and D. Queller and all other reviewers for helpful comments on the manuscript. Funding: This work was supported by an EU-Marie-Curie PEOPLE Postdoctoral Fellowship (FP7-PEOPLE-2012-IEF-331824) to S.W. and an SNSF grant (31003A_16005) to G.J.V. Author contributions: S.W. and G.J.V. designed and supervised the research and provided financial support; G.J.V. provided laboratory materials and lab strains; S.W., R.W., F.F., and G.J.V. conducted the experiments; S.W. and R.W. conducted clean allele exchange experiments and construct phenotyping; S.W. performed additional phenotyping of natural isolates; S.W., F.F., and G.J.V. performed isolation control experiments; F.F. conducted phenotyping of control isolates; S.W. handled and analyzed the genomic data and prepared the figures; S.W., R.W., L.S., and G.J.V. applied statistical and computational evaluation of the data; and S.W. and G.J.V. wrote the manuscript, with input from all coauthors. All authors reviewed and commented on the manuscript. Competing interests: The authors declare no competing interests. 
Data and materials availability: We deposited the following sequences at GenBank under the listed accession numbers: whole reference-genome sequences (CP017169 to CP017174); whole-genome raw reads (all Sequence Read Archive numbers are part of BioProject accession PRJNA342411 and listed in tables S19 to S21: SRR8298021 to SRR8298026; SRR8299451 to SRR8299564; SRR8300452 to SRR8300577; and SRR8357276 to SRR8357325); and Sanger-sequencing data for CRISPR-Cas arrays (MK446156 to MK446163). All other raw data and code are accessible at (31).
