Research Article

Uncovering the essential genes of the human malaria parasite Plasmodium falciparum by saturation mutagenesis

Science  04 May 2018:
Vol. 360, Issue 6388, eaap7847
DOI: 10.1126/science.aap7847
  • Saturation-scale mutagenesis of P. falciparum reveals genes essential and dispensable for asexual blood-stage development.

    (Top) A high-resolution map of a ~50-kb region of chromosome 13 depicts an essential gene cluster, including K13, that lacks insertions in the coding DNA sequence (CDS) but is flanked by dispensable genes with multiple CDS-disrupting insertions. (Left) The MIS rates the potential mutability of P. falciparum genes based on the number of recovered CDS insertions relative to the potential number that could be recovered through large-scale mutagenesis. (Right) The MFS rates the relative fitness of P. falciparum genes based on QIseq scores of transposon insertion sites in each gene.

  • Fig. 1 A genome-wide saturation mutagenesis screen for Plasmodium falciparum.

    (A) Chromosomal map displays 38,173 piggyBac insertion sites from all mutants evenly distributed throughout the genome. (B) High-resolution map of a ~50-kb region of chromosome 13 depicts an essential gene cluster, including K13, flanked by dispensable genes with multiple CDS-disrupting insertions. (C) High-resolution map of a ~20-kb region without insertions includes three conserved genes of unknown function (PF3D7_1232700, PF3D7_1232800, and PF3D7_1232900) and a putative nucleotidyltransferase (PF3D7_1232600) (fig. S5). (D) A plot of all piggyBac insertions revealed that significantly fewer insertions were recovered from exon-intron regions compared with the proportion of available TTAA sites (fig. S1D) (P < 2.2 × 10⁻¹⁶, Fisher’s exact test). (E) Density of piggyBac insertion-site distribution revealed 75% fewer insertions recovered in transcriptional regions (blue) than intergenic 5ʹ (yellow) and 3ʹ (green) regions, depicted as relative distance upstream and downstream to a gene, respectively. (F) This study determined that under ideal culture conditions for asexual blood-stage growth, 38% of genes in the P. falciparum genome have mutable CDSs, whereas 62% of genes have nonmutable CDSs, which includes 12% with tentative classification.

  • Fig. 2 Identification of dispensable and essential genes through MIS and MFS.

    (A) The MIS rates the potential mutability of P. falciparum genes based on the number of recovered CDS insertions relative to the potential number that could be recovered. Genes known as dispensable or essential are highlighted. (B) MIS violin plots of GO processes grouped from lowest to highest dispensability, according to gene functional annotations. (C) MIS plots and (D) high-resolution chromosome maps highlighting important genes of interest for RNA metabolism (–20 kb, +20 kb) (MIS plots of other genes of interest are provided in fig. S4). (E) The MFS estimates the relative growth fitness cost for mutating a gene based on the distribution of its normalized QIseq sequencing reads. (F) MIS correlates significantly with MFS (Pearson’s R = 0.67, P < 2.2 × 10⁻¹⁶ compared with permutation). (G) The first and second MFS quartiles were composed primarily of nonmutable genes, the fourth quartile was composed mostly of mutable genes, and the third quartile had nearly equal numbers of both.

  • Fig. 3 Validation of mutagenesis score through phenotype screen.

    (A) Competitive growth assays of asexual blood-stage growth under ideal in vitro culture conditions. Phenotypes of four independent mixed-population pools grown for three cycles confirmed that losers (left, bottom quartile) and winners (right, top quartile) had significantly different MIS. (B) Overall rank-ordered plot of competitive growth phenotypes shows losers and winners. (C) Competitive growth losers had significantly lower MISs and MFSs than winners, validating MIS and MFS as predictors of gene essentiality and dispensability. (D) Circos plot from outer to inner shows the distribution of all piggyBac insertions, MIS (pink indicates MIS < 0.5, and blue indicates MIS > 0.5), CDS insertions, and MFS along each chromosome of the P. falciparum genome. (E) Violin plots indicate nonmutable genes had significantly lower MIS and MFS (Wilcoxon, ****P < 2.2 × 10⁻¹⁶).

  • Fig. 4 Chromosomal syntenic breakpoints are enriched in dispensable genes.

    (A) Genes within conserved syntenic blocks have a significantly lower MIS and MFS (Wilcoxon P < 2.2 × 10⁻¹⁶). A “syntenic block” is defined as at least three genes in the same order on the same chromosome as their orthologs in another species within a 25-kb search window. (B and C) Scatter plots show the insertion site enrichment along two syntenic breakpoints [chromosome 13 (Chr13), 2,110,000 to 2,135,000; Chr10, 642,000 to 666,000]. Each gap in synteny (white area) is enriched for piggyBac insertions while flanked by essential regions (green shading); black boxes represent the location of CDS. (D and E) Circos plots indicate the syntenic blocks of P. falciparum in relation to other Plasmodium spp. (P. berghei, P. chabaudi, P. knowlesi, and P. vivax).

  • Fig. 5 Distinct biological process and evolutionary conservation segregate the tendency of dispensable and essential genes.

    (A) The genes with the lowest FPKM expression value (first quantile) among different stages were enriched for dispensable genes (Wilcoxon P < 2.2 × 10⁻¹⁶ compared with other quantiles) (26). The expression level cutoff is set at 20 FPKM. (B) Nonmutable essential genes had significantly higher expression values for blood-stage development. (C) The group of trophozoite-stage genes had the highest proportion of essential genes (red), whereas gametocyte genes had the highest proportion of dispensable genes (blue) (Wilcoxon P < 1 × 10⁻¹²). (D to F) Characteristics of essential genes significantly different from dispensable genes include (D) 1:1 ortholog conserved among Plasmodium spp., (E) absence of paralogs, and (F) reduced rate of nonsynonymous to synonymous single-nucleotide polymorphisms. Bars indicate the group median (Wilcoxon, ****P < 2.2 × 10⁻¹⁶). (G and H) Essential genes reported in (G) Toxoplasma and (H) P. berghei showed significantly lower MIS in this mutagenesis screen of P. falciparum (Wilcoxon, ****P < 2.2 × 10⁻¹⁶). (I) Plot of receiver operating characteristics (ROC) indicates the level of retention of essential genes across species. The MIS of P. falciparum correlates more strongly with the essentiality phenotype of P. berghei than with that of Toxoplasma.

  • Fig. 6 Differentiating dispensable and essential genes and discovering high-priority druggable targets and pathways.

    (A) Functional annotations of biological processes are represented by the P value, and the x axis shows the fraction of the genes with MIS > 0.5. Each GO term is assigned a P value on the y axis to represent the tendency to be essential or dispensable. Essentiality is indicated on a spectrum of red (essential) to blue (dispensable) and circle sizes indicate the GO term enrichment. (B and C) Boxplot of (B) molecular processes and (C) cellular components shows the MIS distribution generated by 1000× sampling of the number of genes in the query GO-term category. Left (red) and right (blue) triangles indicate GO terms with significantly lower or higher MIS (P < 0.05 compared with background), respectively; the heatmap represents the essentiality defined as the fraction of genes per GO term with MIS > 0.5.
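
The resampling comparison described for Fig. 6, B and C can be illustrated with a minimal sketch. The caption states only that the MIS distribution was generated by 1000× sampling of as many genes as the query GO term contains; the use of the median as the summary statistic, the one-sided comparison, the add-one correction, and all names and toy values below are assumptions made for illustration, not the authors' implementation.

```python
import random
from statistics import median


def resample_medians(background_mis, n_genes, n_draws=1000, seed=0):
    """Median MIS of n_draws random gene sets of size n_genes from the background."""
    rng = random.Random(seed)
    return [median(rng.sample(background_mis, n_genes)) for _ in range(n_draws)]


def empirical_p(observed, null_values, lower_tail=True):
    """Share of resampled values at least as extreme as observed (add-one corrected)."""
    if lower_tail:
        hits = sum(v <= observed for v in null_values)
    else:
        hits = sum(v >= observed for v in null_values)
    return (hits + 1) / (len(null_values) + 1)


def fraction_above(mis_values, threshold=0.5):
    """Fraction of genes in a set with MIS above the threshold (0.5 in the caption)."""
    return sum(v > threshold for v in mis_values) / len(mis_values)


# Toy data (hypothetical): a background of MIS values evenly spread over [0, 1]
# and a small GO term whose genes all have low MIS.
background = [i / 100 for i in range(101)]
term_mis = [0.02, 0.05, 0.10, 0.08]

null_medians = resample_medians(background, n_genes=len(term_mis))
p_lower = empirical_p(median(term_mis), null_medians, lower_tail=True)
term_fraction = fraction_above(term_mis)
```

In this toy case the term would be flagged as having a significantly lower MIS than the background (P < 0.05), and the fraction of its genes with MIS > 0.5 is 0, corresponding to the red (essential) end of the heatmap.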

Supplementary Materials

  • Uncovering the essential genes of the human malaria parasite Plasmodium falciparum by saturation mutagenesis

    Min Zhang, Chengqi Wang, Thomas D. Otto, Jenna Oberstaller, Xiangyun Liao, Swamy R. Adapa, Kenneth Udenze, Iraad F. Bronner, Deborah Casandra, Matthew Mayho, Jacqueline Brown, Suzanne Li, Justin Swanson, Julian C. Rayner, Rays H. Y. Jiang, John H. Adams

    Materials/Methods, Supplementary Text, Tables, Figures, and/or References

    • Figs. S1 to S11
    • Tables S1 to S9
    • References
    Table S1
    piggyBac insertion sites identified from preliminary study.
    Quantitative insertion-site sequencing (QIseq) identified 3651 piggyBac insertion sites distributed across all 14 chromosomes, combined with 326 previously published mutagenesis results, shown in this table.
    Table S2
    QC samples in each QIseq run.
    A total of 128 pooled mutants with known piggyBac insertion sites, identified in a previous study, were prepared in advance as QC samples for monitoring the accuracy and depth of each QIseq run.
    Table S3
    piggyBac insertion sites identified from this study.
    34,522 piggyBac insertion sites identified from this study. A total of 38,173 piggyBac insertion sites were used for determining saturation-level mutagenesis scores (MIS and MFS) including preliminary study data (Table S1).
    Table S4
    The non-mutable genes located in essential blocks.
    We observed that some non-disrupted genes were completely devoid of insertions even in the surrounding intergenic regions. 143 non-mutable genes (representing 2.9% of genes) are located in essential blocks, which are defined as having an insertion-free gap > 10 kb.
    Table S5
    MIS and MFS identify essentiality of the genes.
    The mutagenesis index score (MIS) was calculated based on the susceptibility of the ORF in each transcriptional unit to being disrupted (Fig. 2A). The mutagenesis fitness score (MFS) was calculated from the normalized number of reads (Fig. 2E). These two scoring models (MIS and MFS) were used to assess the essentiality of genes in the Plasmodium falciparum genome.
    Table S6
    RNA metabolism, related to Fig. 2 C, D and fig. S6.
    RNA metabolism genes by compartment or process as displayed in figure S7. MIS is indicated from 0 to 1. Genes in red text indicate those with TTAA density <7 or length <400bp (that are thus not scored) that have 0 recovered gene-body insertions. Translation and splicing-related genes of interest were identified via GO term. RNA granule-related genes of interest are as classified in Reddy et al. 2015 and as reviewed (41). Unknown, validated RNA-associated genes are proteins identified as being mRNA-bound as per (42) that have no or little functional annotation as per PlasmoDB.
    Table S7
    Phenotype screen of competitive growth assay, related to Fig. 3 A to C.
    A phenotype screen by competitive growth assay identified growth winners and losers among ~400 mutants in four individual pools, providing evidence to validate the MIS and MFS scoring of each gene's essentiality.
    Table S8
    Genes corresponding to GO enrichment.
    Functional annotation of biological processes, molecular function and cellular component are represented by the p-value and the fraction of the genes with MIS > 0.5. Each GO term is assigned a p-value to represent the tendency to be essential or dispensable.
    Table S9
    Sample pool ID and accession number for this study, related to Fig. 1, 2 and table S3.
    Sample pool IDs and accession numbers from all QIseq runs are shown in this table. Insertion sites identified from each pool are shown in table S3.
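
The essential-block criterion given for Table S4 (an insertion-free gap > 10 kb) lends itself to a short sketch. The code below assumes 1-based insertion coordinates on a single chromosome, treats the chromosome ends as gap boundaries, and counts a gene as lying in a block only if it sits entirely inside a gap; the function names, gene IDs, and coordinates are hypothetical and are not taken from the study.

```python
def insertion_free_gaps(positions, chrom_length, min_gap=10_000):
    """Return (start, end) intervals longer than min_gap bases with no insertion.

    positions: 1-based insertion coordinates on one chromosome.
    """
    gaps = []
    prev = 0  # sentinel just before the first base
    for pos in sorted(set(positions)) + [chrom_length + 1]:
        if pos - prev - 1 > min_gap:  # number of insertion-free bases between sites
            gaps.append((prev + 1, pos - 1))
        prev = pos
    return gaps


def genes_in_gaps(genes, gaps):
    """IDs of genes (id, start, end) that lie entirely within an insertion-free gap."""
    return [
        gene_id
        for gene_id, start, end in genes
        if any(g_start <= start and end <= g_end for g_start, g_end in gaps)
    ]


# Toy example with made-up coordinates.
insertions = [1_000, 2_500, 15_000, 15_400, 40_000]
blocks = insertion_free_gaps(insertions, chrom_length=45_000)
genes = [("geneA", 3_000, 6_000), ("geneB", 14_000, 16_000), ("geneC", 20_000, 30_000)]
block_genes = genes_in_gaps(genes, blocks)
```

Here `blocks` holds two gaps, (2501, 14999) and (15401, 39999), and `block_genes` contains geneA and geneC; geneB is excluded because an insertion falls within its span.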
