Variation in the AvrSr35 gene determines Sr35 resistance against wheat stem rust race Ug99

Puccinia graminis f. sp. tritici (Pgt) causes wheat stem rust, a devastating fungal disease. The Sr35 resistance gene confers immunity against this pathogen’s most virulent races, including Ug99. We used comparative whole-genome sequencing of chemically mutagenized and natural Pgt isolates to identify a fungal gene named AvrSr35 that is required for Sr35 avirulence. The AvrSr35 gene encodes a secreted protein capable of interacting with Sr35 and triggering the immune response. We show that the origin of Pgt isolates virulent on Sr35 is associated with the nonfunctionalization of the AvrSr35 gene by the insertion of a mobile element. The discovery of AvrSr35 provides a new tool for Pgt surveillance, identification of host susceptibility targets, and characterization of the molecular determinants of immunity in wheat.


Confocal microscopy
Staining: Infected leaves were fixed and cleared at room temperature in 95% ethanol for 5 days. Leaf tissues were washed twice by incubating for 15 min in 50% ethanol followed by 15 min incubation in 0.5M NaOH. Then samples were washed three times with water and kept for 30 min in 50 mM Tris-HCl buffer (pH 5.8). Staining of leaf samples was performed by incubating them for 5 min in 2 ml of 0.1% Uvitex 2B (Polysciences Inc.) followed by washing three times with ddH 2 O. To stain the plant cell walls, samples were treated with 0.1% (w/v) acridine orange (Sigma) for 5 min. The excess of dye was removed by overnight incubation in 5 ml of 25% (v/v) glycerol.
Fluorescence microscopy was carried out using a LSM 780 confocal laser scanning microscope (Zeiss) at 40X magnification. Uvitex 2B and acridine orange were detected by excitation at 405 nm and scanning with emission filters at 411-485 and 550-560 nm, respectively. Image data processing was carried out using Zeiss ZEN (Zeiss Efficient Navigation) 2012 v8.1 and ImageJ software. At least 10 fungal infection sites in three biological replicates were analyzed for each stained leaf sample.
To detect dead cells in the leaf tissues, samples were stained for 15 min in a solution containing 20 ug/ml WGA-FITC dissolved in 50 mM Tris-HCl buffer (pH 7.5). Then samples were washed three times with 50 mM Tris-HCl buffer (pH 7.5), stained in 10 µg/ml water solution of propidium iodine (Life Technologies) followed by rinsing three times with ddH 2 O. Fluorescence microscopy was carried out using a LSM 780 confocal laser scanning microscope (Zeiss) at 40X magnification. WGA-FITC and propidium iodine were detected by excitation at 488 and 561nm and scanning with filters at 493-584 and 584-718 nm, respectively ( Fig. 1). At least 10 fungal infection sites in three biological replicates were analyzed for each stained leaf sample.
The analysis of Sr35-based pre-haustorial resistance against stem rust Two genotypes of wheat with (U6169) and without (Morocco) the Sr35 gene were inoculated with Sr35-avirulent Pgt race RKQQC. At least 10 fungal infection sites in three biological replicates were analyzed for each stained leaf sample. The leaf tissues were analyzed using a confocal microscopy 24 and 48 hours after inoculation (HAI). The staining with Uvitex 2B in combination with orange acridine was used to detect the fungal infection structures including substomatal vesicles (SSV), haustoria (H), haustorial mother cells (HM), and infection hyphae (IH) (Fig. 1). The staining with WGA-FITC and propidium iodine was used to identify the fungal tissues and the presence of dead cells in the leaf mesophyll of the infected plants ( Fig. 1E and Fig. 1F). The collapse and increased fluorescence of individual plant mesophyll cells in close proximity to haustorial mother cells (Fig. 1E), and the staining of nuclei in these mesophyll cells were considered as the evidence of cell death (12).

EMS mutagenesis of Pgt spores
The Sr35-avirulent Pgt isolate 99KS76A-1 (RKQQC race) was used to create to inoculate one thousand 12-day old seedlings of Triticum monococcum accession G2919 (Sr35+) that was previously used for cloning the Sr35 gene (6). Seedlings were incubated in a dew chamber at 20°C for 12 hours in dark followed by 1-hour incubation under light. Finally, seedlings were transferred to a growth chamber at 22°C with a 16hour light and 8-hour dark photoperiod. The presence of pustules on the seedlings was evaluated 15 DAI. The spores were collected separately from 15 pustules, resuspended in Soltrolâ 170 (Chevron Phillips Chemical, TX), and each suspension was used to re-infect the seedlings of G2919 accession. This step was repeated twice for each mutant Pgt isolate. To ensure that plants inoculated by different mutagenized Pgt strains are not cross-infected by other rust spores, seedlings before transferring to growth chamber were covered by breathable cellophane bags. Each mutagenized strain of Pgt was multiplied by infecting susceptible wheat cultivar Morocco and validated for virulence to the Sr35 gene by re-infecting the G2919 accession. Randomly selected set of mutants was pathotyped by infecting wheat lines that carry different known Sr genes (13) (Tables S1, S2).
To ensure that the obtained Sr35-virulent mutant strains were not escapes of other rust strains, and to validate that only Sr35-virulence was affected by EMS mutagenesis, five randomly selected mutants (M1, M7, M6, M9, M11) were analyzed by infecting a standard set of 20 differential wheat lines that carry different Sr genes (Table S1). This panel is broadly used for quick pathotyping of unknown Pgt isolates using the standard 5letter code (13). In addition, the wheat cultivar Morocco (Sr35-) and G2919 accession of T. monococcum (Sr35+) were included as controls. For Pgt mutants M1 and M7, we assessed the virulence profiles on an expanded panel of wheat lines carrying different Sr genes (Table S2).
The effect of AvrSr35 gene mutations on Pgt virulence on the susceptible host The confocal microscopy of wheat leaves from compatible cultivar Fielder infected with the wild-type and mutant Pgt strains was performed to investigate the effect of the AvrSr35 gene mutations on the pathogen's virulence. Four 2-week seedlings were infected with each of the four Pgt strains (wild-type, M1, M4 and M7). Leaf samples were collected three days after infection and stained using Uvitex 2B in combination with orange acridine, as described above, and analyzed using the confocal microscope (Fig.   S1).
The confocal microscopy analysis was also performed on the leaf samples collected 4 days after infection and stained for 15 min in a solution containing 20 ug/ml WGA-FITC dissolved in 50 mM Tris-HCl buffer (pH 7.5). For each detected fungal infection unit, we used ImageJ software to assess the area (in pixels) corresponding to the stained fungal structures. The average area was estimated for each plant based on the analysis of 3-5 infection units. The analysis of variance (ANOVA) was performed to test for differences in the fungal structure area among the leaf samples infected with the wildtype and mutant Pgt strains (Table S3).
We have also compared the growth of three EMS-mutants (M1, M4, and M7) and wild type Pgt strain by estimating the average sizes of uredia (pustules), the fungal structure producing spores, from images collected from the infected leaves of Fielder cultivar 15 days after infection. Four infected plants per Pgt strain were analyzed. Images were analyzed using the ASSESS (version 2) image analysis software for plant disease quantification from the American Phytopathology Society. The statistical significance of differences in the uredia sizes among the plants infected with the wild-type and mutant Pgt strains was tested using ANOVA (Table S3).

RNA-seq analyses of susceptible wheat transcriptome infected with the wild type and mutant strains of Pgt
To assess the effect of AvrSr35 nonsense mutations on the Pgt interaction with the susceptible wheat host, we performed time-course analysis of leaf transcriptomes obtained by infecting the susceptible wheat cultivar Fielder with either the wild-type Pgt isolate 99KS76A-1 or one of the three EMS mutants M1, M4 and M7 (Table S4). Total RNA was extracted from the infected leaf tissue at 24, 48, and 96 HAI as described previously (8). One microgram of high quality total RNA was taken for RNA sequencing (RNA-seq) library construction using the TruSeq RNA Sample Preparation kit v2 The quality trimming was performed according to the previously described protocols (8). The transcript abundance for the gene models from the wheat reference genome TGACv1 (14) was estimated from RNA-Seq data using kallisto (15) (NCBI GEO GSE106397). The analyses of differential gene expression were carried out using the R package sleuth that combines bootstrapping with linear modeling to estimate biological variance separately from inferential variance.
Differentially expressed genes were identified between the mock and wild-type Pgt inoculated wheat leaves collected at 24, 48 and 96 HAI, respectively (FDR £ 0.05) (Fig. S2). To characterize the wheat host biological pathways affected by interaction with the wild-type Pgt, we performed GO term enrichment analyses of differentially expressed genes and summarized results by clustering GO terms based on semantic similarity measures implemented in the REVIGO tool. This data was used to identify the wheat host genes involved in primary biotic defense stress responses during the compatible interaction with the pathogen (Fig. S2 and Table S5) (8,16,17,3).  (Table S4). The two FDR levels, 0.01 and 0.05, have been used to identify the differentially expressed genes (Fig. S2).
Sequencing the genome of Pgt isolate 99KS76A-1 (race RKQQC) To identify the AvrSr35 effector we have mutagenized the American Pgt isolate 99KS76A-1 collected in 1999 in Kansas and classified as race RKQQC. Both RKQQC and Ug99 races of Pgt are avirulent to the Sr35 gene (6). Before sequencing, the Pgt culture was purified by single-pustule isolation and pathotyped according to North American nomenclature system (13).
About 500 mg of freshly collected urediniospores were frozen in liquid nitrogen and ground using a mortar and pestle. Fungal DNA was isolated using the Omni Prep for This method was used to sequence Pgt isolate 99KS76A-1 and the 15 Pgt mutants (Tables S6, S7).
The 454 sequencing library was prepared from 500 ng of genomic DNA with the GS FLX Titanium Rapid Library Preparation kit (Roche). The sample was sequenced with one 454 run using the Titanium chemistry at the K-State IGF followed standard Roche protocols (Table S6). The data is deposited to NCBI PRJNA313186.
The 3-10 kb PacBio libraries were constructed, quantified, and sequenced on one SMRT cell of PacBio RS II using P6C4 PacBio chemistry at UC Davis Genome Center (Table S6).
Illumina sequence data was processed using FASTX-Toolkit to remove adaptors, low quality bases (<20) and low-quality reads with less than 70% of bases having quality ³20. Reads produced by 454 sequencing were adaptor-and quality-trimmed using the program Lucy with the default settings (sourceforge.net/projects/lucy).
Illumina paired-end reads generated for the Pgt isolate 99KS76A-1 were assembled using DISCOVAR de novo with the default parameters. To extend contigs, data generated using the Roche 454 and PacBio platforms were added using program SSPACE with "Minimum alignment length" = 100 and "Minimum identity of the alignment = 70". Contigs longer than 1,000 bp have been retained. The contaminating sequence assemblies were excluded by performing BLASTN search in the NCBI's nonredundant sequence database and retaining only contigs with the best hits to fungal sequences (e-value less than 1e -10 and alignment length more than or equal to 100 bp).
The genome assembly and annotation are available from the NCBI PRJNA313186 and project website http://wheatgenomics.plantpath.ksu.edu/rustgenomics/.

Annotation of Pgt genome assembly
For predicting genes in the Pgt genome, we performed de novo assembly of RNA-Seq data (BioProject PRJNA415853) generated for total RNA isolated from the Pgtinfected leaf tissues (described in the next section). For RNA extraction, twelve day-old seedlings of susceptible wheat cultivar 'Fielder' were inoculated with the urediniospores of Pgt isolate 99KS76A-1, as previously described (6). Infected leaf tissues were collected at 24, 48, and 96 HAI and used for RNA-seq analysis.
Illumina 2 x 100 bp paired-end reads were quality filtered and assembled using program Trinity (18) with the following parameters: "--genome_guided_max_intron 50000 --SS_lib_type FR -jaccard_clip". The assembled contigs were used for the Pgt genome annotation. In addition, for the Pgt genome annotation, we have used 1) 15,979 gene models reported for the genome of Pgt strain CDL 75-36-700-3 (9), and 2) 22,321 transcripts reported for the Australian Pgt isolates. Gene models were predicted by mapping the transcripts to the Pgt genome assembly using the PASA pipeline (19). The coding regions of genes were predicted using the TransDecoder tool of the PASA pipeline (NCBI database PRJNA313186 and project website http://wheatgenomics.plantpath.ksu.edu/rustgenomics).
To estimate the completeness of genome assembly, the BUSCO program was utilized (20). BUSCO uses a set of 1,441 phylogenetically conserved genes to assess the proportions of completely and partially sequenced genes in a genome or transcriptome.

Re-sequencing 15 Pgt mutant strains and mutation discovery
The BWA-MEM program with the default parameters was used to align reads generated for 15 mutants and the wild-type strain to the reference genome assembly of Pgt isolate 99KS76A-1 (NCBI PRJNA313186) (Table S7).
Before variant calling, the reads in BAM files were locally re-aligned using the GATK (21) followed by the marking and removal of duplicated reads. The variant calling was performed using the GATK's UnifiedGenotyper (21) with the default settings. The raw variant calls were filtered to remove sites that had more than 2 allelic states or were monomorphic among the 15 mutant strains. Because EMS preferentially alkylates G residues, inducing primarily C-to-T and G-to-A transitions (22), only CG > TA mutations were selected. Additional filtering has been applied to read coverage data extracted for both wild-type and mutated allelic variants. Alleles were called only if they had coverage of at least two reads. The minimum coverage of 5 reads was used to call mutant alleles. To reduce the erroneous mutant calling due to the misalignment of duplicated paralogs or cross-contamination among the mutant Pgt strains during their isolation, the mutant sites that were detected in more than 3 different Pgt mutant strains have been excluded. In addition, the Fisher's exact test was applied to compare the depth of read coverage of mutant and wild-type alleles at each site in the wild-type and mutant strains. The sites showing the statistically significant difference at P-value £ 10 -4 were retained (Dataset S1, Table S8). All mutations in the top candidate gene were confirmed by Sanger sequencing.
Identification and characterization of the AvrSr35 gene candidate We searched for candidate genes that had multiple nonsense or other strong effect mutations in the coding sequences across all 15 Pgt mutants (Dataset S1). The mutations detected in a single top candidate gene (NCBI accession number MF474174) were validated by the Sanger sequencing of amplicons generated for each of the 15 fungal mutants ( Fig. 2A, Table S9). The pair of primers (Table S10) was designed to amplify the entire AvrSr35 gene coding region.
The AvrSr35 protein sequence (578 amino acids) was analyzed for the presence of a signal peptide cleavage site (Figs. 2A and S3). AvrSr35 was scanned for the presence of matches in the InterPro protein signature databases using InterProScan. AvrSr35 sequence matches in the NCBI protein database were searched using BLASTX. The size of the AvrSr35 protein was compared with the sizes of the previously characterized effector proteins from flax (23) and poplar rust (9).

Structural modeling of the AvrSr35 protein
To overcome the lack of the AvrSr35 homologs in the protein databases, we predicted the secondary structure of AvrSr35 ( Fig. S4) and compared it to a database of known protein 3D structures using the online I-TASSER server (24). I-TASSER uses a threading approach to match proteins to known 3D protein structures in The Protein Data Base (PDB), which as of 2017 contained over 138,000 deposited protein structures.

Diversity analyses of Sr35-virulent and Sr35-avirulent field strains of Pgt
To confirm the results of EMS mutagenesis, PCR amplicons for the AvrSr35 candidate gene were generated and sequenced from a diverse set of Pgt field isolates. These isolates were collected at different locations and years and validated as Sr35virulent (12 isolates) and Sr35-avirulent (15 isolates) (Table S11, Dataset S2).
The AvrSr35 candidate gene region including entire coding sequence was amplified from each isolate (see the list of primers in Table S10) and sequenced from both strands using Sanger approach with multiple sequencing primers spanning the entire gene. The Sanger reads generated using the ABI3730xl (Applied Biosystems) system were processed and assembled into contiguous sequences using Sequencher 4.8 (Gene Codes).
If PCR produced a single-band product on the agarose gel, it was sequenced directly. If isolate produced two or more (due to the presence of an additional paralogous gene copy in one of the two nuclei) PCR bands on the agarose gel, the PCR products were TAcloned into the pGEM®-T Easy vector followed by Sanger sequencing of four to eight independent clones.
Sequences were aligned using program MUSCLE with the default settings. The phylogenetic tree was constructed using the Neighbor-Joining method applying the maximum composite likelihood substitution model implemented in program MEGA 6.06.
The phylogenetic tree testing was performed utilizing the bootstrap method with 1,000 replicates. The sequence of highly divergent non-functional second allele (due to 1 bp deletion at 1,320 bp of CDS; NCBI accession number MF596174) of the AvrSr35 gene identified in isolate 99KS76A-1 was used to root the tree.
Infiltration of resistant (Sr35+) and susceptible (Sr35-) wheat lines with the AvrSr35 protein The coding region of the AvrSr35 gene candidate without the deduced signal peptide (SP) was amplified from the pGEMT-cDNA-AvrSr35 (Table S10) Cells were harvested by centrifugation at 10,000 x g for 15 min at 4°C. Cell pellet was suspended in 50 ml of lysis buffer (50mM sodium phosphate buffer pH 6.8, 300 mM NaCl, 10mM imidazole (Thermo Fisher Scientific), 10 mM phenylmethylsulfonyl fluoride (Thermo Fisher Scientific) and 0.05 mg of lyzosyme (Thermo Fisher Scientific).
The suspension was incubated for two hours in a cold room (~4 -6°C) and then was subject to three cycles of freezing and thawing using liquid nitrogen and a water bath set at 42°C. Cell lysate was harvested by centrifugation at 10,000 x g for 20 min at 4°C. The upper phase was transferred to a 50 ml tube and kept on ice.
The recombinant protein was purified using the HisPur Ni-NTA spin columns (Thermo Fisher Scientific). Columns were equilibrated using two resin-bed volumes of equilibration buffer (50mM sodium phosphate buffer pH 6.8, 300 mM NaCl, 10 mM imidazole) followed by centrifugation at 700 x g for 2 min at 4°C. The Ni-NTA resin was washed with 8 resin-bed volumes of wash buffer (50mM sodium phosphate buffer pH 6.8, 300 mM NaCl, 20 mM imidazole) followed by centrifugation at 700 x g for 2 min at 4°C. The recombinant protein was collected from the column by five successive elutions, using a bed volume of elution buffer (50 mM sodium phosphate buffer pH 6.8, 300 mM NaCl, 150 mM imidazole) followed by centrifugation at 700 x g for 2 min at 4°C. Total protein was quantified using Bradford method and analyzed in a 10% SDS-PAGE.
To obtain native protein, SUMO fusion was cleaved off by incubating 5 mg of purified recombinant protein with 100 Units of SUMO protease (Thermo Fisher Scientific) in 1X SUMO protease buffer for 16 hours at 4°C. Cleaved protein was analyzed in a Coomassie stained 10% SDS-PAGE to confirm the N-terminal cleavage (Fig. 3E). Cleaved protein was dialyzed using a Slide-A-Lyzer ™ Dialysis kit (Thermo Fisher Scientific). The Slide-A-Lyzer cassette containing the cleaved protein was subjected to overnight dialysis in a cold room with stirring in 5 L of ddH 2 0. After dialysis, the protein was purified using the HisPur Ni-NTA spin columns (Thermo Fisher Scientific). Columns were equilibrated with two resin-bed volumes of ddH 2 0, followed by centrifugation at 700 x g for 2 min at 4°C. Dyalized protein was passed through the column by centrifugation at 700 x g for 2 min at 4°C. Then, the flow-through fraction containing the AvrSr35 protein was frozen in liquid nitrogen and lyophilized for at least hours after infiltration (Fig. 3E).

Infiltration of N. benthamiana leaves with Agrobacterium
The full-length coding sequence of the candidate AvrSr35 gene was obtained by RT-PCR of total RNA isolated from the 2-week old wheat seedlings infected with the RKQQC race of Pgt. The entire coding region of AvrSr35 was amplified from the cDNA obtained after reverse transcription has been amplified using gene-specific primers (Table  S10) amplifying the entire coding region. The PCR product was subcloned into the pGEM5 vector and both strands of 5 independent plasmid clones confirmed to be identical using a Sanger sequencing approach. The clone was used for creating all of the derivatives of the AvrSr35 gene constructs.
The gene fragments were amplified using the Gateway-compatible attB-modified primers and cloned into the pDONR ™/Zeo vector (Thermo Fisher Scientific). To preclude the triggering of HR, the Sr35 gene constructs used for infiltration carried the K206L mutation in the P-loop of the NB-ARC domain (Sr35:p. K206L) (Table S12). This mutation has been shown to inhibit nucleotide hydrolyses and downstream signaling in other NBS-LRR resistance proteins, but should not interfere with target binding. Constructs with the Sr35 gene variant found in the susceptible EMSmutagenized T. monococcum line M1120 (R854H+W856R+T858S) and ER-nYFP+cYFP were used as negative controls (Fig. 4D). The Sr35 gene variants were created as C-terminal fusions with the cYFP (Table S12). The wild-type AvrSr35 gene without the signal peptide (27-578 AA) was an N-terminal fusion with nYFP (Table   S12). The level of construct expression in N. benthamiana leaves was tested by RT-PCR as described above. Also, to ensure that the decrease in fluorescence observed with the mutant Sr35 M1120 constructs is not due to reduction in protein stability, we tested the expression of these proteins by immunoblotting using the anti-GFP showing affinity to cYFP (Fig. S9).  (Fig. 4D). The analysis of variance (ANOVA) was used for assessing the effect of different construct combinations on the level of fluorescence followed by the 'post-hoc' Tukey's test to assess the significance of individual means (Fig. 4D).   peptide score (S-score) and combined cleavage site score (Y-score). The D-cutoff, a weighted average of the mean S-score and the maximum Y-score, is used to identify SPs.

Fig. S4
The secondary and tertiary structure modeling of the AvrSr35 protein using I-TASSER.
In the absence of the secretion signal peptide I-TASSER produced a secondary structure prediction of 25 helices and 6 strands interspersed with coil domains. The top predicted tertiary structure for wild-type AvrSr35 achieved a predicted template modeling score       Table S1.
A set of randomly selected five EMS mutants of the 99KS76A isolate (race RKQQC) tested on the panel of differential wheat lines carrying different Sr genes. <Provided as separate file: Table S1.docx > Table S2.
Infection types (IT)* on seedlings using the wild type Pgt RKQQC race and two EMS mutants of Pgt, M1 and M7, against the differential set of wheat lines carrying different Sr genes. ITs on a single leaf are separated by a comma; range of variation in ITs is shown by indicating the range without comma.   Table S9.
The positions of EMS mutations in the candidate AvrSr35 gene (scaffold 2351) identified by analyzing NGS and Sanger sequencing data (0 -wild-type base, 1 -mutant base).  Table S10.
List of primers used in the study (provided as a separate file).  Fig. 2B. Table S12.
Plasmid constructs used in the study.

Dataset S2 (separate file)
The sequences of the AvrSr35 gene for Sr35-virulent and Sr35-avirulent Pgt isolates.