Supplementary Materials

Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak

Stephen K. Gire, Augustine Goba, Kristian G. Andersen, Rachel S. G. Sealfon, Daniel J. Park, Lansana Kanneh, Simbirie Jalloh, Mambu Momoh, Mohamed Fullah, Gytis Dudas, Shirlee Wohl, Lina M. Moses, Nathan L. Yozwiak, Sarah Winnicki, Christian B. Matranga, Christine M. Malboeuf, James Qu, Adrianne D. Gladden, Stephen F. Schaffner, Xiao Yang, Pan-Pan Jiang, Mahan Nekoui, Andres Colubri, Moinya Ruth Coomber, Mbalu Fonnie, Alex Moigboi, Michael Gbakie, Fatima K. Kamara, Veronica Tucker, Edwin Konuwa, Sidiki Saffa, Josephine Sellu, Abdul Azziz Jalloh, Alice Kovoma, James Koninga, Ibrahim Mustapha, Kandeh Kargbo, Momoh Foday, Mohamed Yillah, Franklyn Kanneh, Willie Robert, James L. B. Massally, Sinéad B. Chapman, James Bochicchio, Cheryl Murphy, Chad Nusbaum, Sarah Young, Bruce W. Birren, Donald S. Grant, John S. Schieffelin, Eric S. Lander, Christian Happi, Sahr M. Gevao, Andreas Gnirke, Andrew Rambaut, Robert F. Garry, S. Humarr Khan, Pardis C. Sabeti

Materials/Methods, Supporting Text, Tables, Figures, and/or References

Additional Data

Table S1
Overview of the different sequencing methods used for the first batch of EBOV
samples. EBOV samples from batch 1 were sequenced using Illumina (three different
library preps) and PacBio (one library prep). The samples for Illumina Nextera and
Illumina Nugen are the same (n=15), whereas the samples for Illumina Standard and
PacBio Nugen are subsets of these samples. The median depth of coverage (with
range) and the mean percent coverage are shown.
Table S2
Summary of sequence data produced in this study. Sample information and sequencing
statistics for all 99 samples prepared using the Nextera library preparation method.
EBOV copies/ml of serum were determined using qPCR (see Materials and Methods
above). Each date is the date on which the sample was tested at the KGH Laboratory.
Table S3
Primer Comparison. Twelve primer sets from eleven published assays (7, 8, 21, 25, 41-44),
together with the KGH primer set, comprising both EBOV-specific and pan-filovirus
assays, were screened against the EBOV consensus of the Sierra Leone sequences using
Geneious R6. Mapped primers and probes were compared to the consensus sequence and
nucleotide discrepancies were noted; these are shown in red italics in the table. In
total, there were 9 nucleotide discrepancies across the forward primers, reverse
primers, and probes. It is unknown how these discrepancies affect the sensitivity and
specificity of each assay. Further validation is needed, comparing these primer sets
against the Guinea and Sierra Leone EBOV variants by conventional and quantitative
RT-PCR, to assess reaction kinetics and inform the diagnostic suitability of the
assays. Note that no nucleotide discrepancies were seen in primer sets with degenerate
bases, so the use of degenerate bases may be a good strategy in future assay design.
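As an illustration of the kind of comparison described for Table S3, the following minimal sketch counts positions at which a primer cannot pair with a consensus window, treating IUPAC degenerate codes in the primer as matching any base they represent. It is not the Geneious R6 procedure used in the study. The function names and the toy sequences are hypothetical, and the primer is assumed to be given in the same orientation as the consensus (a reverse primer would first need to be reverse-complemented).

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(primer, target):
    """Return 0-based primer positions whose code does not cover the target base."""
    if len(primer) != len(target):
        raise ValueError("primer and target window must have equal length")
    return [
        i
        for i, (p, t) in enumerate(zip(primer.upper(), target.upper()))
        if t not in IUPAC[p]
    ]


def best_site(primer, consensus):
    """Slide the primer along the consensus (hypothetical helper).

    Returns (start, mismatch_positions) for the first window with the
    fewest mismatches, or None if the primer is longer than the consensus.
    """
    best = None
    for start in range(len(consensus) - len(primer) + 1):
        mm = mismatches(primer, consensus[start:start + len(primer)])
        if best is None or len(mm) < len(best[1]):
            best = (start, mm)
    return best


# Toy sequences, not taken from the study.
toy_consensus = "GGATCCAGTTACGA"
exact_primer = "CCCGTT"       # one discrepancy against the toy consensus
degenerate_primer = "CCRGTT"  # R covers A or G, so the discrepancy disappears
```

In this toy case the degenerate primer maps with no discrepancies where the exact primer has one, which mirrors the observation that primer sets with degenerate bases showed no discrepancies.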
Table S4
SNPs unique to the 2014 outbreak variant. The amino acid position, reference and
alternate amino acids, BLOSUM62 score for nonsynonymous substitutions, count of
sequences in the outbreak clade carrying the variant, and conservation across all
ebolaviruses are given. We also provide two lists of amino acid sites that are
polymorphic or unique in any EVD outbreak (1976/1977, 1994/1996, 1995, 2002,
2007/2008, 2014): (i) sites that are otherwise completely conserved across all
ebolaviruses, and (ii) sites that have a non-conservative substitution (BLOSUM62
score < 0) between the reference and alternate amino acids and otherwise have only
conservative amino acid substitutions across all ebolaviruses. Finally, we provide
tables of amino acid differences in GP between the 2014 outbreak variants and the
Mayinga (GenBank accession number NC_002549) and Kikwit (GenBank accession number
JQ352763) variants.
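The non-conservative criterion used for Table S4 (BLOSUM62 score < 0) can be sketched as a simple filter. This is a minimal illustration only: the record layout, the function names, and the positions are hypothetical, and the BLOSUM62 score is assumed to be supplied with each record, as it is in the table, rather than looked up from the matrix.

```python
from dataclasses import dataclass


@dataclass
class Substitution:
    """One amino acid substitution (hypothetical record layout)."""
    position: int   # amino acid position in the protein
    ref: str        # reference amino acid
    alt: str        # alternate amino acid
    blosum62: int   # BLOSUM62 score for ref -> alt, supplied as input


def is_non_conservative(sub):
    """A substitution counts as non-conservative when its BLOSUM62 score is below 0."""
    return sub.blosum62 < 0


def non_conservative_sites(subs):
    """Return the sorted positions that carry at least one non-conservative substitution."""
    return sorted({s.position for s in subs if is_non_conservative(s)})


# Illustrative records; the positions are made up.
example_subs = [
    Substitution(position=82, ref="D", alt="E", blosum62=2),
    Substitution(position=230, ref="G", alt="W", blosum62=-2),
    Substitution(position=411, ref="K", alt="R", blosum62=2),
]
```

A score of exactly 0 is treated as conservative here, following the strict inequality in the caption.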
File S1
Alignment and SNP calls used in this study.
File S2
Phylogenetic trees created using MrBayes and RAxML.
File S3
BEAST XML files used to estimate the divergence time for the 2014 EBOV lineages, as well
as for all EBOV isolates.
File S4
Intrahost single-nucleotide variants (iSNVs) for 78 Sierra Leone EVD patients. iSNVs are
described in annotated VCF format. Tabular text formats are provided for a subset of the
analyses described in this paper.