Research Article

Whole-organism lineage tracing by combinatorial and cumulative genome editing

Science  29 Jul 2016:
Vol. 353, Issue 6298, aaf7907
DOI: 10.1126/science.aaf7907

Structured Abstract


The developmental path by which a fertilized egg gives rise to the cells of a multicellular organism is termed the cell lineage. In 1983, John Sulston and colleagues documented the invariant cell lineage of the roundworm Caenorhabditis elegans as determined by visual observation. However, tracing cell lineage in nearly all other multicellular organisms is vastly more challenging. Contemporary methods rely on genetic markers or somatic mutations, but these approaches have limitations that preclude their application at the level of a whole, complex organism.


For a technology to comprehensively trace cell lineages in a complex multicellular system, it must uniquely and incrementally mark cells and their descendants over many divisions and in a way that does not interfere with normal development. These unique marks must also accumulate irreversibly over time, allowing the reconstruction of lineage trees. Finally, the full set of marks must be read out from each of many single cells. We hypothesized that genome editing, which introduces diverse, irreversible edits in a highly programmable fashion, could be repurposed for cell lineage tracing in a way that realizes these characteristics.

To this end, we developed a method termed genome editing of synthetic target arrays for lineage tracing (GESTALT). This method uses genome editing to generate a combinatorial diversity of mutations that accumulate over many cell divisions within a compact DNA barcode consisting of multiple clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 target sites. Lineage relationships can be readily queried by sequencing the edited barcodes and relating the patterns of edits observed.
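The idea of cumulative, irreversible barcode editing can be illustrated with a toy simulation. The sketch below is not the authors' model: the function names, the per-division edit probability `p_edit`, and the number of distinct edit outcomes per site `n_outcomes` are all hypothetical, and each target site is assumed to be edited independently of its neighbours. State `0` stands for an unedited site; any other integer stands for one particular edit that can no longer change.

```python
import random

N_SITES = 10  # assumed number of target sites in the barcode


def edit_barcode(barcode, rng, p_edit=0.2, n_outcomes=20):
    """Return a daughter barcode in which unedited sites may acquire an edit.

    Sites that already carry an edit (non-zero) are copied unchanged,
    which models the irreversibility of the marks.
    """
    child = list(barcode)
    for i, state in enumerate(child):
        if state == 0 and rng.random() < p_edit:
            child[i] = rng.randint(1, n_outcomes)  # pick one hypothetical edit outcome
    return tuple(child)


def simulate(generations, seed=0):
    """Grow a population from one unedited cell by synchronous divisions."""
    rng = random.Random(seed)
    cells = [(0,) * N_SITES]
    for _ in range(generations):
        # every cell divides into two daughters, each edited independently
        cells = [edit_barcode(c, rng) for c in cells for _ in range(2)]
    return cells


cells = simulate(6)
print(len(cells), "cells,", len(set(cells)), "distinct barcodes")
```

Because daughters inherit every edit of their parent and can only add to it, cells that share many edits in this toy model tend to share a recent ancestor, which is the property the sequencing readout relies on.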


We first developed this approach in cell culture, editing synthetic arrays of 9 to 12 CRISPR/Cas9 target sites to generate thousands of unique derivative barcodes. We show that edited barcodes can be read by targeted sequencing of either DNA or RNA. In addition, the rates and patterns of barcode editing are tunable and the diverse edits accumulate over successive divisions in a way that is informative of cell lineage.

We then applied GESTALT to the zebrafish Danio rerio by injecting fertilized eggs with editing reagents that target a genomic barcode bearing 10 target sites. Across dozens of embryos, we demonstrate the accumulation of hundreds to thousands of uniquely edited barcodes per animal, from which lineage relationships can be inferred on the basis of shared mutations. In adult zebrafish, we evaluated the edited barcodes from ~200,000 cells and observed that the majority of cells in each organ are derived from a small number of progenitor cells. Furthermore, ancestral progenitors, inferred on the basis of shared mutations among subsets of cells, can contribute to different germ layers and organ systems.
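Inferring lineage relationships from shared mutations can be sketched as a greedy grouping. This is an illustrative simplification, not the reconstruction procedure used in the study: the barcode encoding (a tuple with `0` for an unedited site), the function `build_tree`, and the four example cells are hypothetical. At each step the code takes the edit carried by the most cells, treats those cells as a clade descended from the progenitor in which that edit arose, and recurses.

```python
from collections import Counter


def build_tree(cells, used=frozenset()):
    """Group cells into nested clades by their most widely shared edits.

    cells: dict mapping a cell name to its barcode (tuple, 0 = unedited).
    Returns a list whose items are either cell names or clade dicts of the
    form {"shared": (site, edit), "members": [...]}.
    """
    counts = Counter(
        (site, edit)
        for barcode in cells.values()
        for site, edit in enumerate(barcode)
        if edit != 0 and (site, edit) not in used
    )
    # An edit seen in only one cell says nothing about relatedness.
    informative = {mark: n for mark, n in counts.items() if n >= 2}
    if not informative:
        return sorted(cells)
    # Most widely shared edit first; ties broken by the smallest (site, edit).
    mark = min(informative, key=lambda m: (-informative[m], m))
    site, edit = mark
    inside = {k: v for k, v in cells.items() if v[site] == edit}
    outside = {k: v for k, v in cells.items() if v[site] != edit}
    clade = {"shared": mark, "members": build_tree(inside, used | {mark})}
    rest = build_tree(outside, used) if outside else []
    return [clade] + rest


example = {
    "A": (1, 2, 0),
    "B": (1, 2, 3),
    "C": (1, 0, 4),
    "D": (0, 5, 0),
}
tree = build_tree(example)
print(tree)
```

In this example, cells A, B and C share the edit at site 0 and are grouped under one inferred progenitor, A and B are nested further by the edit at site 1, and D stays outside the clade. A greedy split like this can be misled when the same edit arises independently in unrelated cells, so it should be read as a way to picture the logic rather than as a robust tree-building method.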


Our proof-of-principle experiments show that combinatorial, cumulative genome editing of a compact barcode can be used to record lineage information in multicellular systems. Further optimization of GESTALT will enable mapping of the complete cell lineage in diverse organisms. This method could also be adapted to link cell lineage information to molecular profiles of the same cells. In the long term, we envision that rich, systematically generated maps of organismal development—wherein lineage, epigenetic, transcriptional, and positional information are concurrently captured at single-cell resolution—will advance our understanding of development in both healthy and disease states. More broadly, cumulative and combinatorial genome editing could stably record other types of biological information and history in living cells.


(Left) A barcode of CRISPR/Cas9 target sites is progressively edited over many cell divisions. (Right) Edited barcode sequences are related to one another on the basis of shared mutations in order to reconstruct lineage trees.


Multicellular systems develop from single cells through distinct lineages. However, current lineage-tracing approaches scale poorly to whole, complex organisms. Here, we use genome editing to progressively introduce and accumulate diverse mutations in a DNA barcode over multiple rounds of cell division. The barcode, an array of clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 target sites, marks cells and enables the elucidation of lineage relationships via the patterns of mutations shared between cells. In cell culture and zebrafish, we show that rates and patterns of editing are tunable and that thousands of lineage-informative barcode alleles can be generated. By sampling hundreds of thousands of cells from individual zebrafish, we find that most cells in adult organs derive from relatively few embryonic progenitors. In future analyses, genome editing of synthetic target arrays for lineage tracing (GESTALT) can be used to generate large-scale maps of cell lineage in multicellular systems for normal development and disease.
