Inference of Ancestral Recombination Graphs through Topological Data Analysis

Author(s): Cámara, Pablo G.; Levine, Arnold J.; Rabadán, Raúl

Abstract: The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in the reconstruction of potential evolutionary histories leading to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes. However, they are computationally costly to reconstruct, usually being infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable methods that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination, and present a novel framework that can be applied to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package, called TARGet, and apply it to several examples, including small migration between different populations, human recombination, and horizontal evolution in finches inhabiting the Galápagos Islands.
Publication Date: 17-Aug-2016
Electronic Publication Date: 17-Aug-2016
Citation: Cámara, Pablo G.; Levine, Arnold J.; Rabadán, Raúl (2016). Inference of Ancestral Recombination Graphs through Topological Data Analysis. PLOS Computational Biology, 12(8), e1005071. doi:10.1371/journal.pcbi.1005071
DOI: 10.1371/journal.pcbi.1005071
EISSN: 1553-7358
Pages: e1005071
Type of Material: Journal Article
Journal/Proceeding Title: PLOS Computational Biology
Version: Final published version. This is an open access article.

Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.