Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia

Author(s): Schrider, Daniel R.; Ayroles, Julien F.; Matute, Daniel R.; Kern, Andrew D.

To refer to this page use: http://arks.princeton.edu/ark:/88435/pr1x13n
Full metadata record
DC Field | Value | Language
dc.contributor.author | Schrider, Daniel R. | -
dc.contributor.author | Ayroles, Julien F. | -
dc.contributor.author | Matute, Daniel R. | -
dc.contributor.author | Kern, Andrew D. | -
dc.date.accessioned | 2019-04-19T18:34:15Z | -
dc.date.available | 2019-04-19T18:34:15Z | -
dc.date.issued | 2018-04-23 | en_US
dc.identifier.citation | Schrider, Daniel R., Ayroles, Julien F., Matute, Daniel R., Kern, Andrew D. (2018). Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia. PLOS Genetics, 14 (4), e1007341 - e1007341. doi:10.1371/journal.pgen.1007341 | en_US
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/pr1x13n | -
dc.description.abstract | Hybridization and gene flow between species appears to be common. Even though it is clear that hybridization is widespread across all surveyed taxonomic groups, the magnitude and consequences of introgression are still largely unknown. Thus it is crucial to develop the statistical machinery required to uncover which genomic regions have recently acquired haplotypes via introgression from a sister population. We developed a novel machine learning framework, called FILET (Finding Introgressed Loci via Extra-Trees) capable of revealing genomic introgression with far greater power than competing methods. FILET works by combining information from a number of population genetic summary statistics, including several new statistics that we introduce, that capture patterns of variation across two populations. We show that FILET is able to identify loci that have experienced gene flow between related species with high accuracy, and in most situations can correctly infer which population was the donor and which was the recipient. Here we describe a data set of outbred diploid Drosophila sechellia genomes, and combine them with data from D. simulans to examine recent introgression between these species using FILET. Although we find that these populations may have split more recently than previously appreciated, FILET confirms that there has indeed been appreciable recent introgression (some of which might have been adaptive) between these species, and reveals that this gene flow is primarily in the direction of D. simulans to D. sechellia. | en_US
dc.format.extent | e1007341 - e1007341 | en_US
dc.language.iso | en_US | en_US
dc.relation.ispartof | PLOS Genetics | en_US
dc.rights | Final published version. This is an open access article. | en_US
dc.title | Supervised machine learning reveals introgressed loci in the genomes of Drosophila simulans and D. sechellia | en_US
dc.type | Journal Article | en_US
dc.identifier.doi | doi:10.1371/journal.pgen.1007341 | -
dc.date.eissued | 2018-04-23 | en_US
dc.identifier.eissn | 1553-7404 | -
pu.type.symplectic | http://www.symplectic.co.uk/publications/atom-terms/1.0/journal-article | en_US

Files in This Item:
File | Description | Size | Format
Supervised_machine_learning_reveals_2018.pdf | - | 3.29 MB | Adobe PDF | View/Download
Supp_Info_Supervised_machine_learning_2018.zip | - | 3.34 MB | Unknown | View/Download


Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.