Parallel SuperFine—A tool for fast and accurate supertree estimation: Features and limitations

Bibliographic details
Main author: Neves, Diogo Telmo (author)
Other authors: Sobral, João Luís Ferreira (author)
Format: article
Language: eng
Published: 2017
Subjects:
Full text: http://hdl.handle.net/1822/53188
Country: Portugal
OAI: oai:repositorium.sdum.uminho.pt:1822/53188
Description
Abstract: Computing evolutionary relationships on data sets containing hundreds to thousands of taxa easily becomes a daunting task. With recent advances in next-generation sequencing technologies, biological data sets are growing at an unprecedented pace. This growth makes it much harder, whether in terms of complexity or scale, to conduct analyses over such large data sets. Therefore, phylogenetics requires new algorithms, methods, and tools that take advantage of parallel hardware and can handle the unprecedented growth of biological data. In this paper, we present Parallel SuperFine, a tool for fast and accurate supertree estimation, and its features. Parallel SuperFine was derived from SuperFine, a state-of-the-art supertree (meta)method. We describe an extension made to SuperFine that significantly improves its performance, and how the EPIC framework is used to boost the overall performance of Parallel SuperFine. Additionally, we pinpoint current limitations that prevent it from attaining (even) better performance. Our studies reveal that Parallel SuperFine significantly reduces the time required to perform supertree estimation. Moreover, we show that Parallel SuperFine exhibits good scalability, even in the presence of asymmetric biological data sets. Furthermore, the results allow us to conclude that the radical improvement in performance does not impair tree accuracy, which is a key issue in phylogenetic inference.