CPL - Chalmers Publication Library

Tentacle: distributed quantification of genes in metagenomes

Fredrik Boulund (Department of Mathematical Sciences, Mathematical Statistics) ; Anders Sjögren (Department of Mathematical Sciences, Mathematical Statistics) ; Erik Kristiansson (Department of Mathematical Sciences, Mathematical Statistics)
GigaScience (2047-217X). Vol. 4 (2015), article no. 40.
[Article, peer-reviewed scientific]

Background: In metagenomics, microbial communities are sequenced at increasingly high resolution, generating datasets with billions of DNA fragments. Novel methods that can efficiently process the growing volumes of sequence data are necessary for the accurate analysis and interpretation of existing and upcoming metagenomes.

Findings: Here we present Tentacle, which is a novel framework that uses distributed computational resources for gene quantification in metagenomes. Tentacle is implemented using a dynamic master-worker approach in which DNA fragments are streamed via a network and processed in parallel on worker nodes. Tentacle is modular, extensible, and comes with support for six commonly used sequence aligners. It is easy to adapt Tentacle to different applications in metagenomics and easy to integrate into existing workflows.

Conclusions: Evaluations show that Tentacle scales very well with increasing computing resources. We illustrate the versatility of Tentacle on three different use cases. Tentacle is written for Linux in Python 2.7 and is published as open source under the GNU General Public License (v3). Documentation, tutorials, installation instructions, and the source code are freely available online at: http://bioinformatics.math.chalmers.se/tentacle.
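The dynamic master-worker pattern mentioned in the abstract can be illustrated with a minimal sketch. This is not Tentacle's code or API: the names `count_gene_hits`, `worker`, and `run_master` are hypothetical, threads on one machine stand in for networked worker nodes, and a simple substring match stands in for a real sequence aligner. The sketch is written for Python 3, not the Python 2.7 that Tentacle targets. It only shows the general idea of a master handing out batches of reads that idle workers pull on demand, with the per-batch counts merged at the end.

```python
import queue
import threading
from collections import Counter


def count_gene_hits(reads, genes):
    """Toy stand-in for an aligner (assumption): a read 'hits' a gene
    if it contains that gene's marker substring."""
    hits = Counter()
    for read in reads:
        for gene, marker in genes.items():
            if marker in read:
                hits[gene] += 1
    return hits


def worker(tasks, results, genes):
    """Pull batches from the shared queue until a None sentinel arrives."""
    while True:
        batch = tasks.get()
        if batch is None:
            break
        results.put(count_gene_hits(batch, genes))


def run_master(batches, genes, n_workers=3):
    """Hand out batches dynamically and merge the per-batch counts."""
    tasks, results = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, genes))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    n_batches = 0
    for batch in batches:          # idle workers pick up whatever is next
        tasks.put(batch)
        n_batches += 1
    for _ in threads:              # one stop sentinel per worker
        tasks.put(None)

    total = Counter()
    for _ in range(n_batches):     # exactly one result per batch
        total.update(results.get())
    for t in threads:
        t.join()
    return total


if __name__ == "__main__":
    genes = {"geneA": "ACGT", "geneB": "TTTT"}
    batches = [["AACGTA", "TTTTT"], ["GGGG", "ACGTTTTT"], ["ACGT"]]
    print(run_master(batches, genes))
```

Because workers pull from a shared queue instead of receiving a fixed share up front, a slow batch does not hold up the others, which is the property the word "dynamic" usually refers to in this pattern.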

Keywords: Distributed computing, Master-worker, Next-generation sequencing, Metagenomics, Gene quantification, DNA sequence analysis, Read mapping, DNA sequencing


This record was created 2015-11-09. Last modified 2017-11-29.
CPL Pubid: 225489



Departments (Chalmers)

Department of Mathematical Sciences, Mathematical Statistics (2005-2016)


Bioinformatics (computational biology)

Chalmers infrastructure

C3SE/SNIC (Chalmers Centre for Computational Science and Engineering)

Related publications

This publication is included in:

Computational methods for analysis of fragmented sequence data