CPL - Chalmers Publication Library

Analysis of large-scale metagenomic data

Fredrik Boulund (Department of Mathematical Sciences, Mathematical Statistics)
Göteborg: Chalmers University of Technology, 2013. 64 pp.

This thesis concerns the analysis of large DNA sequence data sets produced by modern high-throughput DNA sequencing machines. Using such machines to sequence the genetic content of a microbial community produces a metagenome. The thesis comprises three research papers, all connected to the study of large metagenomic data sets.

In the first paper, we developed a method for discovering fragments of fluoroquinolone antibiotic resistance (qnr) genes in short fragments of DNA. The method uses hidden Markov models to identify the qnr genes. Cross-validation showed that the method has high statistical power even for fragments as short as 100 base pairs, a length commonly encountered in modern next-generation sequencing data.

In the second paper, a follow-up study, the putative qnr genes identified in the first paper were verified using wet-lab experiments. An expression system for qnr genes in Escherichia coli hosts was developed and used to evaluate the resistance phenotype of the novel gene candidates.

In the third paper, we developed an easy-to-use, high-performance method for distributed gene quantification in metagenomic sequence data. It leverages high-performance computing resources to provide high throughput while maintaining sensitivity, enabling efficient and accurate gene quantification suitable for comparative metagenomics.

Next-generation DNA sequencing has had a major impact on molecular biology. As the produced data sets grow, so does the need for methods suited to analyzing them. This thesis presents several new methods that are well adapted to the analysis of modern terabase-sized metagenomic data sets.

Keywords: metagenomics, DNA analysis, big data, antibiotic resistance, hidden Markov models, high-performance computing, distributed computing

This record was created 2013-09-24. Last modified 2014-12-09.
CPL Pubid: 183892


Read online

Local full text (freely available)

Departments (Chalmers)

Department of Mathematical Sciences, Mathematical Statistics (2005-2016)


Sustainable development
Bioinformatics (computational biology)
Bioinformatics and systems biology

Chalmers infrastructure

C3SE/SNIC (Chalmers Centre for Computational Science and Engineering)

Related publications

Included papers:

A novel method to discover fluoroquinolone antibiotic resistance (qnr) genes in fragmented nucleotide sequences


Date: 2013-10-18
Time: 10:15
Venue: Pascal, Matematiska Vetenskaper, Chalmers Tvärgata 3, Chalmers Tekniska Högskola, Göteborg
Opponent: Dr. Sean Hooper, The Institute of Cancer Research, London, England

Part of series

Preprint - Department of Mathematical Sciences, Chalmers University of Technology and Göteborg University 2013:17