CPL - Chalmers Publication Library

HattCI: Fast and Accurate attC site Identification Using Hidden Markov Models.

Mariana Buongermino Pereira (Department of Mathematical Sciences, Mathematical Statistics) ; Mikael Wallroth ; Erik Kristiansson (Department of Mathematical Sciences, Applied Mathematics and Statistics) ; Marina Axelson-Fisk (Department of Mathematical Sciences, Mathematical Statistics)
Journal of Computational Biology (1066-5277). Vol. 23 (2016), 11, p. 891-902.
[Article, peer-reviewed scientific]

Integrons are genetic elements that facilitate horizontal gene transfer in bacteria and are known to harbor genes associated with antibiotic resistance. Gene mobility in integrons is governed by the presence of attC sites, which are imperfect inverted repeats 55 to 141 nucleotides long. Here we present HattCI, a new method for fast and accurate identification of attC sites in large DNA data sets. The method is based on a generalized hidden Markov model that describes each core component of an attC site individually. In twofold cross-validation experiments on a manually curated reference data set of 231 attC sites from class 1 and 2 integrons, HattCI showed sensitivities of up to 91.9% while maintaining satisfactory false-positive rates. When applied to a metagenomic data set of 35 microbial communities from different environments, HattCI found a substantially higher number of attC sites in the samples that are known to contain more horizontally transferred elements. HattCI will significantly increase the ability to identify attC sites, and thus integron-mediated genes, in genomic and metagenomic data. HattCI is implemented in C and is freely available at http://bioinformatics.math.chalmers.se/HattCI.
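
As a small illustration of what "imperfect inverted repeat" means in the abstract above, the following Python sketch counts mismatches between one arm of a repeat and the reverse complement of the other. It is not HattCI's generalized hidden Markov model and makes no claim about how HattCI scores sites; the function names, the toy 7-nucleotide arms, and the mismatch threshold are all assumptions made for illustration only.

```python
# Illustrative sketch only: NOT the HattCI algorithm.
# All names, sequences, and thresholds here are hypothetical.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_repeat_mismatches(left: str, right: str) -> int:
    """Count positions where `left` differs from the reverse complement of `right`.

    Zero mismatches means a perfect inverted repeat; a small positive
    count means an imperfect one.
    """
    if len(left) != len(right):
        raise ValueError("both arms must have the same length")
    rc_right = reverse_complement(right)
    return sum(a != b for a, b in zip(left.upper(), rc_right))


def is_imperfect_inverted_repeat(left: str, right: str, max_mismatches: int = 1) -> bool:
    """True if the two arms pair up with at most `max_mismatches` mismatches
    (assumed threshold, chosen arbitrarily for this toy example)."""
    return inverted_repeat_mismatches(left, right) <= max_mismatches


if __name__ == "__main__":
    # Toy arms, not taken from any real attC site.
    print(inverted_repeat_mismatches("GTTAGGC", "GCCTAAC"))  # perfect pairing
    print(inverted_repeat_mismatches("GTTAGGC", "GCCTGAC"))  # one mismatch
```

A fixed mismatch count like this is far cruder than a probabilistic model, which can weight each position and each component of a site separately; the sketch only makes the underlying sequence property concrete.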




This record was created 2016-12-06. Last modified 2017-07-03.
CPL Pubid: 245882

 



Departments (Chalmers)

Department of Mathematical Sciences, Mathematical Statistics (2005-2016)
Department of Mathematical Sciences, Applied Mathematics and Statistics (GU)

Subject areas

Life sciences
Sustainable development
Mathematical statistics
Microbiology


Related publications

This publication is part of:


Statistical modelling and analyses of DNA sequence data with applications to metagenomics