CPL - Chalmers Publication Library

A New Order Estimator for Fixed and Variable Length Markov Models with Applications to DNA Sequence Similarity

Daniel Dalevi (Department of Computer Science and Engineering, Computing Science, Bioinformatics (Chalmers)) ; Devdatt Dubhashi (Department of Computer Science and Engineering, Computing Science (Chalmers)) ; Malte Hermansson
Statistical Applications in Genetics and Molecular Biology (1544-6115). Vol. 5 (2006), 1, p. i-24.
[Article, peer-reviewed scientific]

Recently Peres and Shields discovered a new method for estimating the order of a stationary fixed order Markov chain. They showed that the estimator is consistent by proving a threshold result. While this threshold is valid asymptotically in the limit, it is not very useful for DNA sequence analysis where data sizes are moderate. In this paper we give a novel interpretation of the Peres-Shields estimator as a sharp transition phenomenon. This yields a precise and powerful estimator that quickly identifies the core dependencies in data. We show that it compares favorably to other estimators, especially in the presence of variable dependencies. Motivated by this last point, we extend the Peres-Shields estimator to Variable Length Markov Chains. We compare it to a well-established estimator and show that it is superior in terms of the predictive likelihood. We give an application to the problem of detecting DNA sequence similarity in plasmids. Copyright ©2006 The Berkeley Electronic Press. All rights reserved.

Keywords: computational biology/bioinformatics, statistical models, statistical theory and methods


Article 8.



This record was created 2008-01-08. Last modified 2016-06-17.
CPL Pubid: 64806

 



Departments (Chalmers)

Department of Computer Science and Engineering, Computing Science, Bioinformatics (Chalmers)
Department of Computer Science and Engineering, Computing Science (Chalmers)
Department of Cell and Molecular Biology (1994-2011)

Subject areas

Bioinformatics and systems biology
