CPL - Chalmers Publication Library
| Utbildning | Forskning | Styrkeområden | Om Chalmers | In English In English Ej inloggad.

Incremental haplotype inference, phylogeny, and almost bipartite graphs

Peter Damaschke (Institutionen för datavetenskap, Bioinformatik ; Institutionen för datavetenskap, Algoritmer)
2nd RECOMB Satellite Workshop on Computational Methods for SNPs and Haplotypes, Proceedings (Prepint), Carnegie Mellon University, Pittsburgh 2004 p. 1-11. (2004)
[Konferensbidrag, refereegranskat]

The paper addresses the combinatorial problem of inferring the unknown haplotypes in a population, given a sample of genotypes, under the assumption that the population forms a perfect phylogeny (PP). It is important because physical haplotyping by DNA sequencing is expensive, whereas genotypes are easier to obtain. Since PPs appear naturally and quite frequently, PP haplotyping is a favourable approach to reliable haplotype inference. Since Gusfield's paper from 2002, a few different algorithms have been proposed. Here we show that an extremely simple algorithm identifies, under the random mating assumption, all sufficiently frequent haplotypes in a random sample of genotypes of asymptotically optimal size. Missing data can also be recovered if they are not too prevalent. Moreover, the idea of the algorithm easily extends to more general population structures than PP. We also solve a problem we call almost 2-coloring of graphs, which arises in an enhanced version of our haplotyping algorithm. We show that the solution space of this graph problem can be computed in linear time.

Nyckelord: haplotype phasing, perfect phylogeny, random mating, bipartite graphs

Denna post skapades 2006-09-25. Senast ändrad 2015-02-11.
CPL Pubid: 1669


Institutioner (Chalmers)

Institutionen för datavetenskap, Bioinformatik (2002-2004)
Institutionen för datavetenskap, Algoritmer (2002-2004)


Information Technology

Chalmers infrastruktur