CPL - Chalmers Publication Library
| Utbildning | Forskning | Styrkeområden | Om Chalmers | In English In English Ej inloggad.

Fast algorithms for finding disjoint subsequences with extremal densities

Peter Damaschke (Institutionen för data- och informationsteknik, Datavetenskap, Algoritmer (Chalmers) ; Institutionen för data- och informationsteknik, Datavetenskap, Bioinformatik (Chalmers)) ; Anders Bergkvist
Pattern Recognition (0031-3203). Vol. 39 (2006), 12, p. 2281-2292.
[Artikel, refereegranskad vetenskaplig]

We derive fast algorithms for the following problem: Given a set of n points on the real line and two parameters s and p, find s disjoint intervals of maximum total length that contain at most p of the given points. Our main contribution consists of algorithms whose time bounds improve upon a straightforward dynamic programming algorithm, in the relevant case that input size n is much bigger than parameters s and p. These results are achieved by selecting a few candidate intervals that are provably sufficient for building an optimal solution via dynamic programming. As a byproduct of this idea we improve an algorithm for a similar subsequence problem of Chen, Lu and Tang (2005). The problems are motivated by the search for significant patterns in biological data. Finally we propose several heuristics that further reduce the time complexity in typical instances. One of them leads to an apparently open subsequence sum problem of independent interest.

Nyckelord: holes in data, range prediction, protein torsion angle, protein structure prediction, dynamic programming, selection algorithms, time complexity

Denna post skapades 2006-11-01. Senast ändrad 2015-02-11.
CPL Pubid: 23003