CPL - Chalmers Publication Library
| Utbildning | Forskning | Styrkeområden | Om Chalmers | In English In English Ej inloggad.

Segmenting strings homogeneously via trees

Peter Damaschke (Institutionen för data- och informationsteknik, Datavetenskap, Algoritmer (Chalmers))
33rd International Workshop on Graph-Theoretic Concepts in Computer Science WG 2007, Lecture Notes in Computer Science Vol. 4769 (2007), p. 214-225.
[Konferensbidrag, refereegranskat]

We divide a string into k segments, each with only one sort of symbols, so as to minimize the total number of exceptions. Motivations come from machine learning and data mining. For binary strings we develop a linear-time algorithm for any k. Key to efficiency is a special-purpose data structure, called W-tree, which reflects relations between repetition lengths of symbols. Existence of algorithms faster than obvious dynamic programming remains open for non-binary strings. Our problem is also equivalent to finding weighted independent sets of prescribed size in paths. We show that this problem in bounded-degree graphs is FPT.

Nyckelord: segmentation, dynamic programming, tree computations, weighted independent set, interval graphs, parameterized complexity



Denna post skapades 2007-12-12. Senast ändrad 2015-02-11.
CPL Pubid: 62850