### Skapa referens, olika format (klipp och klistra)

**Harvard**

Damaschke, P. (2015) *Pairs covered by a sequence of sets*.

** BibTeX **

@conference{

Damaschke2015,

author={Damaschke, Peter},

title={Pairs covered by a sequence of sets},

booktitle={20th International Symposium on Fundamentals of Computation Theory FCT 2015, Lecture Notes in Computer Science},

isbn={978-3-319-22176-2},

pages={214-226},

abstract={Enumerating minimal new combinations of elements in a sequence of sets is interesting, e.g., for novelty detection in a stream of texts. The sets are the bags of words occuring in the texts. We focus on new pairs of elements as they are abundant. By simple data structures we can enumerate them in quadratic time, in the size of the sets, but large intersections with earlier sets rule
out all pairs therein in linear time. The challenge is to use this observation efficiently. We give a greedy heuristic based on the twin graph, a succinct description of the pairs covered by a set family, and on finding good candidate sets by random sampling. The heuristic is motivated and supported by several related complexity results: sample size estimates, hardness of maximal coverage of pairs, and approximation guarantees when a few sets cover almost all pairs. },

year={2015},

}

** RefWorks **

RT Conference Proceedings

SR Electronic

ID 220926

A1 Damaschke, Peter

T1 Pairs covered by a sequence of sets

YR 2015

T2 20th International Symposium on Fundamentals of Computation Theory FCT 2015, Lecture Notes in Computer Science

SN 978-3-319-22176-2

SP 214

OP 226

AB Enumerating minimal new combinations of elements in a sequence of sets is interesting, e.g., for novelty detection in a stream of texts. The sets are the bags of words occuring in the texts. We focus on new pairs of elements as they are abundant. By simple data structures we can enumerate them in quadratic time, in the size of the sets, but large intersections with earlier sets rule
out all pairs therein in linear time. The challenge is to use this observation efficiently. We give a greedy heuristic based on the twin graph, a succinct description of the pairs covered by a set family, and on finding good candidate sets by random sampling. The heuristic is motivated and supported by several related complexity results: sample size estimates, hardness of maximal coverage of pairs, and approximation guarantees when a few sets cover almost all pairs.

LA eng

DO 10.1007/978-3-319-22177-9_17

LK http://dx.doi.org/10.1007/978-3-319-22177-9_17

OL 30