Loading...
Please wait, while we are loading the content...
Similar Documents
v 4 [ cs . D S ] 9 M ar 2 01 6 Space Lower Bounds for Itemset Frequency Sketches ∗ Edo Liberty
| Content Provider | Semantic Scholar |
|---|---|
| Author | Mitzenmacher, Michael Thaler, Justin Ullman, Jonathan |
| Copyright Year | 2018 |
| Abstract | Given a database, computing the fraction of rows that contai n a query itemset or determining whether this fraction is above some threshold are fundamental opera tions in data mining. A uniform sample of rows is a good sketch of the database in the sense that all su fficiently frequent itemsets and their approximate frequencies are recoverable from the sample, a nd the sketch size is independent of the number of rows in the original database. For many seemingly s imilar problems there are better sketching algorithms than uniform sampling. In this paper we show that for itemset frequency sketching this is not the case. That is, we prove that there exist classes of databa ses for which uniform sampling is a space optimal sketch for approximate itemset frequency analysis , up to constant or iterated-logarithmic factors. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://export.arxiv.org/pdf/1407.3740 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |