Managing Big Interval Data with CINTIA: the Checkpoint INTerval Array
| Content Provider | Semantic Scholar |
|---|---|
| Author | Mavlyutov, Ruslan; Cudré-Mauroux, Philippe |
| Copyright Year | 2017 |
| Abstract | Intervals have become prominent in data management as they are the main data structure to represent a number of key data types such as temporal or genomic data. Yet, there exists no solution to compactly store and efficiently query big interval data. In this paper we introduce CINTIA (the Checkpoint INTerval Index Array), an efficient data structure to store and query interval data, which achieves high memory locality and outperforms state-of-the-art solutions. We also propose a low-latency Big Data system that implements CINTIA on top of a popular distributed file system and efficiently manages large interval data on clusters of commodity machines. Our system can easily be scaled out and was designed to accommodate large delays between the various components of a distributed infrastructure. We experimentally evaluate the performance of our approach on several datasets and show that it outperforms current solutions by several orders of magnitude in distributed settings. |
| Starting Page | 1 |
| Ending Page | 1 |
| Page Count | 1 |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | https://exascale.info/assets/pdf/cintia_2017_tbd.pdf |
| Alternate Webpage(s) | https://doi.org/10.1109/TBDATA.2017.2691719 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |