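Estudo de Técnicas para Indexação e Recuperação de Sequências Numéricas: Segmentação Adaptativa e Processamento de Consultas em Lote (A Study of Techniques for Indexing and Retrieving Numeric Sequences: Adaptive Segmentation and Batch Query Processing)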
| Content Provider | Semantic Scholar |
|---|---|
| Author | Brito, Luiz F. A. |
| Copyright Year | 2018 |
| Abstract | Indexing structures and specialized search algorithms support similarity queries. According to the current literature, similarity queries should be fast and should minimize the space they require. In this master's thesis, we studied two approaches to meeting these requirements for numeric sequences. In the first approach, we proposed two representations that approximate sequences and yield lower-bounding measures for the Euclidean distance: Error-Bounded Piecewise Linear Approximation (EBPLA) and Adaptive Indexable Piecewise Linear Approximation (AIPLA). The novelty of both representations is that they store a set of coefficients whose size is proportional to the characteristics of each sequence. In experiments, EBPLA, although flexible, produced a high approximation error and, consequently, its lower bound was less efficient than those of the other representations. AIPLA, by contrast, provided the lowest approximation error, and its lower bound performed similarly to those of well-known representations such as Piecewise Aggregate Approximation (PAA) and Indexable Piecewise Linear Approximation (IPLA). In the second approach, we grouped query sequences submitted as batches in order to reduce the time of similarity queries. We first formed groups of queries and then searched through indexing structures, such as �-Trees and �-Trees, only once. In our experiments, we evaluated five strategies for grouping sequences. The results indicate that the best overall grouping strategy, the one that saved the most accesses to secondary memory, is to unify all queries in a single group. However, this strategy can considerably increase primary-memory usage for large batches. Therefore, in scenarios where primary memory is limited, we suggest the strategy that creates � clusters from � randomly chosen initial sequences. |
| File Format | PDF HTM / HTML |
| DOI | 10.14393/ufu.di.2018.253 |
| Alternate Webpage(s) | https://repositorio.ufu.br/bitstream/123456789/21300/1/EstudoTecnicasIndexacao.pdf |
| Alternate Webpage(s) | https://doi.org/10.14393/ufu.di.2018.253 |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
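
The abstract compares the proposed representations against Piecewise Aggregate Approximation (PAA) in terms of lower bounding the Euclidean distance, but gives no formulas. As a rough illustration of what a lower-bounding measure means, the following is a minimal sketch of the standard PAA lower bound, not the thesis's EBPLA or AIPLA. The function names (`paa`, `euclidean`, `paa_lower_bound`) and the sample data are hypothetical, and the sketch assumes the sequence length is divisible by the number of segments.

```python
import math


def paa(series, segments):
    """Reduce a sequence to the mean of each of `segments` equal-width frames."""
    n = len(series)
    if n % segments != 0:
        # Assumption of this sketch: equal-width frames only.
        raise ValueError("len(series) must be divisible by segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


def euclidean(x, y):
    """Exact Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def paa_lower_bound(paa_x, paa_y, n):
    """Distance on the PAA coefficients, scaled by sqrt(n / w).

    For sequences of original length n reduced to w segments, this value
    never exceeds the exact Euclidean distance between the originals.
    """
    w = len(paa_x)
    return math.sqrt(n / w) * euclidean(paa_x, paa_y)


# Hypothetical sample data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.0, 2.0, 5.0, 3.0, 4.0, 8.0, 7.0, 6.0]

lb = paa_lower_bound(paa(x, 4), paa(y, 4), len(x))
exact = euclidean(x, y)
```

Because the bound never overestimates the true distance, an index can discard a candidate whenever `lb` already exceeds the query radius without reading the full sequence. The tighter the bound, the fewer full sequences need to be fetched, which is the sense in which the abstract speaks of the efficiency of a lower bound.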