Loading...
Please wait, while we are loading the content...
Similar Documents
Approximate Nearest Neighbors
| Content Provider | Semantic Scholar |
|---|---|
| Author | Peled, Sariel Har |
| Copyright Year | 2006 |
| Abstract | The main topic of this chapter is the following general algorithmic problem, called nearest neighbor search: Given a set P of n points in some metric space (X, ρ), we would like to construct a data structure such that, for any point x ∈ X (called a query point), we can quickly find a nearest neighbor (with respect to the metric ρ) of x in P , i.e., a point p * ∈ P such that ρ(p * , x) ≤ ρ(p, x) for all p ∈ P. Of course, the point p * with this property need not be uniquely defined, i.e., there can be several nearest neighbors, but we do not let that deter us and denote any of them by NN(x, P) (resolving ties arbitrarily, if necessary). Obviously, there is a trivial solution to this task: Assuming that we have some way of computing the distance ρ(x, y) for any two points in X, we can simply store the input points P (in a list, say). Then, given a query point q, we can go through the list of points p ∈ P one by one, compute the distance ρ(p, q) for each of them. As we go along, we keep track of the minimum distance (and the minimizing point) computed so far. This procedure requires linear storage (assuming that we can store each point in constant space, otherwise there will be some correction factor) and a linear number of distance computations. This kind of problem arises in numerous application areas, such as information retrieval, machine learning, computer vision etc., to drop just a few keywords. For a detailed survey of nearest neighbor search in various settings , including many references, we refer the reader to the article by Indyk [4]. Typically, the data sets P that arise in the applications are very large (n could be over a billion), and it is not very desirable to have to go through the entire data set for each query. Instead, we would like to preprocess P into some data structure that allows us to anser each query in substantially sublin-ear time. Ideally, we would like to achieve a query time that is logarithmic or polylogarithmic in n, while keeping the data structure " small " , ideally of size that is near-linear or polynomial (of low degree) in n. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://graphics.stanford.edu/courses/cs468-06-fall/Slides/danielr.pdf |
| Alternate Webpage(s) | http://www.graphics.stanford.edu/courses/cs468-06-fall/Slides/danielr.pdf |
| Alternate Webpage(s) | https://graphics.stanford.edu/courses/cs468-06-fall/Slides/danielr.pdf |
| Alternate Webpage(s) | http://www.ti.inf.ethz.ch/ew/courses/ApproxGeom06/notes/ANN.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |