NDLI: Density-based spatial clustering methods for very large datasets

Content Provider	Semantic Scholar
Author	Wang, Xin
Copyright Year	2006
Abstract	Spatial data mining, or knowledge discovery in spatial databases, refers to the extraction from spatial databases of implicit knowledge, of spatial relations, or of other patterns that are not explicitly stored. Finding clusters in spatial data is an active research area in spatial data mining. The first part of this thesis proposes a novel density-based spatial clustering method called DBRS. The algorithm can identify clusters of widely varying shapes, clusters that depend on non-spatial attributes, and approximate clusters in very large databases. DBRS achieves these results by repeatedly picking an unclassified point at random and examining its neighborhood. If the neighborhood is sparsely populated or the purity of the points in the neighborhood is too low, the point is classified as noise. Otherwise, if any point in the neighborhood is part of a known cluster, this neighborhood is joined to that cluster. If neither of these two possibilities applies, a new cluster is begun with this neighborhood. The experimental results show that DBRS is not only efficient but also produces high-quality clusters. The second part of this thesis develops a constraint-based spatial clustering algorithm that deals with constraints due to obstacles and facilitators. Typically, a clustering task consists of separating a set of objects into different groups according to a measure of goodness. A common measure of goodness is Euclidean distance (i.e., straight-line distance). However, in many applications, Euclidean distance is inadequate because of the presence of obstacles and facilitators. An obstacle is a physical object that obstructs the reachability among the data objects, and a facilitator is a physical object that connects distant data objects or connects data objects across obstacles. Handling these constraints can lead to effective and fruitful data mining by capturing application semantics.
We extend DBRS to a new spatial clustering method, called DBRS+, which can handle any combination of intersecting obstacles and facilitators. DBRS+ is simple and efficient. It requires no preprocessing: the constraints are handled during the clustering process. DBRS+ has been empirically evaluated using synthetic and real data sets. The third part of this thesis emphasizes that domain knowledge can play a key role in spatial clustering. We propose a framework called ONTO_CLUST to combine a domain ontology with clustering algorithms. In the framework, we show that the clustering process should occur at the knowledge level so that users can identify their goals and understand the results. In ONTO_CLUST, the spatial clustering ontology component is used when identifying the clustering problem and the relevant data. Users' goals are used to query the ontology. The results of these queries identify the proper clustering methods and the appropriate datasets. Based on these results, clustering is conducted. The clustering result can be used for statistical analysis or it can be interpreted using the ontology. The final result is returned to the user in an understandable format.
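The DBRS loop described in the abstract (pick a random unclassified point, examine its neighborhood, then mark noise, join an existing cluster, or start a new one) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The function name `dbrs_sketch`, the parameters `eps`, `min_pts`, and `min_pur`, the definition of purity as the share of neighbors with the same non-spatial label as the picked point, and the brute-force neighborhood scan are all assumptions made for illustration.

```python
import math
import random

NOISE = -1  # label for points classified as noise


def dbrs_sketch(points, props, eps, min_pts, min_pur, seed=0):
    """Illustrative DBRS-style clustering (hypothetical names and parameters).

    points: list of (x, y) tuples; props: one non-spatial label per point.
    Returns one label per point: a cluster id >= 0, or NOISE.
    """
    rng = random.Random(seed)
    labels = [None] * len(points)  # None means "unclassified"
    next_id = 0

    # Visiting a random permutation and skipping classified points is
    # equivalent to repeatedly picking an unclassified point at random.
    order = list(range(len(points)))
    rng.shuffle(order)

    for q in order:
        if labels[q] is not None:
            continue

        # Brute-force neighborhood query (assumption: a spatial index
        # would replace this linear scan on very large datasets).
        near = [i for i, p in enumerate(points)
                if math.dist(points[q], p) <= eps]
        match = [i for i in near if props[i] == props[q]]
        purity = len(match) / len(near)  # near always contains q itself

        # Sparse or impure neighborhood: classify the point as noise.
        if len(match) < min_pts or purity < min_pur:
            labels[q] = NOISE
            continue

        # Clusters already present in this neighborhood, if any.
        touched = {labels[i] for i in match
                   if labels[i] is not None and labels[i] != NOISE}
        if touched:
            # Join the neighborhood to a known cluster; if it touches
            # several clusters, merge them under one id.
            target = min(touched)
            for i, lab in enumerate(labels):
                if lab in touched:
                    labels[i] = target
        else:
            # No known cluster nearby: begin a new one.
            target = next_id
            next_id += 1

        for i in match:
            labels[i] = target  # also relabels earlier tentative noise

    return labels
```

In this sketch a noise label is tentative: a point first marked as noise is relabeled if a later neighborhood absorbs it. Calling `dbrs_sketch(points, props, eps=1.5, min_pts=3, min_pur=0.5)` on a list of 2-D points with matching `props` returns one cluster label per point.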
File Format	PDF HTM / HTML
Alternate Webpage(s)	https://freshtea.files.wordpress.com/2009/03/xinwangthesis.pdf
Language	English
Access Restriction	Open
Content Type	Text
Resource Type	Article
