Computing clusters of correlation connected objects (2004).
| Content Provider | CiteSeerX |
|---|---|
| Author | Böhm, Christian; Kailing, Karin; Kröger, Peer; Zimek, Arthur |
| Abstract | The detection of correlations between different features in a set of feature vectors is a very important data mining task because correlation indicates a dependency between the features or some association of cause and effect between them. This association can be arbitrarily complex, i.e., one or more features might depend on a combination of several other features. Well-known methods like principal component analysis (PCA) can perfectly find correlations which are global, linear, not hidden in a set of noise vectors, and uniform, i.e., the same type of correlation is exhibited in all feature vectors. In many applications such as medical diagnosis, molecular biology, time sequences, or electronic commerce, however, correlations are not global, since the dependency between features can differ between subgroups of the set. In this paper, we propose a method called 4C (Computing Correlation Connected Clusters) to identify local subgroups of the data objects sharing a uniform but arbitrarily complex correlation. Our algorithm is based on a combination of PCA and density-based clustering (DBSCAN). Our method has a determinate result and is robust against noise. A broad comparative evaluation demonstrates the superior performance of 4C over competing methods such as DBSCAN, CLIQUE and ORCLUS. |
| File Format | |
| Publisher Date | 2004-01-01 |
| Access Restriction | Open |
| Subject Keyword | Correlation Connected Object; Feature Vector; Noise Vector; Many Application; Electronic Commerce; Local Subgroup; Principal Component Analysis; Different Subgroup; Data Object; Density-based Clustering; Correlation Connected Cluster; Determinate Result; Important Data Mining Task; Molecular Biology; Superior Performance; Medical Diagnosis; Several Feature; Broad Comparative Evaluation; Complex Correlation; Well-known Method; Time Sequence; Different Feature |
| Content Type | Text |
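
The abstract's point that a global PCA can miss correlations that hold only within subgroups can be illustrated with a small sketch. This is not the 4C algorithm and is not taken from the paper: the data, the helper name `explained_ratio`, and the noise level are all assumptions made for illustration. Two synthetic subgroups follow opposite linear trends (roughly y = x and y = -x); PCA on the pooled data sees almost no dominant direction, while PCA on each subgroup alone recovers a near-perfect one.

```python
import math
import random


def explained_ratio(points):
    """Share of total variance captured by the first principal component
    of a list of 2-D points (hypothetical helper, illustration only)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Largest eigenvalue of the symmetric 2x2 covariance matrix, closed form.
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    return (half_trace + root) / (sxx + syy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
xs = [rng.uniform(-1, 1) for _ in range(500)]

# Assumed synthetic subgroups with opposite local correlations.
group_a = [(x, x + rng.gauss(0, 0.01)) for x in xs]   # y is roughly x
group_b = [(x, -x + rng.gauss(0, 0.01)) for x in xs]  # y is roughly -x

global_ratio = explained_ratio(group_a + group_b)  # pooled data
local_a = explained_ratio(group_a)                 # subgroup A alone
local_b = explained_ratio(group_b)                 # subgroup B alone

print(f"global: {global_ratio:.3f}, local A: {local_a:.3f}, local B: {local_b:.3f}")
```

In the pooled data the two trends cancel, so the first component captures only about half of the variance, whereas each subgroup on its own is almost perfectly one-dimensional. Finding such subgroups automatically is the problem the abstract says 4C addresses by combining PCA with density-based clustering.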