Clustering Relational Database Entities using K-means
| Content Provider | CiteSeerX |
|---|---|
| Author | Bourennani, Farid; Guennoun, Mouhcine; Zhu, Ying |
| Abstract | The fast evolution of hardware and the Internet has made large volumes of data more accessible. This data is composed of heterogeneous data types such as text, numbers, multimedia, and others. Non-overlapping research communities work on processing homogeneous data types. Nevertheless, from the user perspective, these heterogeneous data types should behave and be accessed in a similar fashion. Processing heterogeneous data types, known as Heterogeneous Data Mining (HDM), is a complex task. However, HDM by Unified Vectorization (HDM-UV) seems to be an appropriate solution for this problem because it allows the heterogeneous data types to be processed simultaneously. In this paper, we use K-means and Self-Organizing Maps (SOM) to process textual and numerical data types simultaneously by UV. We evaluate how HDM-UV improves the clustering results of these two algorithms (SOM, K-means) by comparing them to traditional homogeneous data processing. Furthermore, we compare the clustering results of the two algorithms applied to a data integration problem. |
| File Format | |
| Access Restriction | Open |
| Subject Keyword | Heterogeneous Data Type; Relational Database Entity; Self-organizing Map; Numerical Data Type; Non-overlapping Research Community; User Perspective; Unified Vectorization; Large Volume; Appropriate Solution; Heterogeneous Data Mining; Data Integration Problem; Fast Evolution; Homogeneous Data Type; Complex Task; Similar Fashion; Problem Because; Traditional Homogeneous Data Processing |
| Content Type | Text |