NDLI: Attribute Clustering for Grouping, Selection, and Classification of Gene Expression Data

Attribute Clustering for Grouping, Selection, and Classification of Gene Expression Data

Content Provider	ACM Digital Library
Author	Chan, Keith C. C.; Wang, Yang; Au, Wai-Ho; Wong, Andrew K. C.
Abstract	This paper presents an attribute clustering method that groups genes based on their interdependence so as to mine meaningful patterns from gene expression data. It can be used for gene grouping, selection, and classification. Partitioning a relational table into attribute subgroups allows a small number of attributes within or across the groups to be selected for analysis. By clustering attributes, the search dimension of a data mining algorithm is reduced. This reduction is especially important for data mining in gene expression data because such data typically consist of a huge number of genes (attributes) and a small number of gene expression profiles (tuples). Most data mining algorithms are developed and optimized to scale with the number of tuples rather than the number of attributes. The situation becomes even worse when the number of attributes overwhelms the number of tuples, in which case the likelihood of reporting patterns that are actually irrelevant and arise only by chance becomes rather high. For these reasons, gene grouping and selection are important preprocessing steps if data mining algorithms are to be effective when applied to gene expression data. This paper defines the problem of attribute clustering and introduces a methodology for solving it. Our proposed method groups interdependent attributes into clusters by optimizing a criterion function derived from an information measure that reflects the interdependence between attributes. By applying our algorithm to gene expression data, meaningful clusters of genes are discovered. Grouping genes based on attribute interdependence within each group helps to capture different aspects of gene association patterns in each group. Significant genes selected from each group then contain useful information for gene expression classification and identification.
To evaluate the performance of the proposed approach, we applied it to two well-known gene expression data sets and compared our results with those obtained by other methods. Our experiments show that the proposed method is able to find meaningful clusters of genes. By selecting a subset of genes that have high multiple-interdependence with others within clusters, significant classification information can be obtained. Thus, a small pool of selected genes can be used to build classifiers with a very high classification rate. From the pool, gene expressions of different categories can be identified.
Starting Page	83
Ending Page	101
Page Count	19
File Format	PDF
ISSN	1545-5963
DOI	10.1109/TCBB.2005.17
Volume Number	2
Issue Number	2
Journal	IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Language	English
Publisher	Association for Computing Machinery (ACM)
Publisher Date	2005-04-01
Access Restriction	One Nation One Subscription (ONOS)
Subject Keyword	Data mining, attribute clustering, gene selection, gene expression classification, microarray analysis.
Content Type	Text
Resource Type	Article
Subject	Genetics; Biotechnology; Applied Mathematics
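
The abstract describes grouping attributes by an information measure of their interdependence and then selecting, from each group, genes with high multiple-interdependence with the others. The sketch below illustrates that idea only; it is not taken from the paper. The specific measure (mutual information divided by joint entropy), the function names, and the toy table of discretized expression levels are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def interdependence(a, b):
    """Assumed measure: mutual information of a and b over their joint entropy.

    Ranges from 0 (independent) to 1 (each attribute determines the other).
    """
    h_joint = entropy(list(zip(a, b)))
    if h_joint == 0:
        return 0.0  # both attributes are constant; nothing to measure
    mutual_information = entropy(a) + entropy(b) - h_joint
    return mutual_information / h_joint


def representative(group, table):
    """Pick the attribute in `group` with the highest summed interdependence
    with the other attributes of the same group."""
    def score(name):
        return sum(
            interdependence(table[name], table[other])
            for other in group
            if other != name
        )
    return max(group, key=score)


# Hypothetical toy table: attribute name -> discretized level per sample.
table = {
    "g1": ["hi", "hi", "lo", "lo", "hi", "lo"],
    "g2": ["hi", "hi", "lo", "lo", "hi", "lo"],  # identical to g1
    "g3": ["lo", "lo", "hi", "hi", "lo", "hi"],  # inverse of g1
    "g4": ["hi", "lo", "hi", "lo", "lo", "hi"],  # only weakly related to g1
}

if __name__ == "__main__":
    for name in ("g2", "g3", "g4"):
        print(name, round(interdependence(table["g1"], table[name]), 3))
    print("representative:", representative(list(table), table))
```

In this toy table, g1, g2, and g3 carry the same information and score highly against one another, while g4 does not, so a single representative drawn from the first three would stand in for the group. A full method would also need a step that forms the groups, which this sketch leaves out.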

Similar Documents

Attribute clustering for grouping, selection, and classification of gene expression data

Correction to "Attribute Clustering for Grouping, Selection, and Classification of Gene Expression Data"

Correction to Attribute Clustering for Grouping, Selection, and Classification of Gene Expression Data

Development of Two-Stage SVM-RFE Gene Selection Strategy for Microarray Expression Data Analysis

Stable Gene Selection from Microarray Data via Sample Weighting

Significance of Gene Ranking for Classification of Microarray Samples

A Top-r Feature Selection Algorithm for Microarray Gene Expression Data

Attribute Clustering for Grouping, Selection, and Classification of Gene Expression Data