Sparse cluster analysis of large-scale discrete variables with application to single nucleotide polymorphism data
| Content Provider | Scilit |
|---|---|
| Author | Wu, Baolin |
| Copyright Year | 2013 |
| Description | Extremely large-scale genetic data present significant challenges for cluster analysis. Most existing clustering methods are built on Euclidean distance and geared toward continuous responses. They work well for clustering data such as microarray gene expression measurements, but often perform poorly on data such as large-scale single nucleotide polymorphism data. In this paper, we study the penalized latent class model for clustering extremely large-scale discrete data. The penalized latent class model accounts for the discrete nature of the response through appropriate generalized linear models and adopts the lasso penalized likelihood approach for simultaneous model estimation and selection of important covariates. We develop very efficient numerical algorithms for model estimation based on the iterative coordinate descent approach, and further develop the Expectation-Maximization algorithm to incorporate and model missing values. We use simulation studies and applications to the international HapMap single nucleotide polymorphism data to illustrate the competitive performance of the penalized latent class model. |
| Related Links | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3601766/pdf |
| Ending Page | 367 |
| Page Count | 10 |
| Starting Page | 358 |
| ISSN | 0266-4763 |
| e-ISSN | 1360-0532 |
| DOI | 10.1080/02664763.2012.743977 |
| Journal | Journal of Applied Statistics |
| Issue Number | 2 |
| Volume Number | 40 |
| Language | English |
| Publisher | Informa UK Limited |
| Publisher Date | 2013-02-01 |
| Access Restriction | Open |
| Subject Keyword | Journal: Journal of Applied Statistics; Statistics and Probability; Clustering; Expectation-maximization Algorithm; Latent Class Model; Lasso; K-means; Principal Components; Single Nucleotide Polymorphism; Sparse Clustering |
| Content Type | Text |
| Resource Type | Article |
| Subject | Statistics and Probability; Statistics, Probability and Uncertainty |