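Classificação de dados em modelos com resposta binária via algoritmo boosting e regressão logística (Classification of data in binary-response models via the boosting algorithm and logistic regression)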
| Content Provider | Semantic Scholar |
|---|---|
| Author | Liska, Gilberto Rodrigues; Menezes, Fortunato S. De; Cirillo, Marcelo Ângelo; Vivanco, Mário J. F. |
| Copyright Year | 2013 |
| Abstract | Classification is a natural human task, but there are situations in which humans are not best suited to perform it. The need for automatic classification methods arises in several areas, including voice recognition, tumor recognition in X-ray films, and classification of email as spam or legitimate. Given the increasing complexity and importance of such problems, there is still a need for methods that provide greater accuracy and interpretability of results. Among these methods is Boosting, which works by sequentially applying a classification algorithm to reweighted versions of the training data set. It was recently shown that Boosting can also be viewed as a method for functional estimation. Logistic regression models with parameters estimated by maximum likelihood (henceforth LRMML) are currently widely used for this kind of problem. The present study therefore aimed to compare the LRMML with a logistic regression model fitted by a Boosting algorithm, specifically the Binomial Boosting algorithm (henceforth LRMBB), and to select the model with the better fit and higher discrimination capacity for the presence or absence of coronary heart disease (CHD) as a function of several biological variables in patients, in order to provide the most accurate answer for situations in which the response is binary. To fit the models, the data set was randomly partitioned into two subsets: a training sample comprising 70% of the original set and a test set comprising the remaining 30%. The results show lower AIC and BIC values for the LRMBB model than for the LRMML model, and the Hosmer-Lemeshow test shows no evidence of poor fit for either model. The LRMBB model presented higher AUC, sensitivity, specificity, and accuracy and lower false positive and false negative rates, and therefore has better discrimination power than the LRMML model. Judging by the odds ratios, the LRMBB model gave more reliable results about the chance of a patient having CHD. Based on these results, the LRMBB model is better suited to describe the presence or absence of coronary heart disease in patients because it provides more accurate information about the problem. |
| File Format | PDF HTM / HTML |
| Volume Number | 1 |
| Alternate Webpage(s) | http://repositorio.ufla.br/bitstream/1/626/1/DISSERTA%C3%87%C3%83O%20Classifica%C3%A7%C3%A3o%20de%20dados%20em%20modelos%20com%20resposta%20bin%C3%A1ria%20via.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
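
The abstract describes a random 70/30 partition of the data and compares models by sensitivity, specificity, accuracy, and false positive and false negative rates. The following is a minimal Python sketch of those two steps only, not the authors' code: the function names `split_train_test` and `discrimination_metrics`, the fixed seed, and the toy labels are assumptions made for illustration, and no model fitting is shown.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Randomly partition rows into a training sample and a test set.

    The 0.7 default mirrors the 70% training share stated in the abstract;
    the seed is an arbitrary choice made here for reproducibility.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def discrimination_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics for binary labels (1 = disease present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),          # true positive rate
        "specificity": _ratio(tn, tn + fp),          # true negative rate
        "accuracy": _ratio(tp + tn, len(pairs)),
        "false_positive_rate": _ratio(fp, fp + tn),
        "false_negative_rate": _ratio(fn, fn + tp),
    }


# Toy labels, invented purely to exercise the functions.
train, test = split_train_test(range(10))
metrics = discrimination_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 1],
    y_pred=[1, 1, 0, 0, 0, 1, 0, 1],
)
```

In a comparison such as the one the abstract reports, these metrics would be computed on the test set once for each fitted model, with predicted classes obtained by thresholding each model's predicted probabilities; the threshold itself is not stated in the abstract.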