COMOVI: a framework for data transformation in credit behavior scoring applications based on model-driven development
| Content Provider | Semantic Scholar |
|---|---|
| Author | Oliveira Neto, Rosalvo Ferreira de |
| Copyright Year | 2015 |
| Abstract | The pre-processing stage of knowledge discovery projects is costly, generally taking between 50% and 80% of total project time. In this stage, data in a relational database are transformed so that a data mining technique can be applied. The task is complex and demands strong interaction between database designers and experts with broad knowledge of the application domain. Existing frameworks that aim to systematize the data transformation stage have significant limitations when applied to behavior solutions such as Credit Behavior Scoring solutions, whose goal is to help financial institutions decide whether to grant credit to consumers based on the credit risk of their requests. This work proposes a framework based on Model Driven Development to systematize this stage in Credit Behavior Scoring solutions. It is composed of a meta-model, which maps the domain concepts, and a set of transformation rules. The work makes three main contributions: 1) improving the discriminant power of data mining techniques by constructing new input variables that embed new knowledge for the technique; 2) reducing data transformation time through automatic code generation; and 3) allowing artificial intelligence and statistics modelers to perform the data transformation without the help of database experts. To validate the proposed framework, two comparative studies were conducted. The first compared the performance of the proposed framework with that of the main existing frameworks found in the literature on two databases: one from a known benchmark of an international competition organized by PKDD, and another obtained from one of the biggest retail companies in Brazil, which has its own private label credit card. The RelAggs and Correlation-based Multiple View Validation frameworks were chosen as representatives of the propositional and relational data mining approaches, respectively. The comparison was carried out with 10-fold stratified cross-validation in order to define the confidence intervals. The results show that, at a confidence level of 95%, the proposed framework delivers performance equivalent or superior to that of the existing frameworks, measured by the area under the ROC curve with a Multilayer Perceptron neural network, k-nearest neighbors and Random Forest as classifiers. The second comparative study verified the reduction in the time required for data transformation with the proposed framework. Seven teams of students from a Brazilian university measured the runtime of this stage with and without the proposed framework. The paired Wilcoxon signed-rank test showed that the proposed framework reduces data transformation time at a confidence level of 95%. |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | https://repositorio.ufpe.br/bitstream/123456789/17330/1/Tese_Rosalvo_Neto_CIN_2015.pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
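
The abstract's second study compares seven paired measurements (each team's time with and without the framework) using the paired Wilcoxon signed-rank test. The following is a minimal sketch of an exact one-sided version of that test, written with the Python standard library only. The function name `wilcoxon_signed_rank_exact` and the timing values are hypothetical, chosen purely for illustration; they are not data from the thesis, and the thesis may have used a different variant of the test (for example two-sided, or with a normal approximation).

```python
from itertools import product


def wilcoxon_signed_rank_exact(before, after):
    """One-sided exact paired Wilcoxon signed-rank test (illustrative sketch).

    Alternative hypothesis: values in `after` tend to be smaller than the
    paired values in `before`. Returns (w_plus, p_value), where w_plus is
    the sum of the ranks of the positive differences (before - after).
    """
    # Paired differences; zero differences are dropped, as is conventional.
    diffs = [b - a for b, a in zip(before, after) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Exact null distribution: under H0 each rank is equally likely to
    # carry a positive or a negative sign, so enumerate all 2**n cases.
    at_least_as_extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if w >= w_plus:
            at_least_as_extreme += 1
    return w_plus, at_least_as_extreme / 2 ** n


# Hypothetical transformation times in hours for seven teams (NOT thesis data).
without_framework = [10.0, 12.5, 9.0, 14.0, 11.0, 13.5, 8.0]
with_framework = [6.0, 7.0, 6.0, 8.0, 7.5, 6.0, 5.5]

w_plus, p_value = wilcoxon_signed_rank_exact(without_framework, with_framework)
print(w_plus, p_value)  # 28.0 0.0078125
```

With seven pairs, the smallest one-sided p-value the exact test can produce is 1/128 (about 0.0078), reached when every team is faster with the framework, so a result significant at the 5% level is attainable even with this small sample.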