NDLI logo
  • Content
  • Similar Resources
  • Metadata
  • Cite This
  • Log-in
  • Fullscreen
Log-in
Do not have an account? Register Now
Forgot your password? Account recovery
  1. Proceedings of the The 12th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 2016)
  2. Search Based Training Data Selection For Cross Project Defect Prediction
Loading...

Please wait, while we are loading the content...

Forecasting Communication Behavior in Student Software Projects
Estimating Story Points from Issue Reports
Search Based Training Data Selection For Cross Project Defect Prediction
On the Terms Within- and Cross-Company in Software Effort Estimation
Measuring the Stylistic Inconsistency in Software Projects using Hierarchical Agglomerative Clustering
An Empirical Evaluation of Distribution-based Thresholds for Internal Software Measures
Data Sets and Data Quality in Software Engineering: Eight Years On
Hidden Markov Models for the Prediction of Developer Involvement Dynamics and Workload
Predicting the Popularity of GitHub Repositories
A Manually Validated Code Refactoring Dataset and Its Assessment Regarding Software Maintainability

Similar Documents

...
Training data selection for cross-project defect prediction

Article

...
Learning from Open-Source Projects: An Empirical Study on Defect Prediction

Article

...
Multi-objective Cross-Project Defect Prediction

Article

...
Improving Prediction Robustness of VAB-SVM for Cross-Project Defect Prediction

Article

...
Feature and instance selection via cooperative PSO

Article

...
An Empirical Study of Training Data Selection Methods for Ranking-Oriented Cross-Project Defect Prediction

Article

...
A Search-based Training Algorithm for Cost-aware Defect Prediction

Article

...
An Empirical Study of Classifier Combination for Cross-Project Defect Prediction

Article

...
An instance selection and optimization method for multiple instance learning

Article

Search Based Training Data Selection For Cross Project Defect Prediction

Content Provider ACM Digital Library
Author Turhan, Burak Mäntylä, Mika Hosseini, Seyedrebvar
Abstract Context: Previous studies have shown that steered training data or dataset selection can lead to better performance for cross project defect prediction (CPDP). On the other hand, data quality is an issue to consider in CPDP. Aim: We aim at utilising the Nearest Neighbor (NN)-Filter, embedded in a genetic algorithm, for generating evolving training datasets to tackle CPDP, while accounting for potential noise in defect labels. Method: We propose a new search based training data (i.e., instance) selection approach for CPDP called GIS (Genetic Instance Selection) that looks for solutions to optimize a combined measure of F-Measure and GMean, on a validation set generated by (NN)-filter. The genetic operations consider the similarities in features and address possible noise in assigned defect labels. We use 13 datasets from PROMISE repository in order to compare the performance of GIS with benchmark CPDP methods, namely (NN)-filter and naive CPDP, as well as with within project defect prediction (WPDP). Results: Our results show that GIS is significantly better than (NN)-Filter in terms of F-Measure (p -- value ≪ 0.001, Cohen's d = 0.697) and GMean (p -- value ≪ 0.001, Cohen's d = 0.946). It also outperforms the naive CPDP approach in terms of F-Measure (p -- value ≪ 0.001, Cohen's d = 0.753) and GMean (p -- value ≪ 0.001, Cohen's d = 0.994). In addition, the performance of our approach is better than that of WPDP, again considering F-Measure (p -- value ≪ 0.001, Cohen's d = 0.227) and GMean (p -- value ≪ 0.001, Cohen's d = 0.595) values. Conclusions: We conclude that search based instance selection is a promising way to tackle CPDP. Especially, the performance comparison with the within project scenario encourages further investigation of our approach. However, the performance of GIS is based on high recall in the expense of low precision. Using different optimization goals, e.g. targeting high precision, would be a future direction to investigate.
Starting Page 1
Ending Page 10
Page Count 10
File Format PDF
ISBN 9781450347723
DOI 10.1145/2972958.2972964
Language English
Publisher Association for Computing Machinery (ACM)
Publisher Date 2016-09-09
Publisher Place New York
Access Restriction Subscribed
Subject Keyword Instance selection Cross project defect prediction Search based optimization Training data selection Genetic algorithms
Content Type Text
Resource Type Article
  • About
  • Disclaimer
  • Feedback
  • Sponsor
  • Contact
  • Chat with Us
About National Digital Library of India (NDLI)
NDLI logo

National Digital Library of India (NDLI) is a virtual repository of learning resources which is not just a repository with search/browse facilities but provides a host of services for the learner community. It is sponsored and mentored by Ministry of Education, Government of India, through its National Mission on Education through Information and Communication Technology (NMEICT). Filtered and federated searching is employed to facilitate focused searching so that learners can find the right resource with least effort and in minimum time. NDLI provides user group-specific services such as Examination Preparatory for School and College students and job aspirants. Services for Researchers and general learners are also provided. NDLI is designed to hold content of any language and provides interface support for 10 most widely used Indian languages. It is built to provide support for all academic levels including researchers and life-long learners, all disciplines, all popular forms of access devices and differently-abled learners. It is designed to enable people to learn and prepare from best practices from all over the world and to facilitate researchers to perform inter-linked exploration from multiple sources. It is developed, operated and maintained from Indian Institute of Technology Kharagpur.

Learn more about this project from here.

Disclaimer

NDLI is a conglomeration of freely available or institutionally contributed or donated or publisher managed contents. Almost all these contents are hosted and accessed from respective sources. The responsibility for authenticity, relevance, completeness, accuracy, reliability and suitability of these contents rests with the respective organization and NDLI has no responsibility or liability for these. Every effort is made to keep the NDLI portal up and running smoothly unless there are some unavoidable technical issues.

Feedback

Sponsor

Ministry of Education, through its National Mission on Education through Information and Communication Technology (NMEICT), has sponsored and funded the National Digital Library of India (NDLI) project.

Contact National Digital Library of India
Central Library (ISO-9001:2015 Certified)
Indian Institute of Technology Kharagpur
Kharagpur, West Bengal, India | PIN - 721302
See location in the Map
03222 282435
Mail: support@ndl.gov.in
Sl. Authority Responsibilities Communication Details
1 Ministry of Education (GoI),
Department of Higher Education
Sanctioning Authority https://www.education.gov.in/ict-initiatives
2 Indian Institute of Technology Kharagpur Host Institute of the Project: The host institute of the project is responsible for providing infrastructure support and hosting the project https://www.iitkgp.ac.in
3 National Digital Library of India Office, Indian Institute of Technology Kharagpur The administrative and infrastructural headquarters of the project Dr. B. Sutradhar  bsutra@ndl.gov.in
4 Project PI / Joint PI Principal Investigator and Joint Principal Investigators of the project Dr. B. Sutradhar  bsutra@ndl.gov.in
Prof. Saswat Chakrabarti  will be added soon
5 Website/Portal (Helpdesk) Queries regarding NDLI and its services support@ndl.gov.in
6 Contents and Copyright Issues Queries related to content curation and copyright issues content@ndl.gov.in
7 National Digital Library of India Club (NDLI Club) Queries related to NDLI Club formation, support, user awareness program, seminar/symposium, collaboration, social media, promotion, and outreach clubsupport@ndl.gov.in
8 Digital Preservation Centre (DPC) Assistance with digitizing and archiving copyright-free printed books dpc@ndl.gov.in
9 IDR Setup or Support Queries related to establishment and support of Institutional Digital Repository (IDR) and IDR workshops idr@ndl.gov.in
I will try my best to help you...
Cite this Content
Loading...