An Improved CNN Model for Within-Project Software Defect Prediction

Bibliographic Details
Journal Title:	Applied Sciences
Authors and Corporations:	Pan, Cong; Lu, Minyan; Xu, Biao; Gao, Houleng
In:	Applied Sciences, 9, 2019, 10, p. 2138
Format:	E-Article
Language:	English
Published:	MDPI AG
Subjects:	Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

author_facet	Pan, Cong, Lu, Minyan, Xu, Biao, Gao, Houleng, Pan, Cong, Lu, Minyan, Xu, Biao, Gao, Houleng
author	Pan, Cong, Lu, Minyan, Xu, Biao, Gao, Houleng
spellingShingle	Pan, Cong, Lu, Minyan, Xu, Biao, Gao, Houleng, Applied Sciences, An Improved CNN Model for Within-Project Software Defect Prediction, Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
author_sort	pan, cong
spelling	Pan, Cong Lu, Minyan Xu, Biao Gao, Houleng 2076-3417 MDPI AG Fluid Flow and Transfer Processes Computer Science Applications Process Chemistry and Technology General Engineering Instrumentation General Materials Science http://dx.doi.org/10.3390/app9102138 <jats:p>To improve software reliability, software defect prediction is used to find software bugs and prioritize testing efforts. Recently, some researchers introduced deep learning models, such as the deep belief network (DBN) and the state-of-the-art convolutional neural network (CNN), and used automatically generated features extracted from abstract syntax trees (ASTs) and deep learning models to improve defect prediction performance. However, the research on the CNN model failed to reveal clear conclusions due to its limited dataset size, insufficiently repeated experiments, and outdated baseline selection. To solve these problems, we built the PROMISE Source Code (PSC) dataset to enlarge the original dataset in the CNN research, which we named the Simplified PROMISE Source Code (SPSC) dataset. Then, we proposed an improved CNN model for within-project defect prediction (WPDP) and compared our results to existing CNN results and an empirical study. Our experiment was based on a 30-repetition holdout validation and a 10 * 10 cross-validation. Experimental results showed that our improved CNN model was comparable to the existing CNN model, and it outperformed the state-of-the-art machine learning models significantly for WPDP. Furthermore, we defined hyperparameter instability and examined the threat and opportunity it presents for deep learning models on defect prediction.</jats:p> An Improved CNN Model for Within-Project Software Defect Prediction Applied Sciences
doi_str_mv	10.3390/app9102138
facet_avail	Online, Free
finc_class_facet	Physik, Informatik, Chemie und Pharmazie, Technik, Allgemeines
format	ElectronicArticle
fullrecord	blob:ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMzM5MC9hcHA5MTAyMTM4
id	ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMzM5MC9hcHA5MTAyMTM4
institution	DE-Brt1, DE-Zwi2, DE-D161, DE-Zi4, DE-Gla1, DE-15, DE-Pl11, DE-Rs1, DE-14, DE-105, DE-Ch1, DE-L229, DE-D275, DE-Bn3
imprint	MDPI AG, 2019
imprint_str_mv	MDPI AG, 2019
issn	2076-3417
issn_str_mv	2076-3417
language	English
mega_collection	MDPI AG (CrossRef)
match_str	pan2019animprovedcnnmodelforwithinprojectsoftwaredefectprediction
publishDateSort	2019
publisher	MDPI AG
recordtype	ai
record_format	ai
series	Applied Sciences
source_id	49
title	An Improved CNN Model for Within-Project Software Defect Prediction
title_unstemmed	An Improved CNN Model for Within-Project Software Defect Prediction
title_full	An Improved CNN Model for Within-Project Software Defect Prediction
title_fullStr	An Improved CNN Model for Within-Project Software Defect Prediction
title_full_unstemmed	An Improved CNN Model for Within-Project Software Defect Prediction
title_short	An Improved CNN Model for Within-Project Software Defect Prediction
title_sort	an improved cnn model for within-project software defect prediction
topic	Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
url	http://dx.doi.org/10.3390/app9102138
publishDate	2019
physical	2138
description	To improve software reliability, software defect prediction is used to find software bugs and prioritize testing efforts. Recently, some researchers introduced deep learning models, such as the deep belief network (DBN) and the state-of-the-art convolutional neural network (CNN), and used automatically generated features extracted from abstract syntax trees (ASTs) and deep learning models to improve defect prediction performance. However, the research on the CNN model failed to reveal clear conclusions due to its limited dataset size, insufficiently repeated experiments, and outdated baseline selection. To solve these problems, we built the PROMISE Source Code (PSC) dataset to enlarge the original dataset in the CNN research, which we named the Simplified PROMISE Source Code (SPSC) dataset. Then, we proposed an improved CNN model for within-project defect prediction (WPDP) and compared our results to existing CNN results and an empirical study. Our experiment was based on a 30-repetition holdout validation and a 10 × 10 cross-validation. Experimental results showed that our improved CNN model was comparable to the existing CNN model, and it outperformed the state-of-the-art machine learning models significantly for WPDP. Furthermore, we defined hyperparameter instability and examined the threat and opportunity it presents for deep learning models on defect prediction.
container_issue	10
container_start_page	0
container_title	Applied Sciences
container_volume	9
format_de105	Article, E-Article
format_de14	Article, E-Article
format_de15	Article, E-Article
format_de520	Article, E-Article
format_de540	Article, E-Article
format_dech1	Article, E-Article
format_ded117	Article, E-Article
format_degla1	E-Article
format_del152	Buch
format_del189	Article, E-Article
format_dezi4	Article
format_dezwi2	Article, E-Article
format_finc	Article, E-Article
format_nrw	Article, E-Article
_version_	1792346215733526528
geogr_code	not assigned
last_indexed	2024-03-01T17:35:36.389Z
geogr_code_person	not assigned
openURL	url_ver=Z39.88-2004&ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fvufind.svn.sourceforge.net%3Agenerator&rft.title=An+Improved+CNN+Model+for+Within-Project+Software+Defect+Prediction&rft.date=2019-05-24&genre=article&issn=2076-3417&volume=9&issue=10&pages=2138&jtitle=Applied+Sciences&atitle=An+Improved+CNN+Model+for+Within-Project+Software+Defect+Prediction&aulast=Gao&aufirst=Houleng&rft_id=info%3Adoi%2F10.3390%2Fapp9102138&rft.language%5B0%5D=eng