Batch algorithms and fixed prediction rates for online Just-In-Time Software Defect Prediction

PESSOA, Dinaldo Andrade

Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/40529

Title :	Batch algorithms and fixed prediction rates for online Just-In-Time Software Defect Prediction
Author :	PESSOA, Dinaldo Andrade
Keywords :	Computational Intelligence; Software defect prediction; Verification latency; Class imbalance
Publication date :	26-Mar-2021
Publisher :	Universidade Federal de Pernambuco
Citation :	PESSOA, Dinaldo Andrade. Batch algorithms and fixed prediction rates for online Just-In-Time Software Defect Prediction. 2021. Dissertation (Master's in Computer Science) – Universidade Federal de Pernambuco, Recife, 2021.
Abstract :	Just-In-Time Software Defect Prediction (JIT-SDP) aims to predict the presence of defects in code changes at commit time, instead of inspecting modules (i.e., files or packages) offline, as is done in traditional Software Defect Prediction (SDP). In a real-world application of JIT-SDP, predictions must be made online, so that the developer is informed about a possible defect as soon as the code change is submitted and has the opportunity to inspect the change further while it is still fresh in mind. Model training, on the other hand, can be done in either an online or a batch fashion, since this problem domain has no real-time requirements. Regardless of the type of training, it is important to note that a code change is not labeled immediately after its submission to the source code repository. Labeling may take days or months, depending on the time the software development team needs to find and fix each defect. The model must therefore wait some time before it can trust the label of a code change; this waiting time is known as verification latency. Another challenge faced by a JIT-SDP model is the fluctuation of the class imbalance rate over time, a kind of concept drift known as class imbalance evolution. This work investigates the use of batch algorithms for JIT-SDP in the context of verification latency and class imbalance evolution. Compared with the state of the art, which is based on online algorithms, our approach (BORB) achieved improvements of between 2% and 11% in g-mean on 9 of the 10 investigated datasets. On only one dataset did BORB perform worse than the state-of-the-art approach, with a 2% decrease in g-mean. In addition, this work investigates predictive performance in a setting in which the model is constrained to output a fixed defect prediction rate.
More specifically, the defect prediction rate is an online rate: the number of predictions that return the defect class divided by the total number of predictions in a time interval. A fixed defect prediction rate means constraining the model to maintain the specified rate over time. The experimental results show that, under this constraint, methods that are better able to keep the defect prediction rate close to the fixed rate set by hyperparameter tuning also obtain higher predictive performance on the testing data, i.e., there is a meaningful correlation between this capability and predictive performance. The correlation coefficient between them is 0.44. This result, together with the simplicity of the approach, suggests that a fixed defect prediction rate may be used as a standard baseline for the problem of class imbalance evolution.
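The defect prediction rate defined in the abstract can be illustrated with a minimal sketch. This is not the dissertation's BORB implementation: the names `defect_prediction_rate` and `FixedRateThreshold`, the sliding-window size, and the rank-based thresholding rule are all assumptions chosen only to show one simple way a model's scores could be constrained to a fixed rate.

```python
from collections import deque


def defect_prediction_rate(predictions):
    """Fraction of predictions in an interval that returned the defect class (1)."""
    if not predictions:
        return 0.0
    return sum(predictions) / len(predictions)


class FixedRateThreshold:
    """Hypothetical wrapper (an assumption, not the author's method).

    A score is labeled as defect-inducing (1) when it ranks within the top
    `target_rate` fraction of the scores seen in a sliding window.
    """

    def __init__(self, target_rate, window=100):
        self.target_rate = target_rate
        self.scores = deque(maxlen=window)

    def predict(self, score):
        self.scores.append(score)
        ranked = sorted(self.scores, reverse=True)
        # Number of scores in the window allowed to carry the defect label.
        k = max(1, round(self.target_rate * len(ranked)))
        threshold = ranked[k - 1]
        return 1 if score >= threshold else 0
```

With scores that arrive in no particular order, the observed rate of such a wrapper should stay near `target_rate`. Scores that trend steadily upward would push it higher, since each new score then ranks at the top of the window, which is one reason the ability to hold the rate can differ between methods.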
URI :	https://repositorio.ufpe.br/handle/123456789/40529
Appears in collections:	Dissertações de Mestrado - Ciência da Computação

Files in this item:

File	Description	Size	Format
DISSERTAÇÃO Dinaldo Andrade Pessoa.pdf		4.91 MB	Adobe PDF

This item is protected by original copyright

This item is subject to a Creative Commons license.