Batch algorithms and fixed prediction rates for online Just-In-Time Software Defect Prediction

PESSOA, Dinaldo Andrade

Use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/40529

Title:	Batch algorithms and fixed prediction rates for online Just-In-Time Software Defect Prediction
Author(s):	PESSOA, Dinaldo Andrade
Keywords:	Computational intelligence; Software defect prediction; Verification latency; Class imbalance
Document date:	26-Mar-2021
Publisher:	Universidade Federal de Pernambuco
Citation:	PESSOA, Dinaldo Andrade. Batch algorithms and fixed prediction rates for online Just-In-Time Software Defect Prediction. 2021. Master's dissertation (Computer Science) – Universidade Federal de Pernambuco, Recife, 2021.
Abstract:	Just-In-Time Software Defect Prediction (JIT-SDP) aims to predict the presence of defects in code changes at commit time, instead of inspecting modules (i.e., files or packages) offline, as is done in traditional Software Defect Prediction (SDP). In a real-world application of JIT-SDP, predictions must be made online so that the developer is informed about a possible defect as soon as the code change is submitted, giving the developer the opportunity to inspect the change further while it is still fresh in mind. Model training, on the other hand, can be done in an online or a batch fashion, since this problem domain has no real-time requirements. Regardless of the type of training, a code change is not labeled immediately after its submission to the source code repository. Labelling may take days or months, depending on the time the software development team needs to find and fix each defect. The model must therefore wait some time before it can trust the label of a code change; this waiting time is known as verification latency. Another challenge faced by a JIT-SDP model is the fluctuation of the class imbalance rate over time, a kind of concept drift known as class imbalance evolution. This work investigates the use of batch algorithms for JIT-SDP in the context of verification latency and class imbalance evolution. In comparison to the state of the art, which is based on online algorithms, our approach (BORB) achieved improvements between +2% and +11% in terms of g-mean on 9 of the 10 investigated datasets. On only one dataset did BORB achieve a result inferior to the state-of-the-art approach, a decrease of −2% in terms of g-mean. In addition, this work investigates predictive performance when the model is constrained to output a fixed defect prediction rate. More specifically, the defect prediction rate is an online rate that corresponds to the number of predictions that return the defect class divided by the total number of predictions in a time interval. A fixed defect prediction rate means constraining the model to maintain the specified rate over time. The results of the experiments show that, under this constraint, methods that are better able to keep the defect prediction rate close to the fixed rate set by hyperparameter tuning also obtain higher predictive performance on the testing data, i.e., there is a meaningful correlation between this capability and predictive performance. The correlation coefficient between them is 0.44. This result, together with the simplicity of the approach, suggests that a fixed defect prediction rate may be used as a standard baseline for the problem of class imbalance evolution.
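The two quantities named in the abstract, g-mean and the defect prediction rate, can be illustrated with a minimal Python sketch. It is not taken from the dissertation and is not BORB. The g-mean function uses the standard definition (the geometric mean of the recall on each class). The function `predict_at_fixed_rate` and its parameter `target_rate` are hypothetical names, and labelling the top-scoring fraction of a time interval as defective is only one assumed way of holding the rate fixed.

```python
from math import sqrt


def g_mean(y_true, y_pred):
    """Geometric mean of recall on the defect class (1) and the clean class (0)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        return 0.0  # undefined when a class is absent; 0.0 is a convention chosen here
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return sqrt((tp / pos) * (tn / neg))


def defect_prediction_rate(y_pred):
    """Predictions returning the defect class divided by all predictions in the interval."""
    return sum(y_pred) / len(y_pred)


def predict_at_fixed_rate(scores, target_rate):
    """Hypothetical helper: mark the highest-scoring fraction `target_rate` as defects.

    `scores` are model outputs for the code changes in one time interval
    (higher means more likely defective). This is an assumed strategy,
    not the procedure used in the dissertation.
    """
    k = round(target_rate * len(scores))
    # Indices of the k highest scores in this interval.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    preds = [0] * len(scores)
    for i in top:
        preds[i] = 1
    return preds


# Illustrative scores and labels only; they are made up for this example.
scores = [0.9, 0.1, 0.4, 0.8, 0.3, 0.2, 0.7, 0.05, 0.6, 0.5]
labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
preds = predict_at_fixed_rate(scores, target_rate=0.3)
print(defect_prediction_rate(preds), g_mean(labels, preds))
```

Because exactly `k` changes are flagged per interval, the prediction rate stays at the target however the true class imbalance drifts, which is what makes such a rule usable as a simple baseline.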
URI:	https://repositorio.ufpe.br/handle/123456789/40529
Appears in collections:	Master's Dissertations - Computer Science

Files associated with this item:

File	Description	Size	Format
DISSERTAÇÃO Dinaldo Andrade Pessoa.pdf		4.91 MB	Adobe PDF

This file is protected by copyright

This item is licensed under a Creative Commons License