Learning to detect text-code inconsistencies with weak and manual supervision

SOUZA, Beatriz Bezerra de

Por favor, use este identificador para citar o enlazar este ítem: https://repositorio.ufpe.br/handle/123456789/49318

Comparte esta pagina

Título :	Learning to detect text-code inconsistencies with weak and manual supervision
Autor :	SOUZA, Beatriz Bezerra de
Palabras clave :	Engenharia de software; Detecção de inconsistência
Fecha de publicación :	15-feb-2023
Editorial :	Universidade Federal de Pernambuco
Citación :	SOUZA, Beatriz Bezerra de. Learning to detect text-code inconsistencies with weak and manual supervision. 2023. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2023.
Resumen :	Source code often is associated with a natural language summary, enabling developers to understand the behavior and intent of the code. For example, method-level comments summarize the behavior of a method and test descriptions summarize the intent of a test case. Unfortunately, the text and its corresponding code sometimes are inconsistent, which may hinder code understanding, code reuse, and code maintenance. We propose TCID, an approach for Text-Code Inconsistency Detection, which trains a neural model to distinguish consistent from inconsistent text-code pairs. Our key contribution is to combine two ways of training such a model. First, TCID performs weakly supervised pre-training based on large amounts of consistent examples extracted from code as-is and inconsistent examples created by randomly recombining text-code pairs. Then, TCID fine-tunes the model based on a small and curated set of manually labeled examples. This combination is motivated by the observation that weak supervision alone leads to models that generalize poorly to real-world inconsistencies. Our evaluation applies the two-step training procedure to four state-of-the-art models and evaluates it on two text-vs-code problems: 40.7K method-level comments checked against the corresponding Java method body, and—as a problem not considered in prior work— 338.8K test case descriptions checked against corresponding JavaScript implementations. Our results show that a small amount of manual labeling enables the approach to significantly improve effectiveness, outperforming the current state of the art and improving the F1 score by 5% in Java and by 17% in JavaScript. We validate the usefulness of TCID’s predictions by submitting pull requests, of which 10 have been accepted so far.
URI :	https://repositorio.ufpe.br/handle/123456789/49318
Aparece en las colecciones:	Dissertações de Mestrado - Ciência da Computação

Ficheros en este ítem:

Fichero	Descripción	Tamaño	Formato
DISSERTAÇÃO Beatriz Bezerra de Souza.pdf		668.67 kB	Adobe PDF	View/Open

Este ítem está protegido por copyright original

Visualizar la licencia

Mostrar el registro Dublin Core completo del ítem Recomiende este ítem

Este ítem está sujeto a una licencia Creative Commons Licencia Creative Commons