Understanding confusion in code reviews

EBERT, Felipe

Use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/33481

Title:	Understanding confusion in code reviews
Author(s):	EBERT, Felipe
Keywords:	Software engineering; Code review
Document date:	15-Feb-2019
Publisher:	Universidade Federal de Pernambuco
Abstract:	Code review is the systematic examination of a code change and an important practice for software quality assurance. Its benefits are well known: fewer defects, improved software quality, and knowledge transfer. Nevertheless, code reviews can also incur costs on software development projects, as they can delay the merge of a code change and, consequently, slow down the overall development process. Furthermore, performing a code review is not necessarily an easy task: it typically requires developers to know the code change and the context of the system. Hence, the merge of a code change can be further delayed if reviewers have difficulty understanding the change. In fact, understanding the code change and its context is one of the main issues reviewers face during a code review. In this thesis, we tackle two important problems related to confusion in code reviews: the lack of knowledge in the research community about confusion in code reviews, and the lack of tools for identifying confusion in code review comments. In the first study, we address the first problem: we build an understanding of what constitutes confusion by developing a definition of confusion and a confusion coding scheme. To address the second problem, we then manually annotate several code review comments and build an automated approach for detecting confusion. Our classifiers achieve considerable performance in classifying confusion. Moreover, to improve the current understanding of confusion in code reviews, we conduct a second study aimed at identifying the reasons for confusion, its impacts, and how developers cope with it. To that end, we re-annotate the aforementioned code review comments and conduct a survey of developers. Based on our findings, we provide a model of confusion in context with 30 reasons for confusion, 14 impacts, and 13 coping strategies.
The most frequent reasons for confusion are missing rationale and discussion of non-functional requirements of the solution. The most common impacts of confusion are delays in the merge decision and a decrease in review quality. The most common strategies developers adopt to cope with confusion are requesting information and improving familiarity with existing code. During the first two studies, we observed that identifying confusion in questions is a challenging task and that communicative intentions are one of the reasons for confusion. Hence, in the third study, we conduct an in-depth analysis of the communicative intention of developers’ questions in code reviews. We categorise 499 questions into 12 different categories of intentions. Even though information-seeking questions form the largest category, they still represent fewer than half of the annotated sample. These results suggest that developers use questions in code review to serve a wider variety of communicative purposes, including suggestions, requests for action, and criticism.
Description:	LIMA FILHO, Fernando José Castor de, is also known in bibliographic citations as: CASTOR FILHO, Fernando
URI:	https://repositorio.ufpe.br/handle/123456789/33481
Appears in collections:	Doctoral Theses - Computer Science

Files associated with this item:

File	Description	Size	Format
TESE Felipe Ebert.pdf		4.02 MB	Adobe PDF

This file is protected by copyright

This item is licensed under a Creative Commons License