A convolutional neural network approach for speech quality assessment

ALBUQUERQUE, Renato Quirino de

Use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/38524

Title:	A convolutional neural network approach for speech quality assessment
Author(s):	ALBUQUERQUE, Renato Quirino de
Keywords:	Computer science; Convolutional neural networks
Issue date:	20-Feb-2020
Publisher:	Universidade Federal de Pernambuco
Citation:	ALBUQUERQUE, Renato Quirino de. A convolutional neural network approach for speech quality assessment. 2020. Dissertation (Master's in Computer Science) - Universidade Federal de Pernambuco, Recife, 2020.
Abstract:	An important aspect of speech understanding is quality, which can be defined as the fidelity of a signal to its original (or idealized) version when a comparison is possible. Although quality is subjective, there are approaches to measuring it. The most effective approach is subjective testing, in which listeners evaluate the quality of speech samples and assign them quality scores. However, automatic measurement methods operate at lower cost and produce faster responses. Such methods can be divided into those that use only the sample under evaluation (non-reference) and those that use both the degraded and reference versions of the speech sample (full-reference). In many current applications the original speech sample cannot be obtained, which calls for the development and application of non-reference techniques. This dissertation therefore presents a convolutional neural network model for speech quality assessment (CNN-SQA). It is a non-reference method that applies fully convolutional layers as feature extractors for speech representation, followed by fully-connected layers that perform the quality assessment step. Several visual features were evaluated as model input, with MFCC coefficients producing the best results. Other parameters of the model were obtained through an iterative and incremental parameter selection process. The model's performance was evaluated by comparison with the PESQ, ViSQOL and P.563 methods. Further experiments analyze the model's behavior on speech and noise in isolation. The experiments were carried out on publicly available databases, as well as on a new database built to evaluate the method in the context of background noise. Finally, the results were analyzed using correlation measures and descriptive statistics.
URI:	https://repositorio.ufpe.br/handle/123456789/38524
Appears in collections:	Master's Dissertations - Computer Science
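
The abstract states that results were analyzed using correlation measures, without naming them on this page. As a minimal sketch, assuming Pearson and Spearman correlation between subjective scores and model predictions (a common choice for this kind of evaluation, not confirmed by this record), the comparison could look like the following. The function names and the score lists are hypothetical and purely illustrative; they are not data from the dissertation.

```python
import math


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical scores for illustration only (not from the dissertation).
subjective_scores = [1.2, 2.5, 3.1, 3.9, 4.6]
predicted_scores = [1.5, 2.2, 3.4, 3.7, 4.4]

print("Pearson:", pearson(subjective_scores, predicted_scores))
print("Spearman:", spearman(subjective_scores, predicted_scores))
```

Pearson measures how linearly the predictions track the subjective scores, while Spearman only checks whether the two orderings agree, so reporting both separates ranking errors from scale errors.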

Files in this item:

File	Description	Size	Format
DISSERTAÇÃO Renato Quirino de Albuquerque.pdf		2.8 MB	Adobe PDF

This file is protected by copyright

This item is licensed under a Creative Commons License