Skip navigation
Use este identificador para citar ou linkar para este item:
Título: Shifted Gradient Similarity: A perceptual video quality assessment index for adaptive streaming encoding
Autor(es): MONTEIRO, Estêvão Chaves
Palavras-chave: Processamento de imagens; Qualidade visual; Compressão de vídeo.
Data do documento: 4-Mar-2016
Editor: Universidade Federal de Pernambuco
Resumo: Adaptive video streaming has become prominent due to the rising diversity of Web-enabled personal devices and the popularity of social networks. Common limitations in Internet bandwidth, decoding speed and battery power available in such devices challenge the efficiency of content encoders to preserve visual quality at reduced data rates over a wide range of display resolutions, typically compressing to lower than 1% of the massive raw data rate. Furthermore, the human visual system does not uniformly perceive losses of spatial and temporal information, so a simple physical objective model such as the mean squared error does not correlate well with perceptual quality. Objective assessment and prediction of perceptual quality of visual content has greatly improved in the past decade, but remains an open problem. Among the most relevant psychovisual quality metrics are the many versions of the Structural Similarity (SSIM) index. In this work, several of the most efficient SSIM-based metrics, such as the Multi-Scale Fast SSIM and the Gradient Magnitude Similarity Deviation (GMSD), are decomposed into their component techniques and reassembled in order to measure and understand the contribution of each technique and to develop improvements in quality and efficiency. The metrics are applied to the LIVE Mobile Video Quality and TID2008 databases and the results are correlated to the subjective data included in the databases in the form of mean opinion scores (MOS), so each metric’s degree of correlation indicates its ability to predict perceptual quality. Additionally, the metrics’ applicability to the recent, relevant psychovisal rate-distortion optimization (Psy-RDO) implementation in the x264 encoder, which currently lacks an ideal objective assessment metric, is investigated as well. The “Shifted Gradient Similarity” (SG-Sim) index is proposed with an improved feature enhancement by avoiding a common unintended loss of analysis information in SSIM-based indexes, and achieving considerably higher MOS correlation than the existing metrics investigated in this work. More efficient spatial pooling filters are proposed, as well: the decomposed 1-D integer Gaussian filter limited to two standard deviations, and the downsampling Box filter based on the integral image, which retain respectively 99% and 98% equivalence and achieve speed gains of respectively 68% and 382%. In addition, the downsampling filter also enables broader scalability, particularly for Ultra High Definition content, and defines the “Fast SG-Sim” index version. Furthermore, SG-Sim is found to improve correlation with Psy-RDO, as an ideal encoding quality metric for x264. Finally, the algorithms and experiments used in this work are implemented in the “Video Quality Assessment in Java” (jVQA) software, based on the AviSynth and FFmpeg platforms, and designed for customization and extensibility, supporting 4K Ultra-HD content and available as free, open source code.
Aparece na(s) coleção(ções):Dissertações de Mestrado - Ciência da Computação

Este arquivo é protegido por direitos autorais

Este item está licenciada sob uma Licença Creative Commons Creative Commons