Using structured and unstructured data for product price prediction

CARVALHO, Giovanni Paolo Santos de

Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/39488

Title:	Using structured and unstructured data for product price prediction
Authors:	CARVALHO, Giovanni Paolo Santos de
Keywords:	Computational intelligence; Optimization
Issue Date:	17-Jan-2020
Publisher:	Universidade Federal de Pernambuco
Citation:	CARVALHO, Giovanni Paolo Santos de. Using structured and unstructured data for product price prediction. 2020. Dissertação (Mestrado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2020.
Abstract:	Product price estimation is a relatively new trend in e-commerce that helps customers decide whether to buy or sell a product by giving them a starting point for what a fair price could be. In this work, we focus on predicting prices from online product offers. These offers usually include a natural-language description of the product (unstructured data) and a specification composed of the product’s properties (structured data). In this dissertation, we aim to predict the price of a product offer from both kinds of information. To that end, we propose an attention-based network that processes the structured data on its own, models the interaction between the structured and unstructured data, and combines the two to perform the prediction. For the structured information, we apply a regular fully-connected network; to model the interaction between the product’s properties and its description, we employ a co-attention network. These networks are combined and used by a neural network regressor to learn a vector representation of the product offer. This vector can then be used as a feature set by any regressor to perform product price prediction. The architecture is designed to work with general structured and unstructured product offer data; in this study, it is evaluated on a car price prediction task, for which we collected a dataset by scraping 11 sources of car classifieds. Our experimental evaluation shows that: (1) regressors using the learned embedding obtained the best results, improving on their performance with raw features in almost all scenarios; and (2) simple linear models such as Linear Regression, when using the learned embedding, achieved results comparable to more competitive algorithms such as LightGBM.
URI:	https://repositorio.ufpe.br/handle/123456789/39488
Appears in Collections:	Master's Dissertations - Computer Science
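
The data flow described in the abstract (a fully-connected branch for the structured properties, a co-attention branch relating properties to description tokens, and a concatenated offer vector handed to a separate regressor) can be sketched roughly as follows. This is a minimal illustration, not the dissertation's implementation: the max-pooled affinity form of co-attention, all dimensions, the parameter names (`w_fc`, `b_fc`, `w_co`), and the random, untrained weights and toy prices are assumptions made only to show the shapes involved.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(props, tokens, w):
    """Assumed co-attention form: props (n_props, d), tokens (n_tokens, d), w (d, d)."""
    affinity = np.tanh(props @ w @ tokens.T)       # (n_props, n_tokens)
    prop_weights = softmax(affinity.max(axis=1))   # attention over properties
    token_weights = softmax(affinity.max(axis=0))  # attention over description tokens
    return prop_weights @ props, token_weights @ tokens  # each of shape (d,)

def offer_embedding(structured, props, tokens, params):
    """Concatenate the fully-connected branch with both attended vectors."""
    hidden = np.maximum(0.0, structured @ params["w_fc"] + params["b_fc"])  # ReLU layer
    attended_props, attended_tokens = co_attention(props, tokens, params["w_co"])
    return np.concatenate([hidden, attended_props, attended_tokens])

# Hypothetical sizes and random (untrained) parameters, for illustration only.
rng = np.random.default_rng(0)
d, n_feats, n_hidden = 4, 3, 5
params = {
    "w_fc": rng.normal(size=(n_feats, n_hidden)),
    "b_fc": np.zeros(n_hidden),
    "w_co": rng.normal(size=(d, d)),
}

def random_offer():
    """One toy offer: 3 structured features, 6 property vectors, 10 token vectors."""
    return rng.normal(size=n_feats), rng.normal(size=(6, d)), rng.normal(size=(10, d))

embedding = offer_embedding(*random_offer(), params)
print(embedding.shape)  # (13,) = n_hidden + 2 * d

# Downstream step: any regressor can consume the offer vectors as features.
# Here, ordinary least squares on 20 toy offers with made-up prices.
features = np.stack([offer_embedding(*random_offer(), params) for _ in range(20)])
design = np.hstack([features, np.ones((20, 1))])  # add a bias column
prices = rng.uniform(5_000, 50_000, size=20)      # fabricated toy targets
coef, *_ = np.linalg.lstsq(design, prices, rcond=None)
```

In the dissertation the embedding is learned by training the network with a neural regressor head; the sketch skips training entirely and only shows how a fixed-length offer vector could be assembled and then passed to a simple linear model.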

Files in This Item:

File	Description	Size	Format
DISSERTAÇÃO Giovanni Paolo Santos de Carvalho.pdf		5.42 MB	Adobe PDF

This item is protected by original copyright

This item is licensed under a Creative Commons License