The population initialization affects the performance of subgroup discovery evolutionary algorithms in high dimensional datasets

TORREÃO, Vítor de Albuquerque

Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/34516


Title :	The population initialization affects the performance of subgroup discovery evolutionary algorithms in high dimensional datasets
Author :	TORREÃO, Vítor de Albuquerque
Keywords :	Artificial intelligence; Machine learning; Data mining
Publication date :	15-Mar-2019
Publisher :	Universidade Federal de Pernambuco
Abstract :	Knowledge Discovery in Databases (KDD) is a broad area of Artificial Intelligence concerned with extracting useful information and insights from a given dataset. Among the various extraction methodologies, Subgroup Discovery (SD) is an important subclass of KDD tasks concerned with discovering interesting subsets of the data. Many Evolutionary Algorithms (EAs) have been proposed to solve the Subgroup Discovery task, with considerable success on low-dimensional datasets. Some of these, however, have been shown to perform poorly on high-dimensional problems. The currently best-performing Evolutionary Algorithm for Subgroup Discovery in high-dimensional datasets, SSDP, has a peculiar way of initializing its populations: it limits the individuals to the smallest possible size. As with most population-based techniques, the outcome of an Evolutionary Algorithm usually depends on the initial set of solutions, which are typically generated at random. The impact of choosing one initialization technique over another on the final solution has been the topic of many published works in the broad area of evolutionary computation. Despite this, there is still a lack of studies that address this topic in the specific scenario of Subgroup Discovery tasks, especially for high-dimensional datasets. The ultimate goal of this research project is to evaluate the impact of initial population generation on the end result of the Evolutionary Algorithm used to solve a Subgroup Discovery task in high-dimensional data. Specifically, we provide new initialization methods, designed for the specific characteristics of Subgroup Discovery tasks, which can be used in virtually any EA. Our experiments show that, just by changing the initialization method, state-of-the-art Evolutionary Algorithms achieve better performance on high-dimensional datasets.
URI :	https://repositorio.ufpe.br/handle/123456789/34516
Appears in collections:	Dissertações de Mestrado - Ciência da Computação
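
The abstract contrasts initializing every individual at the smallest possible size with the more typical random generation of initial solutions. The difference can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the representation of an individual as a set of (attribute, value) items, the function names `all_items`, `init_smallest`, and `init_random_size`, and the toy data. It is not the SSDP implementation, nor any of the initialization methods proposed in the dissertation.

```python
import random
from typing import FrozenSet, List, Tuple

Item = Tuple[int, int]          # (attribute index, attribute value); hypothetical encoding
Individual = FrozenSet[Item]    # a conjunction of items describing one subgroup


def all_items(rows: List[List[int]]) -> List[Item]:
    """Collect every distinct (attribute, value) pair seen in the data."""
    return sorted({(j, v) for row in rows for j, v in enumerate(row)})


def init_smallest(items: List[Item], pop_size: int,
                  rng: random.Random) -> List[Individual]:
    """Smallest-size initialization: every individual holds exactly one item."""
    return [frozenset([rng.choice(items)]) for _ in range(pop_size)]


def init_random_size(items: List[Item], pop_size: int, max_size: int,
                     rng: random.Random) -> List[Individual]:
    """Random baseline: 1..max_size items, at most one item per attribute."""
    population = []
    for _ in range(pop_size):
        size = rng.randint(1, max_size)
        chosen = {}
        # Walk the items in random order, keeping the first value per attribute.
        for attr, value in rng.sample(items, len(items)):
            chosen.setdefault(attr, value)
            if len(chosen) == size:
                break
        population.append(frozenset(chosen.items()))
    return population


def covers(individual: Individual, row: List[int]) -> bool:
    """A row belongs to the subgroup if it matches every item."""
    return all(row[attr] == value for attr, value in individual)


# Toy dataset with three binary attributes (illustrative only).
rows = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
rng = random.Random(0)
items = all_items(rows)
small_pop = init_smallest(items, pop_size=5, rng=rng)
mixed_pop = init_random_size(items, pop_size=5, max_size=3, rng=rng)
```

In this sketch a one-item individual imposes a single condition, so it tends to cover more rows than a longer conjunction. When the number of attributes is large, long random conjunctions may cover few or no rows, which is one plausible reading of why individual size at initialization could matter in high dimensional data.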

Files in this item:

File	Description	Size	Format
DISSERTAÇÃO Vitor de Albuquerque Torreão.pdf		823.37 kB	Adobe PDF

This item is protected by original copyright


This item is licensed under a Creative Commons License