Please use this identifier to cite or link to this item:
https://repositorio.ufpe.br/handle/123456789/30523
Title: | A fuzzy partitional clustering algorithm with adaptative euclidean distance and entropy regularization |
Authors: | RIZO RODRÍGUEZ, Sara Inés |
Keywords: | Data mining; Fuzzy clustering |
Issue Date: | 21-Feb-2018 |
Publisher: | Universidade Federal de Pernambuco |
Abstract: | Data clustering is one of the most important topics in data mining and machine learning. Clustering is the task of discovering homogeneous groups among the objects under study. Recently, many researchers have shown significant interest in developing clustering algorithms. The main difficulty in clustering is that no prior knowledge about the given dataset is available. Traditional clustering approaches are designed to search for clusters in the entire space. However, high-dimensional real-world datasets usually contain many dimensions that are irrelevant for clustering, and traditional clustering methods often perform poorly on them. Subspace clustering is an extension of traditional clustering that finds clusters only in the relevant dimensions of a dataset. However, most subspace clustering methods have complicated parameter settings that are troublesome to determine, which makes them difficult to apply in practice. This work proposes a partitioning fuzzy clustering algorithm with entropy regularization and automatic variable selection through an adaptive distance, where the dissimilarity measure is obtained as the sum of the Euclidean distances between objects and prototypes, calculated individually for each variable. The main advantage of the proposed approach over conventional clustering methods is the possibility of using adaptive distances, which change at each iteration of the algorithm. This type of dissimilarity measure is suitable for learning the weights of the variables dynamically during the clustering process, which improves the performance of the algorithm. Another advantage of the proposed approach is the entropy regularization term, which serves as a regulating factor during the minimization process.
The proposed method is an iterative three-step algorithm that provides a fuzzy partition and a representative for each fuzzy cluster. To this end, it minimizes an objective function that includes a multidimensional distance function as the measure of dissimilarity and entropy as the regularization term. Experiments on simulated, real-world, and image data corroborate the usefulness of the proposed algorithm. |
URI: | https://repositorio.ufpe.br/handle/123456789/30523 |
Appears in Collections: | Dissertações de Mestrado - Ciência da Computação |
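
The abstract describes the method only in words, so the following is a minimal sketch of what an entropy-regularized fuzzy clustering loop with per-variable adaptive weights can look like, not the dissertation's actual formulation. Everything specific here is an assumption: the function name `entropy_fuzzy_cluster`, the `temperature` parameter (the weight of the entropy term), the constraint that the variable weights of each cluster multiply to 1, and the three closed-form updates (prototypes, weights, memberships) that follow from that assumed objective.

```python
import numpy as np

def entropy_fuzzy_cluster(X, n_clusters, temperature=1.0, n_iter=50, seed=0):
    """Hypothetical sketch: fuzzy clustering with entropy regularization and
    per-cluster variable weights (assumed constraint: weights multiply to 1)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    eps = 1e-10  # guards against division by zero and log(0)

    # Random initial fuzzy partition; every row sums to 1.
    U = rng.dirichlet(np.ones(n_clusters), size=n)
    W = np.ones((n_clusters, p))

    for _ in range(n_iter):
        # Step 1: prototypes as membership-weighted means, shape (K, p).
        G = (U.T @ X) / (U.sum(axis=0)[:, None] + eps)

        # Squared deviation of each object from each prototype, per variable.
        D2 = (X[:, None, :] - G[None, :, :]) ** 2  # shape (n, K, p)

        # Step 2: variable weights. A variable with low within-cluster
        # dispersion gets a high weight; the product per cluster stays 1.
        log_disp = np.log(np.einsum("ik,ikj->kj", U, D2) + eps)
        W = np.exp(log_disp.mean(axis=1, keepdims=True) - log_disp)

        # Step 3: memberships as a softmax of the negative adaptive
        # distances, which is the minimizer under an entropy penalty.
        dist = np.einsum("kj,ikj->ik", W, D2)  # shape (n, K)
        logits = -dist / temperature
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        U = np.exp(logits)
        U /= U.sum(axis=1, keepdims=True)

    return U, G, W
```

Calling `entropy_fuzzy_cluster(X, n_clusters=2)` on an array `X` of shape `(n_objects, n_variables)` returns the fuzzy partition `U`, the prototypes `G`, and the weights `W`. In this sketch a larger `temperature` yields softer memberships, and no stopping criterion is used beyond the fixed number of iterations.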
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
DISSERTAÇÃO Sara Inés Rizo Rodríguez.pdf | | 7.08 MB | Adobe PDF |
This item is protected by original copyright
This item is licensed under a Creative Commons License