Clustering algorithms with new automatic variables weighting

RIZO RODRÍGUEZ, Sara Inés

Please use this identifier to cite or link to this item: https://repositorio.ufpe.br/handle/123456789/44859

Title:	Clustering algorithms with new automatic variables weighting
Authors:	RIZO RODRÍGUEZ, Sara Inés
Keywords:	Computational intelligence; Clustering
Issue Date:	21-Feb-2022
Publisher:	Universidade Federal de Pernambuco
Citation:	RIZO RODRÍGUEZ, Sara Inés. Clustering algorithms with new automatic variables weighting. 2022. Tese (Doutorado em Ciência da Computação) - Universidade Federal de Pernambuco, Recife, 2022.
Abstract:	Every day, large amounts of information are stored or represented as data for further analysis and management. Data analysis plays an indispensable role in understanding different phenomena. One of the vital means of handling these data is to classify or group them into a set of categories or clusters. Clustering, or cluster analysis, aims to divide a collection of data items into clusters given a measure of similarity. Clustering has been used in various fields, such as image processing, data mining, pattern recognition, and statistical analysis. Usually, clustering methods deal with objects described by real-valued variables. Nevertheless, this representation is too restrictive for complex data, such as lists, histograms, or even intervals. Furthermore, in some problems, many dimensions are irrelevant and can mask existing clusters; for example, groups may exist in different subsets of features. This work focuses on the cluster analysis of data points described by both real-valued and interval-valued variables. To this end, new clustering algorithms are proposed that take the correlation and relevance of variables into account to improve performance. In the case of interval-valued data, two settings are considered: the boundaries of the interval-valued variables have either the same or different importance for the clustering process. Since regularization-based methods are robust to initialization, the proposed approaches introduce a regularization term for controlling the membership degree of the objects. Such regularizations are popular due to their high performance in large-scale data clustering and low computational complexity. The resulting three-step iterative algorithms provide a fuzzy partition, a representative for each cluster, and the relevance weights of the variables or their correlation by minimizing a suitable objective function. Experiments on synthetic and real datasets corroborate the robustness and usefulness of the proposed clustering methods.
URI:	https://repositorio.ufpe.br/handle/123456789/44859
Appears in Collections:	Teses de Doutorado - Ciência da Computação
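
Illustrative sketch (not taken from the thesis): the abstract describes three-step iterative algorithms that alternate between cluster representatives, variable relevance weights, and a regularized fuzzy partition. The Python code below shows one generic way such a loop can look for real-valued data only. It assumes an entropy regularization term with parameter `t_u`, squared Euclidean distances, and a single global weight vector whose product is constrained to 1. These choices, the function name `weighted_entropy_fcm`, and all parameter names are assumptions made for illustration; the thesis's actual objective functions, its interval-valued variants, and its correlation-based variants may differ.

```python
import numpy as np


def weighted_entropy_fcm(X, n_clusters, t_u=1.0, n_iter=50, seed=0):
    """Hypothetical sketch: fuzzy clustering with entropy regularization
    and automatically learned global variable weights (product equal to 1)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Initialisation: random objects as prototypes, equal variable weights.
    G = X[rng.choice(n, size=n_clusters, replace=False)].astype(float)
    w = np.ones(p)

    def memberships(G, w):
        # Weighted squared distances between objects and prototypes, shape (n, c).
        d = ((X[:, None, :] - G[None, :, :]) ** 2) @ w
        d -= d.min(axis=1, keepdims=True)  # shift per row for numerical stability
        U = np.exp(-d / t_u)               # closed form under entropy regularization
        return U / U.sum(axis=1, keepdims=True)

    U = memberships(G, w)
    for _ in range(n_iter):
        # Step 1: representatives as membership-weighted means.
        G = (U.T @ X) / (U.sum(axis=0)[:, None] + 1e-12)
        # Step 2: relevance weights; variables with low within-cluster
        # dispersion receive larger weights, and the weights multiply to 1.
        sq = (X[:, None, :] - G[None, :, :]) ** 2
        D = np.einsum("ik,ikj->j", U, sq) + 1e-12
        w = np.exp(np.mean(np.log(D))) / D
        # Step 3: fuzzy partition given the new prototypes and weights.
        U = memberships(G, w)
    return U, G, w


# Toy data (assumed for the example): two groups separated on variable 0,
# with variable 1 carrying only noise.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal([0.0, 0.0], [0.3, 1.0], size=(50, 2)),
    rng.normal([10.0, 0.0], [0.3, 1.0], size=(50, 2)),
])
U, G, w = weighted_entropy_fcm(X, n_clusters=2)
print(U.shape, G.shape, w)
```

In this sketch, larger values of `t_u` give a fuzzier partition and smaller values give a crisper one. The result can still depend on the random initial prototypes, so several restarts with different seeds are a reasonable precaution.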

Files in This Item:

File	Description	Size	Format
TESE Sara Inés Rizo Rodríguez.pdf		4.74 MB	Adobe PDF

This item is protected by original copyright

This item is licensed under a Creative Commons License