Use this identifier to cite or link to this item:
https://repositorio.ufpe.br/handle/123456789/54097
Title: | Batch som algorithms for dissimilarity data |
Author(s): | PALOMINO MARIÑO, Laura María |
Keywords: | Computational intelligence; Self-organizing maps; Batch SOM; Dissimilarity data; Weighted medoids |
Document date: | 1-Sep-2023 |
Publisher: | Universidade Federal de Pernambuco |
Citation: | PALOMINO MARIÑO, Laura María. Batch som algorithms for dissimilarity data. 2023. Thesis (Doctorate in Computer Science) – Universidade Federal de Pernambuco, Recife, 2023. |
Abstract: | Self-Organizing Maps (SOM) are unsupervised neural network methods that offer both clustering and visualization properties. The SOM algorithm was originally defined for numerical data. Complex data, however, require analysis and treatment consistent with their structure. Some kinds of data, such as DNA sequences, are known only through relational measures of resemblance or dissemblance. Despite their usefulness, relatively few SOM models can currently handle relational data. This research proposes four new families of batch SOM algorithms for relational data represented by one dissimilarity matrix (single-view) or several dissimilarity matrices (multi-view). The algorithms are designed to produce a crisp partition and to preserve the topological properties of the data on the map. The families implement four cluster representation approaches. In the first family, the cluster representatives are vectors of weights whose components measure how strongly each object is weighted as a medoid in a given cluster. In the second family, each cluster representative is a normalized linear combination of the objects represented in the description space. In the third family, the cluster representative is a set of weighted objects whose cardinality is fixed. Finally, in the fourth family, the representative is a vector of weighted objects selected according to their relevance to the cluster. Additionally, the multi-view methods are designed to learn the weight of each dissimilarity matrix. This weight represents the relevance of the matrix and is computed either locally for each cluster or globally for the whole partition. All the proposed algorithms were compared with the most closely related benchmark methods able to handle one or several dissimilarity matrices. 
Experiments on 12 single-view and 14 multi-view datasets were performed with similar parametrization and evaluated using the Normalized Mutual Information (NMI), Topographical Error (TE), and Silhouette Coefficient (SIL) metrics. In most cases, the fourth family of algorithms performed best in terms of NMI and SIL, whereas the second family performed best in terms of TE. The statistical significance of the experimental results was assessed using the non-parametric Friedman test and the Nemenyi post-hoc test. The experiments on the multi-view datasets showed the importance of taking the relevance weights of the dissimilarity matrices into account. |
URI: | https://repositorio.ufpe.br/handle/123456789/54097 |
Appears in collections: | Doctoral Theses - Computer Science |
Files associated with this item:
File | Description | Size | Format | |
---|---|---|---|---|
TESE Laura María Palomino Mariño.pdf | Item embargoed until 2025-11-29 | 4.28 MB | Adobe PDF | View/Open (item embargoed) |
This file is protected by copyright |
This item is licensed under a Creative Commons License