New internal index for clustering validation based on graphs
Date
2017
Abstract
This paper presents two versions of a new internal index for clustering validation based on graphs. These graphs capture the structural characteristics of each cluster. As a result, the new index overcomes the limitations of traditional indices based on statistical measurements and is effective on clusters of different shapes and sizes. The graphs are generated through an iterative process based on principal component analysis, which partitions each cluster into a configurable number of "sub-clusters". A minimum spanning tree is then built on the centroids of these sub-clusters and used to estimate both the quality of the clusters and the distances between them. In particular, the quality of a cluster is defined in this paper as the level of "cohesion" among its sub-clusters. The two versions of the proposed index differ in how this level of "cohesion" is measured. Finally, the performance of the two versions is compared with that of a selected group of well-known internal indices. In these tests, both versions show a superior capacity to deal with datasets that present different configurations of variances, densities, geometries and levels of noise.
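The graph construction described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how each PCA-based split is made or which sub-cluster is split next, so the rule used here (bisect the sub-cluster with the most points at the mean of its projection onto its first principal axis) is an assumption, as are the function names `principal_axis`, `split_into_subclusters` and `centroid_mst`. The "cohesion" measures that distinguish the two index versions are not specified in the abstract and are not reproduced.

```python
import numpy as np


def principal_axis(points):
    """Return the unit vector of the first principal component of an (n, d) array."""
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / max(len(points) - 1, 1)
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    return eigvecs[:, -1]


def split_into_subclusters(points, n_sub):
    """Partition one cluster into up to n_sub sub-clusters.

    Assumed rule (not taken from the paper): repeatedly bisect the
    sub-cluster with the most points at the mean of its projection
    onto its first principal axis.
    """
    parts = [points]
    while len(parts) < n_sub:
        idx = max(range(len(parts)), key=lambda i: len(parts[i]))
        part = parts.pop(idx)
        if len(part) < 2:
            parts.append(part)
            break
        proj = (part - part.mean(axis=0)) @ principal_axis(part)
        left, right = part[proj <= 0], part[proj > 0]
        if len(left) == 0 or len(right) == 0:  # degenerate data, cannot split
            parts.append(part)
            break
        parts += [left, right]
    return parts


def centroid_mst(centroids):
    """Prim's algorithm on the complete Euclidean graph over the centroids.

    Returns a list of (i, j, length) edges forming a minimum spanning tree.
    """
    n = len(centroids)
    dist = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or dist[i, j] < best[2]):
                    best = (i, j, float(dist[i, j]))
        edges.append(best)
        in_tree.append(best[1])
    return edges


# Hypothetical elongated cluster, used only to exercise the functions.
rng = np.random.default_rng(0)
cluster = rng.normal(size=(200, 2)) * np.array([5.0, 0.5])
parts = split_into_subclusters(cluster, 4)
centroids = np.array([p.mean(axis=0) for p in parts])
edges = centroid_mst(centroids)
for i, j, length in edges:
    print(f"sub-cluster {i} -- sub-cluster {j}: {length:.3f}")
```

The resulting tree edges connect sub-cluster centroids within a single cluster. An index of the kind the abstract describes would derive a cohesion score from such a graph and combine it with distances between clusters, but those formulas would have to be taken from the paper itself.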
Source
Expert Systems with Applications, 86, 334-349
DOI
doi.org/10.1016/j.eswa.2017.06.003
Except where otherwise noted, this publication is licensed under Attribution-NonCommercial-NoDerivs 3.0 Chile.