dc.contributor.author: Palacios Rojas, Carlos
dc.contributor.author: Reyes-Suárez, José A.
dc.contributor.author: Bearzotti, Lorena A.
dc.contributor.author: Leiva, Víctor
dc.contributor.author: Marchant-Fuentes, Carolina
dc.date.accessioned: 2021-12-21T13:11:59Z
dc.date.available: 2021-12-21T13:11:59Z
dc.date.issued: 2021
dc.identifier.uri: http://repositorio.ucm.cl/handle/ucm/3649
dc.description.abstract: Data mining is employed to extract useful information and to detect patterns from often large data sets, closely related to knowledge discovery in databases and data science. In this investigation, we formulate models based on machine learning algorithms to extract relevant information predicting student retention at various levels, using higher education data and specifying the relevant variables involved in the modeling. Then, we utilize this information to help the process of knowledge discovery. We predict student retention at each of three levels during their first, second, and third years of study, obtaining models with an accuracy that exceeds 80% in all scenarios. These models allow us to adequately predict the level when dropout occurs. Among the machine learning algorithms used in this work are: decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which the random forest technique performs the best. We detect that secondary educational score and the community poverty index are important predictive variables, which have not been previously reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher education institutions around the world with similar conditions to the Chilean case, where dropout rates affect the efficiency of such institutions. Having the ability to predict dropout based on student’s data enables these institutions to take preventative measures, avoiding the dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms. (es_CL)
dc.language.iso: en (es_CL)
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Chile (*)
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/cl/ (*)
dc.source: Entropy, 23(4), 485 (es_CL)
dc.subject: Data analytics (es_CL)
dc.subject: Databases (es_CL)
dc.subject: Data science (es_CL)
dc.subject: Friedman test (es_CL)
dc.subject: Socioeconomic index (es_CL)
dc.subject: University dropout (es_CL)
dc.title: Knowledge discovery for higher education student retention based on data mining: Machine learning algorithms and case study in Chile (es_CL)
dc.type: Article (es_CL)
dc.ucm.facultad: Facultad de Ciencias de la Ingeniería (es_CL)
dc.ucm.indexacion: Scopus (es_CL)
dc.ucm.indexacion: Isi (es_CL)
dc.ucm.uri: www.mdpi.com/1099-4300/23/4/485 (es_CL)
dc.ucm.doi: doi.org/10.3390/e23040485 (es_CL)


Except where otherwise noted, this publication is licensed under Attribution-NonCommercial-NoDerivs 3.0 Chile.