On Developing Generic Models for Predicting Student Outcomes in Educational Data Mining

dc.citation.issue: 1
dc.citation.volume: 6
dc.contributor.author: Ramaswami G
dc.contributor.author: Susnjak T
dc.contributor.author: Mathrani A
dc.contributor.editor: Cowling, M
dc.contributor.editor: Jha, M
dc.date.accessioned: 2023-11-20T01:37:45Z
dc.date.available: 2022-01-07
dc.date.available: 2023-11-20T01:37:45Z
dc.date.issued: 2022-01-07
dc.description.abstract: Poor academic performance of students is a concern in the educational sector, especially if it leads to students being unable to meet minimum course requirements. With timely prediction of students’ performance, educators can detect at-risk students and intervene early to help them overcome their learning difficulties. However, the majority of studies have developed individual prediction models that each target a single course. These models are tailored to the specific attributes of each course amongst a very diverse set of possibilities. While this approach can yield accurate models in some instances, it has limitations. In many cases, overfitting can take place when course data is small or when new courses are devised. Additionally, maintaining a large suite of per-course models is a significant overhead. This issue can be tackled by developing a generic, course-agnostic predictive model that captures more abstract patterns and is able to operate across all courses, irrespective of their differences. This study demonstrates how a generic predictive model can be developed that identifies at-risk students across a wide variety of courses. Experiments were conducted using a range of algorithms, with the generic model achieving effective accuracy. The findings showed that the CatBoost algorithm performed best on our dataset across the F-measure, ROC (receiver operating characteristic) curve and AUC scores; it is therefore an excellent candidate algorithm for this domain, given its ability to seamlessly handle categorical and missing data, which are frequently features of educational datasets.
dc.description.confidential: false
dc.edition.edition: Educational Data Mining and Technology
dc.format.extent: 1 - 16 (16)
dc.identifier: 6
dc.identifier.citation: Big Data and Cognitive Computing, 2022, Educational Data Mining and Technology, 6 (1), pp. 1 - 16 (16)
dc.identifier.doi: 10.3390/bdcc6010006
dc.identifier.elements-id: 450272
dc.identifier.harvested: Massey_Dark
dc.identifier.issn: 2504-2289
dc.identifier.uri: https://hdl.handle.net/10179/16831
dc.publisher: MDPI (Basel, Switzerland)
dc.relation.isPartOf: Big Data and Cognitive Computing
dc.relation.uri: https://www.mdpi.com/2504-2289/6/1/6
dc.rights: CC BY
dc.subject: at-risk students
dc.subject: CatBoost
dc.subject: early prediction
dc.subject: educational data mining
dc.subject: machine learning
dc.title: On Developing Generic Models for Predicting Student Outcomes in Educational Data Mining
dc.type: Journal article
massey.relation.uri-description: Published version
pubs.notes: Not known
pubs.organisational-group: /Massey University
pubs.organisational-group: /Massey University/College of Sciences
pubs.organisational-group: /Massey University/College of Sciences/School of Mathematical and Computational Sciences
Files
Original bundle
Name: BDCC-06-00006.pdf
Size: 9.63 MB
Format: Adobe Portable Document Format