Abstract
Classifying medical data is one of the most challenging tasks in medical informatics, because most medical datasets are incomplete, nonlinear, and complex. Researchers have nonetheless turned to these datasets for analysis and prediction, aiming to support early disease detection and to protect people from potentially fatal diseases. Machine learning techniques have therefore been widely employed to build digital decision support systems, typically using a specific set of related medical datasets. In this paper, we propose a generalized predictive model for disease prediction and evaluate it on five different medical datasets covering frequently occurring diseases. The proposed model combines principal component analysis (PCA) with iterative K-means clustering to improve data quality, and uses a random forest (RF) predictor to improve prediction accuracy. Experimental results show that applying PCA improves the performance of iterative K-means clustering and, in turn, the accuracy of the RF predictor on all the examined medical datasets. We also use the grid search cross-validation (Grid-search-CV) hyperparameter optimization technique to select, for each medical dataset, the parameters with which the RF predictor performs best. Experimental analysis shows that the proposed generalized predictive model outperforms existing methods and could thus assist healthcare practitioners in effective decision-making.
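
The pipeline outlined in the abstract (PCA, iterative K-means clustering, then a grid-searched RF predictor) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not say how the clustering improves data quality, so the sketch assumes one plausible reading: training samples whose label disagrees with the majority label of their K-means cluster are dropped, and the step repeats until no more samples are removed. The helper names `iterative_kmeans_filter` and `run_pipeline`, the number of principal components, the parameter grid, and the use of scikit-learn's bundled breast cancer dataset as a stand-in are all hypothetical choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler


def iterative_kmeans_filter(X, y, n_clusters=2, max_rounds=5, seed=0):
    """Assumed cleaning rule: repeatedly cluster the kept samples and drop
    those whose label differs from their cluster's majority label."""
    keep = np.ones(len(y), dtype=bool)
    for _ in range(max_rounds):
        idx = np.flatnonzero(keep)
        clusters = KMeans(n_clusters=n_clusters, n_init=10,
                          random_state=seed).fit_predict(X[idx])
        agree = np.zeros(len(idx), dtype=bool)
        for c in np.unique(clusters):
            members = clusters == c
            majority = np.bincount(y[idx][members]).argmax()
            agree |= members & (y[idx] == majority)
        if agree.all():  # nothing left to remove, stop iterating
            break
        keep[idx[~agree]] = False
    return keep


def run_pipeline(seed=0):
    # Stand-in dataset; the paper's five datasets are not named in the abstract.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)

    # Fit scaling and PCA on the training split only, then project both splits.
    scaler = StandardScaler().fit(X_train)
    pca = PCA(n_components=5, random_state=seed).fit(scaler.transform(X_train))
    Z_train = pca.transform(scaler.transform(X_train))
    Z_test = pca.transform(scaler.transform(X_test))

    # Clean the training data in PCA space; the test split is left untouched.
    keep = iterative_kmeans_filter(Z_train, y_train, seed=seed)

    # Small illustrative grid; a real search would cover more values.
    grid = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
        cv=3,
    )
    grid.fit(Z_train[keep], y_train[keep])
    return grid.best_params_, grid.score(Z_test, y_test), int(keep.sum()), len(y_train)


if __name__ == "__main__":
    params, accuracy, kept, total = run_pipeline()
    print(f"kept {kept}/{total} training samples, best params {params}, "
          f"test accuracy {accuracy:.3f}")
```

The filter is applied to the training split only, so the reported accuracy reflects unfiltered held-out data. Whether the authors evaluate this way is not stated in the abstract.
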
| Original language | English |
|---|---|
| Title of host publication | Generative AI in Neurodegenerative Disorders |
| Subtitle of host publication | Innovations, Views, and Obstacles |
| Publisher | River Publishers |
| Pages | 123-146 |
| Number of pages | 24 |
| ISBN (Electronic) | 9788743801740 |
| ISBN (Print) | 9788743801757 |
| Publication status | Published - 5 Jul 2025 |
Bibliographical note
Publisher Copyright: © 2025 River Publishers. All rights reserved.