An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning

dc.Affiliation October University for modern sciences and Arts MSA
dc.contributor.author Markus Atef
dc.contributor.author Menna Ibrahim Gabr
dc.contributor.author Wafaa Seoud
dc.contributor.author Shimaa Ouf
dc.date.accessioned 2026-06-26T09:44:20Z
dc.date.issued 2026-06-16
dc.description SJR 2025: 1.067 (Q1); H-Index: 166; Subject Area and Category: Computer Science (Artificial Intelligence, Software)
dc.description.abstract Peer-to-peer (P2P) lending has grown significantly worldwide over the past few years. However, several challenges are associated with P2P lending’s rapid rise. The major challenges are imbalanced datasets, which make machine learning difficult, an excessive number of features, and low-performing classification algorithms. Furthermore, machine learning models face another complex challenge, referred to as the black-box problem. To address these challenges, an innovative approach was developed. First, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance in the Bondora dataset. Next, multiple feature selection techniques were implemented: Chi-Square (filter), Sequential Backward Selection (SBS) (wrapper), and embedded methods such as Random Forest (RF), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). A range of classifiers was then used to predict loan defaults: linear (Logistic Regression (LR)), non-linear (Support Vector Machine (SVM), Naive Bayes (NB)), and tree-based (Decision Tree (DT), RF, Adaptive Boosting (AdaBoost), CatBoost). The top-performing models were integrated into various stacking ensembles using GBM, Extreme Gradient Boosting (XGBoost), and LightGBM as meta-learners to enhance predictive accuracy. The results showed that LightGBM achieved outstanding performance, with accuracy, F-score, and Area Under the Curve (AUC) values of 0.981, 0.980, and 0.994, respectively, exceeding the performance reported in the literature. Explainable models were employed to interpret predictions and enhance user trust. Specifically, the LightGBM stacking model was combined with the Local Interpretable Model-agnostic Explanations (LIME) framework to provide interpretable insights into its prediction results.
dc.description.uri https://www.scimagojr.com/journalsearch.php?q=24800&tip=sid&clean=0
dc.identifier.citation Atef, M., Gabr, M. I., Seoud, W., & Ouf, S. (2026). An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning. Neural Computing and Applications, 38(12). https://doi.org/10.1007/s00521-026-12226-5
dc.identifier.doi https://doi.org/10.1007/s00521-026-12226-5
dc.identifier.other https://doi.org/10.1007/s00521-026-12226-5
dc.identifier.uri https://repository.msa.edu.eg/handle/123456789/6787
dc.language.iso en_US
dc.publisher Springer London
dc.relation.ispartofseries Neural Computing and Applications; Volume 38, Issue 12, Article number 504
dc.subject Explainable machine learning models
dc.subject Feature selection
dc.subject Loan default risk
dc.subject P2P lending
dc.subject Prediction algorithms
dc.subject Stacking models
dc.title An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning
dc.type Article
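
The abstract names SMOTE as the first step of the pipeline. As a rough illustration of the idea only, and not the authors' implementation, the sketch below generates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy data are hypothetical; in practice a maintained implementation such as the one in the imbalanced-learn package would likely be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from minority-class samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Toy minority class (hypothetical values, not taken from the Bondora dataset).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
synthetic = smote(minority, n_new=6, k=2)
print(len(synthetic), "synthetic samples generated")
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class, which is what distinguishes SMOTE from simply duplicating rows.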

Files

Original bundle

Name: s00521-026-12226-5.pdf
Size: 3.89 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 51 B
Format: Item-specific license agreed upon to submission