An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning
Publisher
Springer London
Series Info
Neural Computing and Applications; Volume 38, Issue 12, Article number 504
Abstract
Peer-to-peer (P2P) lending has grown significantly worldwide over the past few years. However, its rapid rise has brought several challenges. The major ones are imbalanced datasets, which make machine learning difficult, an excessive number of features, and low-performing classification algorithms. Furthermore, machine learning models face another complex challenge known as the black-box problem. To address these challenges, an innovative approach was developed. First, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance in the Bondora dataset. Next, multiple feature selection techniques were implemented: Chi-Square (filter), Sequential Backward Selection (SBS) (wrapper), and embedded methods such as Random Forest (RF), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). A range of classifiers, linear (Logistic Regression (LR)), non-linear (Support Vector Machine (SVM), Naive Bayes (NB)), and tree-based (Decision Tree (DT), RF, Adaptive Boosting (AdaBoost), CatBoost), was then used to predict loan defaults. The top-performing models were integrated into various stacking ensembles using GBM, Extreme Gradient Boosting (XGBoost), and LightGBM as meta-learners to enhance predictive accuracy. The results showed that LightGBM achieved outstanding performance, with accuracy, F-score, and Area Under the Curve (AUC) values of 0.981, 0.980, and 0.994, respectively, exceeding the performance reported in the literature. Explainable models were employed to interpret predictions and enhance user trust. Specifically, the LightGBM stacking model was combined with the Local Interpretable Model-agnostic Explanations (LIME) framework to provide interpretable insights into its prediction results.
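The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority-class neighbours. This is not the authors' implementation, which the abstract does not describe. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only. In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would normally be used.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data, made up for illustration: two features, 4 defaults vs 10 repaid loans.
defaults = [[0.9, 0.8], [1.0, 0.7], [0.8, 0.9], [1.1, 1.0]]
repaid_count = 10
new_points = smote(defaults, n_synthetic=repaid_count - len(defaults), k=3)
balanced_defaults = defaults + new_points
print(len(balanced_defaults), "minority samples after oversampling")
```

Because every synthetic point is an interpolation between two real minority samples, the new points stay inside the region already occupied by the minority class rather than duplicating existing rows.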
Description
SJR 2025: 1.067 (Q1)
H-Index: 166
Subject Area and Category:
Computer Science
Artificial Intelligence
Software
Citation
Atef, M., Gabr, M. I., Seoud, W., & Ouf, S. (2026). An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning. Neural Computing and Applications, 38(12), Article 504. https://doi.org/10.1007/s00521-026-12226-5
