An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning

Publisher

Springer London

Series Info

Neural Computing and Applications; Volume 38, Issue 12, Article number 504

Abstract

Peer-to-peer (P2P) lending has grown significantly worldwide over the past few years. However, this rapid rise brings several challenges: imbalanced datasets, which make machine learning difficult; an excessive number of features; and low-performing classification algorithms. Machine learning models also face a further complex challenge, the black-box problem. To address these challenges, an innovative approach was developed. First, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance in the Bondora dataset. Multiple feature selection techniques were then implemented: Chi-Square (filter), Sequential Backward Selection (SBS) (wrapper), and embedded methods such as Random Forest (RF), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). A range of classifiers was then used to predict loan defaults: linear (Logistic Regression (LR)), non-linear (Support Vector Machine (SVM) and Naive Bayes (NB)), and tree-based (Decision Tree (DT), RF, Adaptive Boosting (AdaBoost), and CatBoost). The top-performing models were integrated into various stacking ensembles using GBM, Extreme Gradient Boosting (XGBoost), and LightGBM as meta-learners to enhance predictive accuracy. The results showed that LightGBM achieved outstanding performance, with accuracy, F-score, and Area Under the Curve (AUC) values of 0.981, 0.980, and 0.994, respectively, exceeding the performance reported in the literature. Explainable models were employed to interpret predictions and enhance user trust. Specifically, the LightGBM stacking model was combined with the Local Interpretable Model-agnostic Explanations (LIME) framework to provide interpretable insights into its prediction results.
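
The SMOTE step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, and the function name `smote_oversample`, its parameters, and the toy data are assumptions made for illustration. It shows the core idea of SMOTE, which is to create each synthetic minority sample by interpolating between a minority sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical, not from the Bondora dataset): two features per loan.
defaults = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_oversample(defaults, n_synthetic=6, k=2, seed=42)
```

In practice, oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data; whether the paper follows this protocol is not stated in the abstract.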

Description

SJR 2025: 1.067 (Q1); H-Index: 166; Subject Area and Category: Computer Science (Artificial Intelligence, Software)

Citation

Atef, M., Gabr, M. I., Seoud, W., & Ouf, S. (2026). An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning. Neural Computing and Applications, 38(12). https://doi.org/10.1007/s00521-026-12226-5
