An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning

Peer-to-peer (P2P) lending has grown significantly worldwide over the past few years. However, this rapid rise brings several challenges for default prediction: imbalanced datasets, which make machine learning difficult, an excessive number of features, and low-performing classification algorithms. Machine learning models also face a further challenge, the black-box problem. To address these challenges, an innovative approach was developed. First, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address class imbalance in the Bondora dataset. Multiple feature selection techniques were then implemented: Chi-Square (filter), Sequential Backward Selection (SBS) (wrapper), and embedded methods such as Random Forest (RF), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). A range of classifiers, linear (Logistic Regression (LR)), non-linear (Support Vector Machine (SVM), Naive Bayes (NB)), and tree-based (Decision Tree (DT), RF, Adaptive Boosting (AdaBoost), CatBoost), were then used to predict loan defaults. The top-performing models were integrated into various stacking ensembles using GBM, Extreme Gradient Boosting (XGBoost), and LightGBM as meta-learners to enhance predictive accuracy. The results showed that LightGBM performed best, with accuracy, F-score, and Area Under the Curve (AUC) values of 0.981, 0.980, and 0.994, respectively, exceeding the performance reported in the literature. Explainable models were employed to interpret predictions and enhance user trust. Specifically, the LightGBM stacking model was combined with the Local Interpretable Model-agnostic Explanations (LIME) framework to provide interpretable insights into its prediction results.
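The abstract names SMOTE as the first step but does not show how it works. The block below is a minimal sketch of the core SMOTE idea, which creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority-class neighbours. It is not the authors' implementation: the function name `smote`, its parameters, and the toy loan data are all hypothetical, and the data is not drawn from the Bondora dataset.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment
        # between the base sample and the chosen neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features per loan,
# 8 repaid loans (majority) versus 3 defaulted loans (minority).
repaid = [[0.1 * i, 1.0 + 0.05 * i] for i in range(8)]
defaulted = [[2.0, 0.2], [2.2, 0.1], [2.4, 0.3]]

# Oversample the minority class until both classes have the same size.
new_defaults = smote(defaulted, n_synthetic=len(repaid) - len(defaulted), k=2)
balanced_defaulted = defaulted + new_defaults
```

In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used instead of hand-written code. Oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data. Whether the paper follows that convention is not stated in the abstract.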

Description

SJR 2025: 1.067 (Q1); H-Index: 166; Subject Area and Category: Computer Science (Artificial Intelligence, Software)

Keywords

Explainable machine learning models, Feature selection, Loan default risk, P2P lending, Prediction algorithms, Stacking models

Citation

Atef, M., Gabr, M. I., Seoud, W., & Ouf, S. (2026). An innovative approach for predicting default risk in peer-to-peer lending using stacking ensemble models with explainable machine learning. Neural Computing and Applications, 38(12). https://doi.org/10.1007/s00521-026-12226-5

URI

https://repository.msa.edu.eg/handle/123456789/6787

Collections

Faculty Of Management Sciences Research Paper
