Personal bankruptcy prediction using machine learning techniques

Magdalena Brygała; Tomasz Korol

doi:10.18559/ebr.2024.2.1149

Authors

Magdalena Brygała, Faculty of Management and Economics, Gdansk University of Technology, Gdańsk, Poland, https://orcid.org/0000-0003-3222-1046
Tomasz Korol, Faculty of Management and Economics, Gdansk University of Technology, Gdańsk, Poland, https://orcid.org/0000-0002-7623-3404

DOI:

https://doi.org/10.18559/ebr.2024.2.1149

Keywords:

personal bankruptcy, random forest, XGBoost, LightGBM, AdaBoost, CatBoost, support vector machines, household finance, SHAP

Abstract

Early prediction models that give users a reliable assessment of consumers' financial situation have become crucial. Recent studies have focused on predicting corporate bankruptcies and credit defaults rather than personal bankruptcies. The present study addresses this gap in the literature by comparing machine learning algorithms for predicting personal bankruptcy. Its main objective is to examine the usefulness of machine learning models such as support vector machines (SVM), random forest, AdaBoost, XGBoost, LightGBM, and CatBoost in forecasting personal bankruptcy. The study relies on two samples of households (learning and testing) drawn from the Survey of Consumer Finances, which was conducted in the United States. Among the estimated models, LightGBM, CatBoost, and XGBoost were the most effective. The most important variables in the models are income, refusal to grant credit, delays in the repayment of liabilities, the revolving debt ratio, and the housing debt ratio.


