Personal bankruptcy prediction using machine learning techniques

Authors

M. Brygała, T. Korol

DOI:

https://doi.org/10.18559/ebr.2024.2.1149

Keywords:

personal bankruptcy, random forest, XGBoost, LightGBM, AdaBoost, CatBoost, support vector machines, household finance, SHAP

Abstract

It has become crucial to have an early prediction model that gives users an accurate picture of consumers' financial situation. Recent studies have concentrated on predicting corporate bankruptcy and credit default rather than personal bankruptcy. This study addresses that gap by comparing machine learning algorithms for predicting personal bankruptcy. Its main objective is to assess how useful models such as support vector machines (SVM), random forest, AdaBoost, XGBoost, LightGBM, and CatBoost are in forecasting personal bankruptcy. The study relies on two samples of households (learning and testing) drawn from the Survey of Consumer Finances, conducted in the United States. Among the estimated models, LightGBM, CatBoost, and XGBoost proved the most effective. The most important variables in the models are income, having been refused credit, delays in repaying liabilities, the revolving debt ratio, and the housing debt ratio.
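To make the learning/testing workflow concrete, the following is a minimal sketch of one of the algorithms named above, AdaBoost, built from one-split decision stumps in plain Python. It is not the authors' pipeline and does not use the Survey of Consumer Finances: the data are synthetic, and the feature names (`income`, `debt_ratio`), the labelling rule, the sample sizes, and the number of boosting rounds are all assumptions chosen only for illustration.

```python
import math
import random


def make_toy_households(n, seed):
    """Synthetic households (NOT survey data): two hypothetical scaled features."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        income = rng.random()      # hypothetical scaled income in [0, 1)
        debt_ratio = rng.random()  # hypothetical revolving debt ratio in [0, 1)
        # Made-up rule: label +1 ("bankrupt") when debt clearly outweighs income.
        label = 1 if debt_ratio - income > 0.2 else -1
        rows.append(((income, debt_ratio), label))
    return rows


def stump_predict(stump, x):
    """A stump votes `sign` when the chosen feature exceeds the threshold."""
    feature, threshold, sign = stump
    return sign if x[feature] > threshold else -sign


def best_stump(rows, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(rows[0][0])):
        for threshold in sorted({x[feature] for x, _ in rows}):
            for sign in (1, -1):
                stump = (feature, threshold, sign)
                err = sum(w for (x, y), w in zip(rows, weights)
                          if stump_predict(stump, x) != y)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def fit_adaboost(rows, rounds):
    """Discrete AdaBoost: reweight misclassified rows after each round."""
    n = len(rows)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(rows, weights)
        if err >= 0.5:  # no stump beats chance on the current weights
            break
        err = max(err, 1e-10)  # guard against division by zero
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for (x, y), w in zip(rows, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score > 0 else -1


def accuracy(ensemble, rows):
    return sum(predict(ensemble, x) == y for x, y in rows) / len(rows)


learning = make_toy_households(150, seed=0)  # learning sample
testing = make_toy_households(100, seed=1)   # separate testing sample
model = fit_adaboost(learning, rounds=15)
test_accuracy = accuracy(model, testing)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The point of the sketch is the separation of roles: the model is fitted only on the learning sample and scored only on the testing sample. A comparison of several algorithms would repeat the same fit-and-score step for each model on identical samples.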

Published

2024-06-12

How to Cite

Brygała, M., & Korol, T. (2024). Personal bankruptcy prediction using machine learning techniques. Economics and Business Review, 10(2). https://doi.org/10.18559/ebr.2024.2.1149

Section

Research article, regular issue