
Modeling and predicting failure in US credit unions

Journal: International Journal of Forecasting

Date: 2025

Authors: Qiao (Olivia) Peng, Donal McKillop, Barry Quinn, Kailong Liu

Abstract:
This study presents a random forest (RF)-based machine learning model to predict the liquidation of US credit unions one year in advance. The model demonstrates impressive accuracy on the test set (97.9% accuracy, with 2.0% false negatives and 8.8% false positives) when utilizing all 44 factors. Simplifying the model to only the top five factors based on feature importance analysis results in a slightly lower, but still significant, accuracy on the test set (92.2% accuracy, with 7.8% false negatives and 17.6% false positives). Comparisons with seven other classification methods verify the superiority of the RF model. This study also uses the Cox proportional-hazards model and Shapley value-based approaches to interpret key feature significance and interactions. The model provides regulators and credit unions with a valuable early warning system for potential failures, enabling corrective measures or strategic mergers to ultimately protect the National Credit Union Share Insurance Fund.


Background and Context

Research Focus

This study develops a machine learning model using random forest (RF) techniques to predict credit union failures one year in advance, aiming to provide an early warning system for regulators and stakeholders.

Industry Context

Although assets ($1.84 trillion) and memberships (124.3 million) had grown by 2020, the number of US credit unions fell from 23,866 in 1969 to 5,099 in 2020 because of mergers and failures.

Methodology

The study analyzes 44 financial indicators from credit union data covering 2001 to 2020. It compares the RF model's performance against seven other classification methods, and uses a Cox proportional-hazards model and Shapley value-based approaches to interpret the key features and their interactions.
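As a rough illustration of the random forest idea (bootstrap samples plus random feature subsets, combined by majority vote), the following is a toy sketch in pure Python. It is not the authors' implementation: it uses one-split trees (stumps) for brevity, whereas a real random forest grows deeper trees. The three features (`capital`, `delinquency`, `noise`) and all data values are made up for illustration and are not the paper's indicators.

```python
import random
from collections import Counter

def fit_stump(rows, labels, feature_ids):
    """Pick the one-feature threshold split with the fewest training errors."""
    majority = Counter(labels).most_common(1)[0][0]
    # Fallback: always predict the majority label (threshold of -inf).
    best = (sum(y != majority for y in labels), feature_ids[0], float("-inf"), majority)
    for f in feature_ids:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2  # midpoint between neighbouring values
            for above in (0, 1):  # label assigned when the value exceeds thr
                errors = sum(
                    (above if r[f] > thr else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if errors < best[0]:
                    best = (errors, f, thr, above)
    return best[1:]  # (feature, threshold, label_if_above)

def fit_forest(rows, labels, n_trees=25, n_sub=2, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]   # bootstrap sample
        feats = rng.sample(range(n_features), n_sub)     # random feature subset
        forest.append(
            fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest

def predict(forest, row):
    """Majority vote over all stumps (1 = failed, 0 = survived)."""
    votes = [above if row[f] > thr else 1 - above for f, thr, above in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: two informative features and one uninformative one.
rows, labels = [], []
for i in range(40):
    failed = i % 2
    capital = (2 if failed else 20) + i % 5       # made-up "capital" measure
    delinquency = (10 if failed else 1) + i % 3   # made-up "delinquency" measure
    noise = i % 7                                 # unrelated to the label
    rows.append([capital, delinquency, noise])
    labels.append(failed)

forest = fit_forest(rows, labels)
train_accuracy = sum(predict(forest, r) == y for r, y in zip(rows, labels)) / len(rows)
```

The toy data are cleanly separable, so this sketch says nothing about the accuracy achievable on real credit union data. It only shows how the ensemble vote is assembled.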

Superior Performance of Random Forest Model vs Other Methods

Random Forest achieved the highest accuracy (97.9%) among all tested methods
The next best performer was Boosted Tree at 94.9%, while other methods ranged from 81-85% accuracy
This demonstrates the superior predictive power of the Random Forest approach
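Accuracy and the false negative and false positive rates quoted in the abstract are all summaries of a confusion matrix. A minimal sketch of how such figures can be derived is shown below. It assumes failure is the positive class and that each rate is measured within its own true class; the paper's exact definitions may differ, and the counts are hypothetical.

```python
def summarize(tp, fn, fp, tn):
    """Summarize a binary confusion matrix (failure assumed to be the positive class)."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "false_negative_rate": fn / (fn + tp),  # share of actual failures missed
        "false_positive_rate": fp / (fp + tn),  # share of survivors wrongly flagged
    }

# Hypothetical counts, not taken from the paper.
metrics = summarize(tp=45, fn=5, fp=30, tn=920)
```

When one class is much rarer than the other, accuracy is dominated by the majority class, so the two error rates are worth reading alongside it.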

High Accuracy Maintained with Reduced Feature Set

Model maintains high accuracy (92.2%) even when reduced to just 5 key features
False negatives rise from 2.0% to 7.8% (up 5.8 percentage points) and false positives from 8.8% to 17.6% (up 8.8 percentage points) with reduced features
Demonstrates that effective prediction is possible with a simplified model
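The cost of moving from 44 features to 5 can be read directly from the test-set figures in the abstract. The short calculation below works out the change in percentage points; the dictionary keys are illustrative names only.

```python
# Test-set figures (in percent) as reported in the abstract.
full = {"accuracy": 97.9, "false_negative": 2.0, "false_positive": 8.8}
reduced = {"accuracy": 92.2, "false_negative": 7.8, "false_positive": 17.6}

# Change in percentage points when moving from 44 features to 5.
change = {k: round(reduced[k] - full[k], 1) for k in full}
# change == {"accuracy": -5.7, "false_negative": 5.8, "false_positive": 8.8}
```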

Training vs Test Set Performance Comparison

Model performs better on test set (97.9%) than training set (89.1%)
Suggests strong generalization to new data
Gives no sign of overfitting to the training data

Model Performance Over Time

Model maintains consistently high accuracy over the entire test period (July 2015 to September 2020)
Performance remains stable across different economic conditions
Supports its reliability for sustained use as an early warning tool
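One way to check stability over time is to compute accuracy separately for each period and look at the spread. The sketch below assumes predictions are stored as `(period, actual, predicted)` tuples; the quarterly labels and values are hypothetical and do not come from the paper.

```python
from collections import defaultdict

def accuracy_by_period(records):
    """records: iterable of (period, y_true, y_pred). Returns {period: accuracy}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for period, y_true, y_pred in records:
        totals[period] += 1
        hits[period] += int(y_true == y_pred)
    return {p: hits[p] / totals[p] for p in sorted(totals)}

# Hypothetical predictions for two quarters.
records = [
    ("2015Q3", 0, 0), ("2015Q3", 1, 1), ("2015Q3", 0, 0), ("2015Q3", 0, 1),
    ("2020Q3", 1, 1), ("2020Q3", 0, 0),
]
per_period = accuracy_by_period(records)
# Gap between the best and worst period; a small gap indicates stable performance.
spread = max(per_period.values()) - min(per_period.values())
```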

Contribution and Implications

Provides regulators with an accurate early warning system to identify at-risk credit unions one year in advance
Demonstrates that just 5 key financial indicators can effectively predict credit union failure
Offers transparent, interpretable predictions that can guide intervention strategies
Helps protect the National Credit Union Share Insurance Fund through early identification of risks

Data Sources

Method Comparison Chart: Based on Table 4 comparing performance across different classification approaches
Feature Comparison Chart: Based on Table 4 comparing 44-feature vs 5-feature model performance
Feature Importance Chart: Based on Table 5 showing unbiased feature importance values
Performance Comparison Chart: Based on confusion matrix results in Table 4
Time Performance Chart: Based on test set results for the period from July 2015 to September 2020