Credit Risk Under Regime Change
This project investigates how credit risk models behave when the underlying environment changes, focusing on the period surrounding the 2008 financial crisis. Using loan-level data from LendingClub (2007–2015 loan dataset), the objective is not only to predict default, but to understand how model outputs degrade when relationships between borrower characteristics and outcomes shift over time.
The dataset includes borrower and loan attributes such as credit grade, interest rate, debt-to-income ratio, income, loan amount, and term. These variables reflect how risk is assessed and priced within the system, linking model predictions to decisions around which loans appear attractive and which represent elevated exposure.
Two model classes were implemented and compared. Logistic regression was used in both regularised and unregularised forms to capture stable, aggregate relationships between features and default risk. Random forest models were used to capture non-linear interactions and local structure in the data, with hyperparameters tuned using Bayesian optimisation and validated through learning curves.
Evaluation was conducted using AUC-ROC, accuracy, and calibration analysis. Rather than relying on a static train-test split, a rolling window framework was used: models were trained on recent historical data (approximately 30 days) and evaluated on forward horizons of one to five years. This allows performance to be tracked as conditions evolve.
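The rolling window idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `Window` type, the `rolling_windows` and `split` helpers, the `issue_date` field name, and the 30-day step between windows are all hypothetical. Only the roughly 30-day training window and the multi-year forward horizon come from the description above.

```python
from datetime import date, timedelta
from typing import Iterator, NamedTuple


class Window(NamedTuple):
    train_start: date
    train_end: date   # exclusive
    test_start: date
    test_end: date    # exclusive


def rolling_windows(first: date, last: date, train_days: int = 30,
                    horizon_days: int = 365,
                    step_days: int = 30) -> Iterator[Window]:
    """Yield date ranges in which the test period strictly follows training."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    train_start = first
    while True:
        train_end = train_start + timedelta(days=train_days)
        test_end = train_end + timedelta(days=horizon_days)
        if test_end > last:
            break  # not enough future data left for a full test horizon
        yield Window(train_start, train_end, train_end, test_end)
        train_start += timedelta(days=step_days)


def split(rows, window, date_key="issue_date"):
    """Partition loan records (dicts) into train and test sets by date.

    The date_key default is an assumed field name, not one from the dataset.
    """
    train = [r for r in rows
             if window.train_start <= r[date_key] < window.train_end]
    test = [r for r in rows
            if window.test_start <= r[date_key] < window.test_end]
    return train, test
```

Because each test range begins where its training range ends, no future information leaks into training. Setting `horizon_days` to multiples of 365 would approximate the one-to-five-year horizons.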
Key components include:
- Designing a time-aware evaluation framework to assess model reliability under distribution shift
- Comparing linear and tree-based models in terms of predictive performance, stability, and calibration
- Evaluating how predicted default probabilities align with realised outcomes over time
- Tracking shifts in feature importance to identify changing drivers of risk
- Assessing the tradeoff between model flexibility and robustness in non-stationary environments
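One way to check how predicted default probabilities align with realised outcomes is a binned reliability table, summarised by expected calibration error (ECE). The sketch below is a common approach shown for illustration; the write-up does not specify which calibration method was used, and the function names and the choice of equal-width bins are assumptions.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Compare mean predicted probability with observed default rate per bin.

    probs are predicted default probabilities in [0, 1]; outcomes are 0/1.
    Returns (count, mean_predicted, observed_rate) for each non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((len(members), mean_pred, observed))
    return rows


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Size-weighted average gap between predicted and observed default rates."""
    table = calibration_table(probs, outcomes, n_bins)
    total = sum(n for n, _, _ in table)
    return sum(n / total * abs(mean_pred - observed)
               for n, mean_pred, observed in table)
```

Computing this separately for each evaluation window would show whether the gap between predicted and realised default rates widens after a regime change.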
Results show a clear divergence in model behaviour. Random forests achieve strong performance in stable, pre-crisis conditions, capturing complex interactions within the data. Following the 2008 shift, however, their performance deteriorates significantly, with reduced AUC, weaker generalisation, and unstable predicted probabilities. Logistic regression exhibits lower peak performance but maintains more consistent behaviour, with smaller degradation and more reliable calibration.
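The degradation in AUC described here can be tracked by computing AUC separately for each evaluation window. The following is a rough sketch using the rank interpretation of AUC: the probability that a randomly chosen defaulted loan receives a higher score than a randomly chosen non-defaulted one. The helper names and the `(label, scores, outcomes)` input layout are assumptions made for illustration.

```python
def auc_roc(scores, labels):
    """AUC as the fraction of (default, non-default) pairs ranked correctly.

    Ties count as half a correct ranking. labels are 0/1 with 1 = default.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one default and one non-default")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_window(windows):
    """Compute AUC for each (label, scores, outcomes) tuple, in the order given."""
    return [(label, auc_roc(scores, outcomes))
            for label, scores, outcomes in windows]
```

The pairwise comparison is quadratic in the number of loans, so it is suitable only for small samples; a sort-based implementation or a library routine would be the practical choice for loan-level data.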
This distinction is important because model outputs are often used as signals to guide decisions. A model that performs well historically but fails under changing conditions can produce misleading signals, leading to overconfidence in patterns that no longer hold. In contrast, a more stable model may provide weaker signals in normal conditions, but remain usable when the environment shifts.
The difference arises from how each model represents structure. Tree-based models learn fine-grained partitions tied to historical patterns, which become misaligned when the regime changes. Logistic regression imposes a global structure, capturing broader relationships such as the link between leverage, pricing, and default risk, making it more robust to shifts in underlying conditions.
Feature importance analysis further shows that key variables such as loan grade and interest rate change in relevance over time. These features are themselves influenced by the environment, meaning that the mapping from inputs to outcomes evolves, limiting the effectiveness of models trained on past data.
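A simple way to quantify such shifts is to compare each feature's share of total importance between two training windows. The sketch below assumes importances arrive as non-negative numbers keyed by feature name (for example, taken from a fitted random forest); the function names and this input format are hypothetical.

```python
def normalise(importances):
    """Rescale non-negative importances so they sum to 1."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    return {name: value / total for name, value in importances.items()}


def importance_shift(before, after):
    """Change in each feature's importance share, largest moves first.

    before and after map feature name to raw importance for two windows.
    A feature missing from one window is treated as having zero share there.
    """
    b, a = normalise(before), normalise(after)
    deltas = {f: a.get(f, 0.0) - b.get(f, 0.0) for f in set(b) | set(a)}
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

Normalising first keeps the comparison meaningful when raw importance scales differ between windows. Applying the function to consecutive windows gives a time series of which features are gaining or losing relevance.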
The project highlights a central challenge in modelling real-world systems: predictive performance depends not only on model choice, but on the stability of the environment. In settings where relationships shift, robustness and consistency can matter more than maximising accuracy under historical conditions.