NWU Institutional Repository

The impact of pre-selected variance inflation factor thresholds on the stability and predictive power of logistic regression models in credit scoring

dc.contributor.author: De Jongh, P.J.
dc.contributor.author: De Jongh, E.
dc.contributor.author: Pienaar, M.
dc.contributor.author: Gordon-Grant, H.
dc.contributor.author: Oberholzer, M.
dc.contributor.author: Santana, L.
dc.contributor.researchID: 11749318 - De Jongh, Pieter Juriaan
dc.contributor.researchID: 21139032 - De Jongh, Erika
dc.contributor.researchID: 11803371 - Santana, Leonard
dc.date.accessioned: 2016-08-31T06:38:29Z
dc.date.available: 2016-08-31T06:38:29Z
dc.date.issued: 2015
dc.description.abstract: Standard Bank, South Africa, currently employs a methodology when developing application or behavioural scorecards that involves logistic regression. A key aspect of building logistic regression models entails variable selection which involves dealing with multicollinearity. The objective of this study was to investigate the impact of using different variance inflation factor (VIF) thresholds on the performance of these models in a predictive and discriminatory context and to study the stability of the estimated coefficients in order to advise the bank. The impact of the choice of VIF thresholds was researched by means of an empirical and simulation study. The empirical study involved analysing two large data sets that represent the typical size encountered in a retail credit scoring context. The first analysis concentrated on fitting the various VIF models and comparing the fitted models in terms of the stability of coefficient estimates and goodness-of-fit statistics while the second analysis focused on evaluating the fitted models' predictive ability over time. The simulation study was used to study the effect of multicollinearity in a controlled setting. All the above-mentioned studies indicate that the presence of multicollinearity in large data sets is of much less concern than in small data sets and that the VIF criterion could be relaxed considerably when models are fitted to large data sets. The recommendations in this regard have been accepted and implemented by Standard Bank. [en_US]
dc.description.sponsorship: National Research Foundation (NRF) of South Africa reference number (UID: TP1207243988) [en_US]
dc.identifier.citation: De Jongh, P.J. et al. 2015. The impact of pre-selected variance inflation factor thresholds on the stability and predictive power of logistic regression models in credit scoring. Orion, 31(1):17-37. [http://dx.doi.org/10.5784/31-1-162] [en_US]
dc.identifier.issn: 0259-191X
dc.identifier.issn: 2224-0004 (Online)
dc.identifier.uri: http://hdl.handle.net/10394/18458
dc.identifier.uri: http://dx.doi.org/10.5784/31-1-162
dc.identifier.uri: http://orion.journals.ac.za/pub/article/download/162/449
dc.language.iso: en [en_US]
dc.publisher: ORSSA (Operations Research Society of South Africa) [en_US]
dc.subject: Logistic regression [en_US]
dc.subject: multicollinearity [en_US]
dc.subject: variance inflation factor [en_US]
dc.subject: variation of coefficient estimates [en_US]
dc.subject: elastic net [en_US]
dc.subject: prediction and discriminatory power [en_US]
dc.subject: large credit scoring data sets [en_US]
dc.subject: risk analysis [en_US]
dc.title: The impact of pre-selected variance inflation factor thresholds on the stability and predictive power of logistic regression models in credit scoring [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: The impact of pre-selected.pdf
Size: 1009.81 KB
Format: Adobe Portable Document Format
