NWU Institutional Repository

A semi-supervised segmentation algorithm as applied to k-means using information value

dc.contributor.author: Breed, D.G.
dc.contributor.author: Verster, T.
dc.contributor.author: Terblanche, S.E.
dc.contributor.researchID: 10794549 - Terblanche, Stephanus Esias
dc.contributor.researchID: 10943587 - Verster, Tanja
dc.contributor.researchID: 12242950 - Breed, Douw Gerbrand
dc.date.accessioned: 2018-06-07T06:10:52Z
dc.date.available: 2018-06-07T06:10:52Z
dc.date.issued: 2017
dc.description.abstract [en_US]: Segmentation (or partitioning) of data for the purpose of enhancing predictive modelling is a well-established practice in the banking industry. Unsupervised and supervised approaches are the two main streams of segmentation, and examples exist where the application of these techniques improved the performance of predictive models. Both streams focus, however, on a single aspect (i.e. either target separation or independent variable distribution), and combining them may deliver better results in some instances. In this paper a semi-supervised segmentation algorithm is presented, which is based on k-means clustering and which applies information value to inform the segmentation process. Simulated data are used to identify a few key characteristics that may cause one segmentation technique to outperform another. In the empirical study the newly proposed semi-supervised segmentation algorithm outperforms both an unsupervised and a supervised segmentation technique when compared using the Gini coefficient as the performance measure of the resulting predictive models.
dc.identifier.citation [en_US]: Breed, D.G. et al. 2017. A semi-supervised segmentation algorithm as applied to k-means using information value. Orion, 33(2):85-103. [http://dx.doi.org/10.5784/33-2-568]
dc.identifier.issn: 2224-0004
dc.identifier.issn: 0259-191X (Online)
dc.identifier.uri: http://hdl.handle.net/10394/27353
dc.identifier.uri: http://dx.doi.org/10.5784/33-2-568
dc.identifier.uri: http://orion.journals.ac.za/pub/article/view/568/468
dc.language.iso [en_US]: en
dc.publisher [en_US]: ORSSA
dc.subject [en_US]: Banking
dc.subject [en_US]: Clustering
dc.subject [en_US]: Multivariate statistics
dc.subject [en_US]: Data mining
dc.title [en_US]: A semi-supervised segmentation algorithm as applied to k-means using information value
dc.type [en_US]: Article
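
The abstract names two quantities without defining them: information value, which informs the segmentation, and the Gini coefficient, which scores the resulting models. The sketch below shows the standard credit-scoring definitions of both, assuming a binary 0/1 target. It is not the authors' algorithm, and it does not show how the paper combines information value with k-means. The function names and the `eps` smoothing constant are illustrative assumptions.

```python
import math
from collections import defaultdict


def information_value(segments, targets, eps=1e-6):
    """Standard information value of a segmentation against a 0/1 target.

    segments: one segment label per observation.
    targets:  one 0/1 outcome per observation (1 = event, e.g. default).
    eps:      assumed floor on the per-segment shares, used only to avoid
              log(0) when a segment holds a single outcome class.
    """
    events = defaultdict(int)
    non_events = defaultdict(int)
    for seg, y in zip(segments, targets):
        if y == 1:
            events[seg] += 1
        else:
            non_events[seg] += 1

    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    if total_events == 0 or total_non_events == 0:
        raise ValueError("targets must contain both outcome classes")

    iv = 0.0
    for seg in set(events) | set(non_events):
        # Share of all events / non-events that falls in this segment.
        p_event = max(events[seg] / total_events, eps)
        p_non_event = max(non_events[seg] / total_non_events, eps)
        iv += (p_non_event - p_event) * math.log(p_non_event / p_event)
    return iv


def gini_coefficient(scores, targets):
    """Gini coefficient of a score, computed as 2 * AUC - 1.

    AUC is the share of (event, non-event) pairs in which the event has
    the higher score, with ties counted as half.
    """
    pos = [s for s, y in zip(scores, targets) if y == 1]
    neg = [s for s, y in zip(scores, targets) if y == 0]
    if not pos or not neg:
        raise ValueError("targets must contain both outcome classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return 2 * auc - 1
```

A segmentation whose segments all have the same event mix has an information value of zero, and a score that ranks every event above every non-event has a Gini coefficient of one.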

Files

Original bundle

Name: A_semi_supervised.pdf
Size: 431.47 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.61 KB
Format: Item-specific license agreed upon to submission