The use of credit scorecard design, predictive modelling and text mining to detect fraud in the insurance industry
The use of analytical techniques for fraud detection and the design of fraud detection systems have been the subject of several research projects, with varying degrees of success in practical implementation. In particular, several authors regard credit risk scorecards as a useful analytical tool for fraud detection. However, research on analytical fraud detection for the South African insurance industry is limited. Furthermore, real-world restrictions such as the availability and quality of data elements, highly unbalanced datasets, interpretability challenges with complex analytical techniques and the evolving nature of insurance fraud contribute to the ongoing challenge of detecting fraud successfully. Insurance organisations face financial instability from a global recession, tighter regulatory requirements and consolidation of the industry, all of which underscore the need for a practical and effective fraud strategy. Given the volumes of structured and unstructured data available in the data warehouses of insurance organisations, it would be sensible for an effective fraud strategy to take data-driven methods into account and to incorporate analytical techniques into an overall fraud risk assessment system. That said, the complexity of the analytical techniques, coupled with the effort required to prepare the data to support them, should be carefully considered, as some studies found that less complex algorithms produce equal or better results. Furthermore, an over-reliance on analytical models can lead to the underlying risk being underestimated, as observed with credit risk at financial institutions during the financial crisis.

An attractive property of the probabilistic weights-of-evidence (WOE) formulation for risk scorecard construction is its ability to handle data issues such as missing values, outliers and rare cases. It is also transparent and flexible, in that the bins can be re-adjusted based on expert knowledge or other business considerations. The approach proposed in the study is to construct fraud risk scorecards at entity level that incorporate sets of intrinsic and relational risk factors to support a robust fraud risk assessment. The study investigates the application of an integrated Suspicious Activity Assessment System (SAAS) empirically, using real-world South African insurance data. The first case study uses a data sample of short-term insurance claims data and the second a data sample of life insurance claims data. Both case studies show promising results.

The contributions of the study are summarised as follows. The study identifies several challenges with the use of an analytical approach to fraud detection within the context of the South African insurance industry. The study proposes the development of fraud risk scorecards based on WOE measures for diagnostic fraud detection, within the context of the South African insurance industry, and the consideration of alternative algorithms to determine split points. To improve the discriminatory performance of the fraud risk scorecards, the study evaluates the use of analytical techniques, such as text mining, to identify risk factors. In order to identify risk factors from large sets of data, the study suggests careful consideration of both the types of information and the types of statistical techniques in a fraud detection system. The types of information refer to the categories of input data available for analysis, translated into risk factors, and the types of statistical techniques refer to the constraints and assumptions of the underlying statistical techniques. In addition, the study advocates an entity-focused approach to fraud detection, given that fraudulent activity typically occurs at the level of an entity or a group of entities.
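To make the weights-of-evidence idea mentioned above more concrete, the following is a minimal sketch and not the study's implementation. It assumes the common credit-scoring convention in which the WOE of a bin is the natural log of the bin's share of non-fraud records divided by its share of fraud records; the sign convention differs between authors. The function name `woe_table`, the `smoothing` constant, the `"MISSING"` bin label and the toy data are all hypothetical and chosen only for illustration. The sketch shows how missing values can be given a bin of their own and how a small additive constant keeps rare or empty cells from producing a logarithm of zero.

```python
import math
from collections import defaultdict


def woe_table(bins, labels, smoothing=0.5):
    """Return {bin: WOE} for one binned risk factor.

    bins      -- bin label per record; None is placed in its own "MISSING" bin
    labels    -- 1 for a confirmed fraud case, 0 otherwise
    smoothing -- hypothetical additive constant so that rare or empty
                 cells never lead to log(0) or division by zero
    """
    counts = defaultdict(lambda: [0, 0])  # bin -> [non_fraud, fraud]
    for b, y in zip(bins, labels):
        key = "MISSING" if b is None else b
        counts[key][y] += 1

    n_bins = len(counts)
    total_non_fraud = sum(c[0] for c in counts.values()) + smoothing * n_bins
    total_fraud = sum(c[1] for c in counts.values()) + smoothing * n_bins

    table = {}
    for key, (non_fraud, fraud) in counts.items():
        share_non_fraud = (non_fraud + smoothing) / total_non_fraud
        share_fraud = (fraud + smoothing) / total_fraud
        # Assumed convention: positive WOE means the bin leans towards non-fraud.
        table[key] = math.log(share_non_fraud / share_fraud)
    return table


# Toy, made-up data: a claim-amount band per claim and a fraud flag.
amount_band = ["low", "low", "low", "high", "high", None, "low", "high"]
is_fraud = [0, 0, 0, 1, 1, 1, 0, 0]

table = woe_table(amount_band, is_fraud)
for band, woe in sorted(table.items()):
    print(f"{band:8s} WOE = {woe:+.3f}")
```

Because each bin maps to a single number, bins can be merged or re-cut on expert advice and the table simply recomputed, which is the kind of transparency and flexibility the abstract attributes to the approach.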