Title

A Preliminary Investigation of Decision Tree Models for Classification Accuracy Rates and Extracting Interpretable Rules in the Credit Scoring Task: A Case of the German Data Set

Document Type

Journal Article

Publisher

The Clute Institute For Academic Research, Colorado, USA

Faculty

Computing, Health and Science

School

Computer and Security Science

RAS ID

6093

Comments

This article was originally published as: Zurada, J., & Lam, C. P. (2008). A preliminary investigation of decision tree models for classification accuracy rates and extracting interpretable rules in the credit scoring task: A case of the German data set. Review of Business Information Systems, 12(3), 45-54.

Abstract

For many years, lenders have used traditional statistical techniques such as logistic regression and discriminant analysis to distinguish more precisely between creditworthy customers, who are granted loans, and non-creditworthy customers, who are denied loans. More recently, machine learning techniques such as neural networks, decision trees, and support vector machines have been successfully employed to classify loan applicants into those who are likely to pay off a loan and those who are likely to default on it. Accurate classification benefits lenders through increased profits or reduced losses, and it benefits loan applicants, who can avoid overcommitment. This paper examines a historical data set of consumer loans issued by a German bank to individuals whom the bank considered to be qualified customers. The data set consists of the financial attributes of each customer and includes a mixture of loans that the customers paid off and loans on which they defaulted. The paper compares the classification accuracy rates of three decision tree techniques and analyzes their ability to generate easy-to-understand rules.