Cost-Sensitive Learning for Imbalanced Bad Debt Datasets in Healthcare Industry

Shi, D.

Please use this identifier to cite or link to this item: http://dspace.utpl.edu.ec/handle/123456789/18864

Title:	Cost-Sensitive Learning for Imbalanced Bad Debt Datasets in Healthcare Industry
Authors:	Shi, D.
Keywords:	bad debt recovery; cost-sensitive; imbalanced; semi-supervised learning
Issue Date:	1-Oct-2015
Publisher:	Proceedings - 2015 Asia-Pacific Conference on Computer-Aided System Engineering, APCASE 2015
Abstract:	Research on computational intelligence methods for improving bad debt recovery is imperative given the rapid increase in the cost of healthcare in the U.S. This study explores the effectiveness of cost-sensitive learning methods for classifying the unknown cases in imbalanced bad debt datasets and compares the results with those of two other methods often used to process imbalanced datasets: undersampling and oversampling. The study also analyzes how a semi-supervised learning algorithm behaves in different circumstances. The results show that although oversampling gives the best predictive accuracy rates on balanced testing datasets, it is impractical because classes are imbalanced in real healthcare situations. The models constructed by undersampling achieve high classification accuracy rates for the minority class in imbalanced datasets, but they tend to worsen the overall classification accuracy rates of the majority class. The results show that cost-sensitive learning methods can improve the classification accuracy rates of the minority class in imbalanced datasets while achieving considerably good overall and majority class classification accuracy rates. The results and analysis in this study show that cost-sensitive learning methods provide a potentially viable approach to classifying the unknown cases in imbalanced bad debt datasets. Finally, more practical predictive results are obtained by using the models to predict the unlabeled cases. Although oversampling and cost-sensitive learning methods combined with semi-supervised learning can improve the overall and majority class classification accuracy rates, the minority class classification accuracy rates remain relatively low. The semi-supervised learning algorithms need to be improved to adapt to imbalanced bad debt datasets.
Identifier :	10.1109/APCASE.2015.13
URI:	http://dspace.utpl.edu.ec/handle/123456789/18864
ISSN:	9.78E+17
Other Identifiers:	10.1109/APCASE.2015.13
Type:	Article
Appears in Collections:	Artículos de revistas Científicas
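
Illustrative note: the abstract compares three ways of handling class imbalance (cost-sensitive learning, undersampling, and oversampling). The sketch below is not the paper's implementation; it is a minimal, generic illustration of each idea. The function names and the example costs (a missed minority case costing five times a false alarm) are assumptions chosen for illustration only. The threshold rule assumes correct predictions cost nothing.

```python
import random


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability cutoff that minimizes expected misclassification cost.

    Predicting the minority class costs (1 - p) * cost_fp in expectation,
    predicting the majority class costs p * cost_fn, so the minority class
    is the cheaper prediction whenever p >= cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_minority(prob_minority, cost_fp=1.0, cost_fn=5.0):
    """Return True if the minority class should be predicted.

    The default costs are hypothetical values, not taken from the paper.
    """
    return prob_minority >= cost_sensitive_threshold(cost_fp, cost_fn)


def undersample(majority, minority, rng):
    """Randomly drop majority cases until both classes have equal size."""
    return rng.sample(majority, len(minority)) + list(minority)


def oversample(majority, minority, rng):
    """Randomly duplicate minority cases until both classes have equal size.

    Assumes the majority class is at least as large as the minority class.
    """
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority) + list(minority) + extra


if __name__ == "__main__":
    rng = random.Random(0)
    majority = [("majority", i) for i in range(90)]
    minority = [("minority", i) for i in range(10)]
    print(len(undersample(majority, minority, rng)))  # 10 + 10 cases
    print(len(oversample(majority, minority, rng)))   # 90 + 90 cases
    print(predict_minority(0.2))  # above the 1/6 cutoff
```

With these assumed costs the cutoff drops from 0.5 to about 0.17, so more cases are assigned to the minority class without discarding or duplicating any training data, which is the trade-off the abstract contrasts with resampling.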

