Using the Distance in Logistic Regression Models for Predictor Ranking in Diabetes Detection

Publisher

Springer

Access Rights

info:eu-repo/semantics/closedAccess

Abstract

Logistic regression is widely used to model the relationship between a response variable and multiple independent variables. In practice, the most important variables for each problem domain are generally well known. However, many ongoing studies explore additional variables to improve prediction performance with an enriched model. In this article, a new method for ranking binary independent variables is suggested, based on the distance between two decision boundaries. The boundaries correspond to the cases where the value of the variable is zero or one. It is shown that, using age and body mass index as the base variables for diabetes prediction, these distances are effective for ranking additional variables, leading to better scores than several conventionally used approaches.
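The geometric idea in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a logit of the form b0 + w_age*age + w_bmi*bmi + w_binary*x with a single binary predictor x, so that the boundaries for x = 0 and x = 1 are parallel lines in the (age, BMI) plane and their perpendicular distance is |w_binary| / sqrt(w_age^2 + w_bmi^2). The function names, the candidate names, and the coefficient values are all hypothetical, and the sketch further assumes that age and BMI were put on comparable scales before fitting, since the distance depends on the units of the base variables.

```python
import math


def boundary_distance(w_age, w_bmi, w_binary):
    """Distance between the parallel boundaries for x = 0 and x = 1.

    Assumes a logit of the form b0 + w_age*age + w_bmi*bmi + w_binary*x,
    so both boundaries are straight lines in the (age, BMI) plane.
    """
    norm = math.hypot(w_age, w_bmi)
    if norm == 0.0:
        raise ValueError("age and BMI coefficients must not both be zero")
    return abs(w_binary) / norm


def rank_by_distance(coefficients):
    """Sort candidate names by boundary distance, largest first."""
    return sorted(
        coefficients,
        key=lambda name: boundary_distance(*coefficients[name]),
        reverse=True,
    )


# Hypothetical coefficients (w_age, w_bmi, w_binary), one fitted model per
# candidate predictor; the values are made up for illustration.
example = {
    "candidate_a": (3.0, 4.0, 5.0),
    "candidate_b": (3.0, 4.0, 10.0),
    "candidate_c": (3.0, 4.0, -2.5),
}
ranking = rank_by_distance(example)
```

In this sketch a larger distance means that switching the binary predictor from zero to one shifts the decision boundary further, which is the quantity the abstract proposes for ranking; how the paper itself computes and scores the distances may differ.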

Description

International Conference on Medical and Biological Engineering in Bosnia and Herzegovina (CMBEBIH) -- MAY 16-18, 2019 -- Banja Luka, BOSNIA & HERCEG

Keywords

Logistic regression, Feature selection, Binary predictors, Decision boundary, Diabetes prediction

Journal or Series

Proceedings of the International Conference on Medical and Biological Engineering, CMBEBIH 2019

Volume

73
