dc.description.abstract |
Systemic Lupus Erythematosus (SLE) is an autoimmune disease with unknown causes
and no current cure. While Lupus Low Disease Activity State (LLDAS), an attainable
treat-to-target goal in SLE, has been associated with reduced damage accrual
and decreased mortality risk, the number of deaths remains significantly high. These
deaths have been found to be influenced by demographic and clinical factors
such as race, sex, infection, and disease activity. Most studies conducted on
SLE have been statistical analyses, and machine learning approaches to the topic
appear to be very limited. In contrast, machine learning has been widely utilized
in modern healthcare for various disease prediction studies. Additionally, the Asia
Pacific Lupus Collaboration (APLC) cohort provides a dataset that has been commonly
used in SLE studies. Hence, this study proposes the use of machine learning
in creating a prediction system for mortality risk in SLE patients. Label Encoder,
Ordinal Encoder, One Hot Encoder, Single Imputation, and Multiple Imputation by
Chained Equations (MICE) were applied to create the imputed dataset. Synthetic
Minority Oversampling Technique (SMOTE), Recursive Feature Elimination with
Cross-Validation (RFECV), and Standard Scaler were further applied to produce 15
more dataset variations. Random Forest, XGBoost, Support Vector Machine, and
Logistic Regression were trained on the 16 datasets, yielding a total of 64 models.
Using AUROC as the main metric, the results showed that the XGBoost model trained
on the SMOTE dataset performed best, with an AUROC of 85.1%.
A web application integrating Local Interpretable Model-agnostic Explanations (LIME)
with the best XGBoost model was then built, allowing a user to input real patient health
data and view the mortality risk prediction outcome with explanations firsthand. |
en_US |