Applications of Machine Learning in Medical Prognosis Using Electronic Medical Records
Loading...
Date
Authors
Advisors
License
DOI
Type
dissertation
Journal Title
Journal ISSN
Volume Title
Publisher
Grantor
University of Wisconsin-Milwaukee
Abstract
Approximately 84 % of hospitals are adopting electronic medical records (EMR) In the United States. EMR is a vital resource to help clinicians diagnose the onset or predict the future condition of a specific disease. With machine learning advances, many research projects attempt to extract medically relevant and actionable data from massive EMR databases using machine learning algorithms. However, collecting patients' prognosis factors from Electronic EMR is challenging due to privacy, sensitivity, and confidentiality. In this study, we developed medical generative adversarial networks (GANs) to generate synthetic EMR prognosis factors using minimal information collected during routine care in specialized healthcare facilities. The generated prognosis variables used in developing predictive models for (1) chronic wound healing in patients diagnosed with Venous Leg Ulcers (VLUs) and (2) antibiotic resistance in patients diagnosed with Skin and soft tissue infections (SSTIs). Our proposed medical GANs, EMR-TCWGAN and DermaGAN, can produce both continuous and categorical features from EMR. We utilized conditional training strategies to enhance training and generate classified data regarding healing vs. non-healing in EMR-TCWGAN and susceptibility vs. resistance in DermGAN. The ability of the proposed GAN models to generate realistic EMR data was evaluated by TSTR (test on the synthetic, train on the real), discriminative accuracy, and visualization. We analyzed the synthetic data augmentation technique's practicality in improving the wound healing prognostic model and antibiotic resistance classifier. We achieved the area under the curve (AUC) of 0.875 in the wound healing prognosis model and an average AUC of 0.830 in the antibiotic resistance classifier by using the synthetic samples generated by GANs in the training process. These results suggest that GANs can be considered a data augmentation method to generate realistic EMR data.