Harnessing Advanced Data Analytics To Improve Early Detection And Diagnosis Of Rare Medical Conditions

Type

dissertation

Grantor

University of Wisconsin-Milwaukee

Abstract

Healthcare experts and care providers continually seek innovative approaches to enhance care delivery. As a discipline, healthcare has always been about enabling the best possible care for everyone. With medicine shifting toward a future where value-based care becomes the norm, early detection of and intervention in health problems will become increasingly important. Most physicians are already aware that early detection and intervention can profoundly affect a patient's health outcomes, especially for diagnostically challenging diseases where timely management can prevent irreversible complications. Many health conditions have high mortality rates, not because treatment options are lacking but because of ambiguous onset patterns and delayed diagnoses. Rare and multisystem diseases pose significant challenges, often leading patients through a diagnostic odyssey: a prolonged and frustrating journey involving multiple doctor visits, extensive testing, misdiagnoses, and ineffective treatments before the correct condition is identified. For these patients, early and accurate detection can mean the difference between a manageable condition and one that significantly impairs their quality of life.
Traditionally, doctors rely on extensive training, expertise, and experience to diagnose conditions. However, when a diagnosis is difficult to reach from clinical information alone, the inability to obtain quick and accurate answers from medical professionals can be frustrating. Recognizing these challenges, the healthcare sector is undergoing a transformation driven by its ability to record and analyze massive amounts of data. The rapid digitalization of healthcare has led to exponential growth in Electronic Health Records (EHRs), imaging data, and patient-reported outcomes, collectively forming what is now referred to as Big Data in medicine. The sheer volume, variety, and velocity of medical data can empower healthcare providers with data-driven decision support systems, facilitating faster and more accurate diagnosis. However, making sense of this vast and complex information requires sophisticated analytical tools capable of identifying hidden patterns, correlations, and predictive markers. To this end, the goal of the two essays in this dissertation is to leverage advanced analytical methods to handle the breadth and complexity of information involved in early diagnosis. By designing, refining, and applying cutting-edge Machine Learning (ML) algorithms, integrating real-world medical data, and incorporating domain expertise, I aim to develop explainable and trustworthy data-driven tools that serve clinicians.
In the first essay, I propose a data-driven ML framework to identify patients at high risk of developing venous thromboembolism (VTE) before they undergo major hip or knee surgery. Leveraging electronic health records from over 390,000 patients who underwent major orthopedic surgery, I employed a genetic algorithm for guided feature selection and trained a fully connected deep neural network to identify patients at high risk of developing VTE. My findings reveal several noteworthy insights. Traditional risk scoring tables, commonly used by physicians to assess high-risk patients, do not incorporate a comprehensive range of risk factors and are less effective than advanced ML techniques in differentiating between low- and high-risk individuals. Furthermore, this study identifies previously unrecognized risk factors for VTE, contributing to a broader understanding of disease prediction. The findings also offer practical considerations that may aid clinicians in optimizing VTE prophylaxis strategies.
In the second essay, I propose an ML and network analytics approach for the preemptive identification of patients with autoimmune diseases (ADs). Given that ADs arise not just from individual causes but from complex interactions among several factors, the first phase of this study focuses on extracting a realistic and comprehensive understanding of the patient journey using network analytics techniques. In the second phase, the knowledge extracted from three comorbidity networks, covering more than 9,000,000 visits in the study cohort, is combined with medical record features to form an expanded input, which is then fed into several ML models for preclinical disease prediction. In the third phase, I prioritize interpretability alongside predictive accuracy by employing Explainable Boosting Machines (EBMs), a modeling approach that helps approximate the behavior of my high-performing but less transparent classifiers. Results show that ML methods trained on these augmented features outperform previous methods trained on conventional features. The proposed model can be paired with different ML and deep learning classifiers to achieve high accuracy. The findings and insights from this study can assist physicians in optimizing the timing of treatment administration, potentially maximizing efficacy while reducing adverse events associated with ADs.
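The second essay's idea of combining comorbidity-network knowledge with medical record features can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the two network-derived features (total and peak weighted degree), and the toy diagnosis labels are all assumptions made for illustration, and the sketch builds a single network where the study uses three.

```python
from collections import Counter
from itertools import combinations


def build_comorbidity_network(visits):
    """Count how often each pair of diagnosis codes co-occurs in one visit."""
    edges = Counter()
    for codes in visits:
        # Sort so each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(set(codes)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the co-occurrence counts of all edges touching each code."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


def augment_features(record_features, patient_codes, degree):
    """Append two hypothetical network-derived features to a feature vector."""
    strengths = [degree[code] for code in set(patient_codes)]
    total = sum(strengths)            # overall connectedness of the patient's diagnoses
    peak = max(strengths, default=0)  # most connected single diagnosis
    return list(record_features) + [total, peak]


# Toy visits with made-up diagnosis labels (not taken from the study cohort).
visits = [
    ["fatigue", "joint_pain"],
    ["fatigue", "joint_pain", "rash"],
    ["rash", "fever"],
]
edges = build_comorbidity_network(visits)
degree = weighted_degree(edges)

# Hypothetical record features (for example age and a binary flag), then augmented.
augmented = augment_features([54, 1], ["fatigue", "rash"], degree)
```

The augmented vector could then be passed to any classifier; which network statistics are actually informative is an empirical question that this sketch does not address.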