A Machine Learning Pipeline with Switching Algorithms to Predict Lung Cancer and Identify Top Features

dc.contributor.advisor: Tian Zhao
dc.contributor.advisor: Jake Luo
dc.contributor.committeemember: Cristine Cheng
dc.creator: Tasnim, Anika
dc.date.accessioned: 2025-01-16T18:38:48Z
dc.date.issued: 2021-08-01
dc.description.abstract: Lung cancer is the leading cause of cancer-related death around the world. Early detection is a critical factor for its effective treatment. To facilitate early-stage prediction, a Machine Learning (ML) pipeline has been built that uses inpatient admission data to train four ML models. The data is dynamically loaded into a database, cleaned, and passed through the SelectKBest selector to identify the top features influencing the prognosis, which are then fed into the pipeline and fitted to the ML models to make the forecast. Among the models used, Decision Tree provides the highest accuracy (97.09%), followed by Random Forest (94.07%). MLP and Logistic Regression reach an accuracy of 84.58% and 77.65% respectively. Some of the top 50 features include chronic obstructive pulmonary disease, pleural effusion, secondary and unspecified malignant neoplasm of intrathoracic lymph nodes, syndrome of inappropriate secretion of antidiuretic hormone, and neoplasm-related acute, chronic pain.
dc.description.embargo: 2023-06-30
dc.embargo.liftdate: 2023-06-30
dc.identifier.uri: http://digital.library.wisc.edu/1793/87198
dc.relation.replaces: https://dc.uwm.edu/etd/2738
dc.subject: Deep Learning
dc.subject: Lung cancer
dc.subject: Machine Learning Pipeline
dc.subject: Supervised Machine Learning
dc.subject: Top features influencing lung cancer
dc.title: A Machine Learning Pipeline with Switching Algorithms to Predict Lung Cancer and Identify Top Features
dc.type: thesis
thesis.degree.discipline: Computer Science
thesis.degree.grantor: University of Wisconsin-Milwaukee
thesis.degree.name: Master of Science
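
The workflow summarized in the abstract (select the top features with SelectKBest, then fit one of four interchangeable classifiers) could look roughly like the sketch below. It is not the thesis code. The synthetic data, the f_classif scoring function, k=10, the scaling step, the hyperparameters, and the names MODELS and build_pipeline are all assumptions made for illustration; the record does not state which scoring function or settings the author used.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the thesis uses inpatient admission records.
X, y = make_classification(
    n_samples=400, n_features=30, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The four model families named in the abstract; hyperparameters are assumptions.
MODELS = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "mlp": MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}


def build_pipeline(model_name, k=10):
    """Build a pipeline whose final estimator is switched by name (hypothetical helper)."""
    return Pipeline([
        # Keep the k features with the highest univariate scores.
        ("select", SelectKBest(score_func=f_classif, k=k)),
        # Scaling is an added assumption; it mainly helps the MLP and logistic regression.
        ("scale", StandardScaler()),
        # clone() gives each pipeline its own unfitted copy of the chosen model.
        ("model", clone(MODELS[model_name])),
    ])


# Fit every model through the same pipeline and record held-out accuracy.
results = {}
for name in MODELS:
    pipe = build_pipeline(name)
    pipe.fit(X_train, y_train)
    results[name] = pipe.score(X_test, y_test)

# Column indices of the features kept by the selector in the last fitted pipeline.
top_features = pipe.named_steps["select"].get_support(indices=True)

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.4f}")
print("selected feature indices:", top_features.tolist())
```

Keeping the selector inside the pipeline means feature scores are computed from the training split only, so the held-out accuracy is not influenced by the test rows. The accuracies printed here come from synthetic data and are unrelated to the figures reported in the abstract.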

Files

Original bundle

Name: Tasnim_uwm_0263M_13052.pdf
Size: 1.61 MB
Format: Adobe Portable Document Format
Description: Main File