Text Mining of Patient Demographics and Diagnoses from Psychiatric Assessments

dc.contributor.advisor: Rashmi Prasad
dc.contributor.committeemember: Timothy Patrick
dc.contributor.committeemember: Rohit Kate
dc.creator: Klosterman, Eric James
dc.date.accessioned: 2025-01-16T19:44:54Z
dc.date.available: 2025-01-16T19:44:54Z
dc.date.issued: 2014-12-01
dc.description.abstract: Automatic extraction of patient demographics and psychiatric diagnoses from clinical notes allows patient data to be collected on a large scale. These data could be used for a variety of research purposes, including outcomes studies and the development of clinical trials. However, current research has not yet discussed the automatic extraction of demographics and psychiatric diagnoses in detail. The aim of this study is to apply text mining to extract patient demographics (age, gender, marital status, and education level) and admission diagnoses from the psychiatric assessments at a mental health hospital, and to assign codes to each category. Gender is coded as either Male or Female; marital status is coded as Single, Married, Divorced, or Widowed; and education level is coded on a scale from Some High School through Graduate Degree (PhD/JD/MD etc. level). Classifications for diagnoses are based on the DSM-IV. For each category, a rule-based approach was developed using keyword-based regular expressions as well as constituency trees and typed dependencies. We employ a two-step approach that first maximizes recall through keyword-based patterns and then, if necessary, maximizes precision by using NLP-based rules to resolve ambiguity. To develop and evaluate our method, we annotated a corpus of 200 assessments, using a portion of the corpus for developing the method and the rest as a test set. F-score was satisfactory for each category (Age: 0.997; Gender: 0.989; Primary Diagnosis: 0.983; Marital Status: 0.875; Education Level: 0.851), as was coding accuracy (Age: 1.0; Gender: 0.989; Primary Diagnosis: 0.922; Marital Status: 0.889; Education Level: 0.778). These results indicate that a rule-based approach could be considered for extracting these types of information in the psychiatric field.
At the same time, the results showed a drop in performance from the development set to the test set, partly because the rules developed need to be more general.
dc.identifier.uri: http://digital.library.wisc.edu/1793/88482
dc.relation.replaces: https://dc.uwm.edu/etd/613
dc.subject: Information Extraction
dc.subject: Patient Demographics
dc.subject: Patient Psychiatric Diagnoses
dc.subject: Psychology
dc.subject: Text Mining
dc.title: Text Mining of Patient Demographics and Diagnoses from Psychiatric Assessments
dc.type: thesis
thesis.degree.discipline: Health Care Informatics
thesis.degree.grantor: University of Wisconsin-Milwaukee
thesis.degree.name: Master of Science
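
The two-step approach described in the abstract (keyword patterns for recall, followed by stricter rules when a match is ambiguous) might look roughly like the Python sketch below. This is an illustration only, not the thesis's implementation: the patterns, the function names (`extract_age`, `candidate_codes`, `resolve`), and the example sentences are all assumptions, and the second step here simply defers ambiguous cases rather than applying the constituency-tree and typed-dependency rules the thesis uses.

```python
import re
from typing import Dict, List, Optional, Pattern

# Hypothetical keyword patterns; the actual rules from the thesis are not
# given in this record.
AGE_PATTERN = re.compile(r"\b(\d{1,3})[- ]?(?:year[- ]old|y/?o)\b", re.IGNORECASE)

GENDER_PATTERNS: Dict[str, Pattern[str]] = {
    "Male": re.compile(r"\b(male|man|gentleman)\b", re.IGNORECASE),
    "Female": re.compile(r"\b(female|woman|lady)\b", re.IGNORECASE),
}

MARITAL_PATTERNS: Dict[str, Pattern[str]] = {
    "Single": re.compile(r"\bsingle\b", re.IGNORECASE),
    "Married": re.compile(r"\bmarried\b", re.IGNORECASE),
    "Divorced": re.compile(r"\bdivorced\b", re.IGNORECASE),
    "Widowed": re.compile(r"\b(widowed|widow|widower)\b", re.IGNORECASE),
}


def extract_age(text: str) -> Optional[int]:
    """Return the first age mentioned in a form like '45-year-old' or '45 y/o'."""
    match = AGE_PATTERN.search(text)
    return int(match.group(1)) if match else None


def candidate_codes(text: str, patterns: Dict[str, Pattern[str]]) -> List[str]:
    """Step 1 (recall): collect every code whose keyword pattern matches."""
    return [code for code, pattern in patterns.items() if pattern.search(text)]


def resolve(candidates: List[str]) -> Optional[str]:
    """Step 2 placeholder (precision): accept only an unambiguous match.

    Ambiguous or empty results return None; in the thesis such cases are
    handled by NLP-based rules, which this sketch does not attempt.
    """
    return candidates[0] if len(candidates) == 1 else None


if __name__ == "__main__":
    sentence = "Patient is a 45-year-old divorced male with some college education."
    print(extract_age(sentence))
    print(resolve(candidate_codes(sentence, GENDER_PATTERNS)))
    print(resolve(candidate_codes(sentence, MARITAL_PATTERNS)))
```

In this sketch, a sentence that mentions two statuses (for example, the patient's own and a relative's) yields two candidates and is left unresolved, which is the kind of ambiguity the abstract says the NLP-based rules are meant to handle.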

Files

Original bundle

Name: Klosterman_uwm_0263m_10906.pdf
Size: 1.09 MB
Format: Adobe Portable Document Format
Description: Main File