Disease Name Extraction from Clinical Text Using Conditional Random Fields

Ghiasvand, Omid

Files

Primary Ghiasvand_uwm_0263m_10703.pdf (685.33 KB)

Date

2014-05-01

Authors

Ghiasvand, Omid

Advisors

Rohit J. Kate

Type

thesis

Grantor

University of Wisconsin-Milwaukee

Abstract

The aim of this thesis was to extract disease and disorder names from clinical text. We used Conditional Random Fields (CRF) as the main method for labeling diseases and disorders in clinical sentences, and drew on two additional tools, MetaMap and Stanford CoreNLP, to extract key features. MetaMap was used to identify names of diseases/disorders that are already in the UMLS Metathesaurus. Stanford CoreNLP supplied other important features, such as lemmatized versions of words and POS tags. Further features, including the semantic types of words, were extracted directly from the UMLS Metathesaurus. We participated in Task 7 of the SemEval 2014 competition and used its provided data to train and evaluate our system. The training data contained 199 clinical texts, the development data 99, and the test data 133; these included discharge summaries and echocardiogram, radiology, and ECG reports. We obtained competitive results on the disease/disorder name extraction task. An ablation study showed that while all features contributed, MetaMap matches, POS tags, and the previous and next words were the most effective.
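The feature set named in the abstract can be pictured as one feature dictionary per token, a shape that many CRF toolkits accept. The following is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the feature keys, and the hand-written example values (including the "dsyn" semantic type tag) are hypothetical, and in practice the lemma and POS values would come from Stanford CoreNLP and the match and semantic-type values from MetaMap and the UMLS Metathesaurus.

```python
from typing import Dict, List

# Hypothetical per-token record; field names are assumptions for illustration.
Token = Dict[str, object]


def token_features(sentence: List[Token], i: int) -> Dict[str, object]:
    """Build the feature dictionary for the token at position i."""
    tok = sentence[i]
    feats: Dict[str, object] = {
        "word": str(tok["word"]).lower(),
        "lemma": tok["lemma"],                  # assumed to come from a lemmatizer
        "pos": tok["pos"],                      # assumed to come from a POS tagger
        "metamap_match": tok["metamap_match"],  # assumed MetaMap disease/disorder hit
        "semantic_type": tok["semantic_type"],  # assumed UMLS semantic type tag
    }
    # Context features: the previous and next words.
    if i > 0:
        feats["prev_word"] = str(sentence[i - 1]["word"]).lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        feats["next_word"] = str(sentence[i + 1]["word"]).lower()
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(sentence: List[Token]) -> List[Dict[str, object]]:
    """One feature dictionary per token, in sentence order."""
    return [token_features(sentence, i) for i in range(len(sentence))]


# Hand-written example values, invented for illustration only.
example: List[Token] = [
    {"word": "Patient", "lemma": "patient", "pos": "NN",
     "metamap_match": False, "semantic_type": "none"},
    {"word": "has", "lemma": "have", "pos": "VBZ",
     "metamap_match": False, "semantic_type": "none"},
    {"word": "atrial", "lemma": "atrial", "pos": "JJ",
     "metamap_match": True, "semantic_type": "dsyn"},
    {"word": "fibrillation", "lemma": "fibrillation", "pos": "NN",
     "metamap_match": True, "semantic_type": "dsyn"},
]

features = sentence_features(example)
```

A sequence labeler would be trained on such dictionaries paired with per-token labels (for example, a begin/inside/outside scheme marking "atrial fibrillation" as one disorder mention); the labeling scheme here is likewise an assumption, since the abstract does not specify one.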

Keywords

Clinical Text, Conditional Random Fields, MetaMap, Named Entity Recognition, Natural Language Processing, UMLS

URI

http://digital.library.wisc.edu/1793/88350

Collections

UW Milwaukee Electronic Theses and Dissertations
