Learning for clinical named entity recognition without manual annotations

Ghiasvand, Omid; Kate, Rohit J.

Files

Primary Article.pdf (505.59 KB)

Date

2018-10-30

Authors

Ghiasvand, Omid

Kate, Rohit J.

Type

article

Abstract

Background: Named entity recognition (NER) systems are commonly built with supervised machine learning methods that learn from corpora manually annotated with named entities. However, manually annotating corpora is expensive and laborious.

Materials and methods: This paper presents a novel method for training clinical NER systems that does not require any manual annotations. It requires only a raw text corpus and a resource such as UMLS that can provide a list of named entities along with their semantic types. Using these two resources, annotations are obtained automatically and used to train machine learning methods. The method was evaluated on the NER shared-task datasets of i2b2 2010 and SemEval 2014.

Results: On the SemEval 2014 dataset for recognizing diseases and disorders, the method obtained an F-measure of 0.693 for exact matching and 0.773 allowing overlaps. This is comparable to many past supervised systems that were trained on manual annotations. On the i2b2 2010 dataset for recognizing problems, tests and treatments, the method obtained F-measures of 0.451, 0.338 and 0.204, respectively, for exact matching and 0.721, 0.587 and 0.475, respectively, allowing overlaps. These results are better than those of an existing unsupervised method.

Conclusions: Experiments on standard datasets showed that the new method performed well. The method is general and could be applied to recognize entities of other types in other genres of text without needing manual annotations.
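
The idea in the abstract of obtaining annotations automatically from a raw corpus and a list of typed entity names can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes plain longest-match dictionary lookup over tokens, and the names `TOY_LEXICON` and `auto_annotate` are hypothetical, with a four-entry toy dictionary standing in for a term list that might be derived from a resource like UMLS.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (an assumption for illustration only); a real
# system would derive term/semantic-type pairs from a resource such as UMLS.
TOY_LEXICON: Dict[Tuple[str, ...], str] = {
    ("chest", "pain"): "problem",
    ("pneumonia",): "problem",
    ("chest", "x-ray"): "test",
    ("aspirin",): "treatment",
}


def auto_annotate(tokens: List[str],
                  lexicon: Dict[Tuple[str, ...], str]) -> List[str]:
    """Assign BIO tags to tokens by greedy longest-match lexicon lookup."""
    lowered = [t.lower() for t in tokens]
    max_len = max((len(term) for term in lexicon), default=0)
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = lexicon.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return tags


if __name__ == "__main__":
    sentence = "Patient reports chest pain ; chest x-ray ordered".split()
    for token, tag in zip(sentence, auto_annotate(sentence, TOY_LEXICON)):
        print(token, tag)
```

Token sequences tagged this way could then serve as training data for a sequence labeler in place of manual annotations. How the published method selects, filters and uses its automatically obtained annotations is described in the article itself.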

Keywords

Named entity recognition, Clinical text, Machine learning, Natural language processing, Information extraction

Citation

Ghiasvand, O., & Kate, R. J. (2018). Learning for clinical named entity recognition without manual annotations. Informatics in Medicine Unlocked, 13, 122-127. https://doi.org/10.1016/j.imu.2018.10.011

URI

http://digital.library.wisc.edu/1793/85020

Collections

Health Informatics and Administration Faculty Publications
