Using Distributed Representations to Disambiguate Biomedical and Clinical Concepts

August 19, 2016 · Entered Twilight · 🏛 BioNLP@ACL

"Last commit was 7.0 years ago (≥5 year threshold)"

Evidence collected by the PWNC Scanner

Repo contents: .gitignore, LICENSE, README.md, experiment_1.py, requirements.txt, sample_data, yarn

Authors: Stéphan Tulkens, Simon Šuster, Walter Daelemans
arXiv ID: 1608.05605
Category: cs.CL (Computation & Language)
Citations: 27
Venue: BioNLP@ACL
Repository: https://github.com/clips/yarn (⭐ 14)
Last Checked: 4 months ago

Abstract

In this paper, we report a knowledge-based method for Word Sense Disambiguation in the domains of biomedical and clinical text. We combine word representations created on large corpora with a small number of definitions from the UMLS to create concept representations, which we then compare to representations of the context of ambiguous terms. Using no relational information, we obtain comparable performance to previous approaches on the MSH-WSD dataset, which is a well-known dataset in the biomedical domain. Additionally, our method is fast and easy to set up and extend to other domains. Supplementary materials, including source code, can be found at https://github.com/clips/yarn
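
The approach the abstract outlines (build a representation for each candidate concept from its definitions, build one for the context of the ambiguous term, and pick the closest concept) can be illustrated with a minimal sketch. This is not the code from the repository: the averaging composition, the cosine similarity measure, the function names, the toy two-dimensional vectors, and the concept labels `C_ILLNESS` and `C_TEMPERATURE` are all assumptions made for illustration.

```python
import math


def embed(tokens, vectors):
    """Average the vectors of all tokens that have one; None if none do."""
    found = [vectors[t] for t in tokens if t in vectors]
    if not found:
        return None
    dim = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def disambiguate(context_tokens, concept_definitions, vectors):
    """Return the concept whose definition-based vector is closest to the context.

    concept_definitions maps a concept label to a list of tokenized definitions.
    """
    context_vec = embed(context_tokens, vectors)
    if context_vec is None:
        return None
    best_concept, best_score = None, -math.inf
    for concept, definitions in concept_definitions.items():
        # Pool all definition tokens into one concept vector (assumed composition).
        pooled = [token for definition in definitions for token in definition]
        concept_vec = embed(pooled, vectors)
        if concept_vec is None:
            continue
        score = cosine(context_vec, concept_vec)
        if score > best_score:
            best_concept, best_score = concept, score
    return best_concept


# Toy 2-d word vectors; real ones would be trained on a large corpus.
toy_vectors = {
    "virus": [1.0, 0.0],
    "infection": [0.9, 0.1],
    "cough": [0.8, 0.2],
    "temperature": [0.0, 1.0],
    "weather": [0.1, 0.9],
    "winter": [0.2, 0.8],
}

# Hypothetical concept labels and definitions for the ambiguous term "cold".
toy_definitions = {
    "C_ILLNESS": [["virus", "infection"]],
    "C_TEMPERATURE": [["temperature", "weather"]],
}

clinical_sense = disambiguate(["patient", "cough", "infection"], toy_definitions, toy_vectors)
weather_sense = disambiguate(["winter", "weather"], toy_definitions, toy_vectors)
```

In this toy setup the clinical context resolves to the illness sense and the weather context to the temperature sense. In practice the word vectors would come from a large corpus and the definitions from the UMLS, as the abstract describes; how the paper weights or composes them may differ from the plain average used here.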