ATR4S: Toolkit with State-of-the-art Automatic Terms Recognition Methods in Scala
November 23, 2016 ยท Declared Dead ยท ๐ Language Resources and Evaluation
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
N. Astrakhantsev
arXiv ID
1611.07804
Category
cs.CL: Computation & Language
Citations
52
Venue
Language Resources and Evaluation
Last Checked
4 months ago
Abstract
Automatically recognized terminology is widely used for various domain-specific texts processing tasks, such as machine translation, information retrieval or sentiment analysis. However, there is still no agreement on which methods are best suited for particular settings and, moreover, there is no reliable comparison of already developed methods. We believe that one of the main reasons is the lack of state-of-the-art methods implementations, which are usually non-trivial to recreate. In order to address these issues, we present ATR4S, an open-source software written in Scala that comprises more than 15 methods for automatic terminology recognition (ATR) and implements the whole pipeline from text document preprocessing, to term candidates collection, term candidates scoring, and finally, term candidates ranking. It is highly scalable, modular and configurable tool with support of automatic caching. We also compare 10 state-of-the-art methods on 7 open datasets by average precision and processing time. Experimental comparison reveals that no single method demonstrates best average precision for all datasets and that other available tools for ATR do not contain the best methods.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Computation & Language
๐
๐
Old Age
๐
๐
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
๐
๐
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
๐ฎ
๐ฎ
The Ethereal
Effective Approaches to Attention-based Neural Machine Translation
๐
๐
Old Age
A large annotated corpus for learning natural language inference
๐
๐
Old Age
HellaSwag: Can a Machine Really Finish Your Sentence?
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted