CAW-coref: Conjunction-Aware Word-level Coreference Resolution
October 09, 2023 · Entered Twilight · CRAC
Repo contents: LICENSE.md, README.md, calculate_conll.py, config.toml, convert_to_heads.py, convert_to_jsonlines.py, coref, get_conll_data.sh, get_third_party.sh, predict.py, release_weights.py, requirements.txt, run.py, sample_input.jsonlines
Authors
Karel D'Oosterlinck, Semere Kiros Bitew, Brandon Papineau, Christopher Potts, Thomas Demeester, Chris Develder
arXiv ID
2310.06165
Category
cs.CL: Computation & Language
Cross-listed
cs.AI
Citations
14
Venue
CRAC
Repository
https://github.com/KarelDO/wl-coref
⭐ 9
Last Checked
2 months ago
Abstract
State-of-the-art coreference resolution systems depend on multiple LLM calls per document and are thus prohibitively expensive for many use cases (e.g., information extraction with large corpora). The leading word-level coreference system (WL-coref) attains 96.6% of these SOTA systems' performance while being much more efficient. In this work, we identify a routine yet important failure case of WL-coref: dealing with conjoined mentions such as 'Tom and Mary'. We offer a simple yet effective solution that improves the performance on the OntoNotes test set by 0.9% F1, shrinking the gap between efficient word-level coreference resolution and expensive SOTA approaches by 34.6%. Our Conjunction-Aware Word-level coreference model (CAW-coref) and code are available at https://github.com/KarelDO/wl-coref.
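WL-coref is efficient precisely because it represents every mention by a single head word, but under a standard dependency parse the coordination 'Tom and Mary' shares its head word ('Tom') with the singleton mention 'Tom', so two distinct entities collapse onto one token. The sketch below illustrates that collapse and a conjunction-aware re-heading rule in the spirit of the abstract; the toy parse, the UD-style labels, and both helper functions are illustrative assumptions, not the repository's actual implementation.

```python
# Minimal sketch of the conjoined-mention problem and a conjunction-aware
# fix, based on the abstract. All names and the re-heading rule here are
# illustrative assumptions, not the actual CAW-coref code.

from typing import List, NamedTuple, Tuple

class Token(NamedTuple):
    idx: int   # position in the sentence
    word: str
    head: int  # index of the dependency head (-1 for the root)
    rel: str   # dependency relation to the head

# "Tom and Mary went home." with a Universal-Dependencies-style parse:
# 'Mary' is a conjunct of 'Tom', and 'and' attaches as a coordinator (cc).
SENT = [
    Token(0, "Tom", 3, "nsubj"),
    Token(1, "and", 2, "cc"),
    Token(2, "Mary", 0, "conj"),
    Token(3, "went", -1, "root"),
    Token(4, "home", 3, "advmod"),
]

def span_head(span: Tuple[int, int], sent: List[Token]) -> int:
    """Head word of a mention span: the token whose head lies outside it."""
    start, end = span
    for tok in sent[start:end]:
        if not (start <= tok.head < end):
            return tok.idx
    return start  # fallback for malformed spans

def caw_head(span: Tuple[int, int], sent: List[Token]) -> int:
    """Conjunction-aware variant: if the span is a coordination, represent
    it by the coordinating conjunction instead of the first conjunct."""
    start, end = span
    head = span_head(span, sent)
    for tok in sent[start:end]:
        # A coordinator plus a conjunct attached to the head marks the
        # span as a coordination: re-head it on the coordinator.
        if tok.rel == "cc" and any(
            t.rel == "conj" and t.head == head for t in sent[start:end]
        ):
            return tok.idx
    return head

# Plain word-level heads: 'Tom' and 'Tom and Mary' collapse onto token 0.
print(span_head((0, 1), SENT), span_head((0, 3), SENT))  # -> 0 0
# Conjunction-aware heads keep the two entities distinct.
print(caw_head((0, 1), SENT), caw_head((0, 3), SENT))    # -> 0 1
```

Running the sketch prints 0 0 for the plain heads and 0 1 for the conjunction-aware ones: the conjoined mention now gets its own head word ('and') instead of colliding with the singleton mention 'Tom'.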
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
Similar Papers
In the same crypt · Computation & Language
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
R.I.P. · 👻 Ghosted
Language Models are Few-Shot Learners
R.I.P. · 👻 Ghosted
RoBERTa: A Robustly Optimized BERT Pretraining Approach
R.I.P. · 👻 Ghosted
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
R.I.P. · 👻 Ghosted