Word Alignment in the Era of Deep Learning: A Tutorial

November 30, 2022 · The Cartographer · 🏛 arXiv.org

Authors: Bryan Li
arXiv ID: 2212.00138
Category: cs.CL: Computation & Language
Citations: 5
Venue: arXiv.org

Abstract

The word alignment task, despite its prominence in the era of statistical machine translation (SMT), is niche and under-explored today. In this two-part tutorial, we argue for the continued relevance of word alignment. The first part provides historical background on word alignment as a core component of the traditional SMT pipeline. We zero in on GIZA++, an unsupervised, statistical word aligner with surprising longevity. Jumping forward to the era of neural machine translation (NMT), we show how insights from word alignment inspired the attention mechanism fundamental to present-day NMT. The second part shifts to a survey approach. We cover neural word aligners, showing the slow but steady progress towards surpassing GIZA++ performance. Finally, we cover the present-day applications of word alignment, from cross-lingual annotation projection to improving translation.