EIT: Enhanced Interactive Transformer

December 20, 2022 · Declared Dead · 🏛 Annual Meeting of the Association for Computational Linguistics

📜 CAUSE OF DEATH: Death by README
Repo has only a README

Repo contents: README.md

Authors: Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, Jingbo Zhu
arXiv ID: 2212.10197
Category: cs.CL: Computation & Language
Citations: 3
Venue: Annual Meeting of the Association for Computational Linguistics
Repository: https://github.com/zhengkid/EIT-Enhanced-Interactive-Transformer (⭐ 1)
Last checked: 1 month ago
Abstract
Two principles, the complementary principle and the consensus principle, are widely acknowledged in the multi-view learning literature. However, the current design of multi-head self-attention, an instance of multi-view learning, prioritizes complementarity while ignoring consensus. To address this problem, we propose an enhanced multi-head self-attention (EMHA). First, to satisfy the complementary principle, EMHA removes the one-to-one mapping constraint between queries and keys in multiple subspaces and allows each query to attend to multiple keys. On top of that, we develop a method to fully encourage consensus among heads by introducing two interaction models, namely inner-subspace interaction and cross-subspace interaction. Extensive experiments on a wide range of language tasks (e.g., machine translation, abstractive summarization, grammar correction, and language modeling) show its superiority, with a very modest increase in model size. Our code is available at: https://github.com/zhengkid/EIT-Enhanced-Interactive-Transformer.
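Since the repository never contained code, the sketch below (PyTorch) only illustrates the one mechanism the abstract spells out: dropping the one-to-one query/key mapping across head subspaces so each query can attend to keys from every subspace. The many-to-many scoring and the averaging "consensus" step are our own illustrative placeholders, not the authors' EMHA; the paper's actual inner-subspace and cross-subspace interaction models were never released.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

B, H, T, D = 2, 4, 6, 8          # batch, heads, tokens, per-head dim
q = torch.randn(B, H, T, D)
k = torch.randn(B, H, T, D)
v = torch.randn(B, H, T, D)

# --- Standard multi-head attention ---------------------------------
# Head i's queries score only against head i's keys: the one-to-one
# query/key mapping across subspaces that the abstract criticizes.
scores = torch.einsum("bhqd,bhkd->bhqk", q, k) / D ** 0.5
attn = F.softmax(scores, dim=-1)
out_standard = torch.einsum("bhqk,bhkd->bhqd", attn, v)

# --- Many-to-many variant (one reading of the abstract) ------------
# Drop the one-to-one constraint: every query subspace h scores
# against every key subspace g, giving H x H attention maps.
scores_mm = torch.einsum("bhqd,bgkd->bhgqk", q, k) / D ** 0.5
attn_mm = F.softmax(scores_mm, dim=-1)            # (B, H, G, T, T)

# Hypothetical "consensus" step: average the per-pair maps back to
# one map per query head before mixing values. Placeholder only; it
# stands in for the paper's unreleased interaction models.
attn_consensus = attn_mm.mean(dim=2)              # (B, H, T, T)
out_mm = torch.einsum("bhqk,bhkd->bhqd", attn_consensus, v)

print(out_standard.shape, out_mm.shape)  # both (2, 4, 6, 8)
```

Note that the many-to-many scores cost H times more than standard attention; how the paper kept its claimed "very modest increase in model size" while adding these interactions is exactly the kind of detail the missing code would have answered.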
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

📜 Similar Papers

In the same crypt – Computation & Language

🌅 Old Age

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, ... (+6 more)

cs.CL ๐Ÿ› NeurIPS ๐Ÿ“š 166.0K cites 8 years ago

Died the same way – 📜 Death by README