| 1 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding<br>Jacob Devlin, Ming-Wei Chang, ... (+2 more) | 🌅 Old Age | cs.CL | 110.2K | 7 years ago |
| 2 | "Why Should I Trust You?": Explaining the Predictions of Any Classifier<br>Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin | 👻 Ghosted | cs.LG | 20.3K | 10 years ago |
| 3 | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension<br>Mike Lewis, Yinhan Liu, ... (+6 more) | 👻 Ghosted | cs.CL | 12.3K | 6 years ago |
| 4 | Deep contextualized word representations<br>Matthew E. Peters, Mark Neumann, ... (+5 more) | 👻 Ghosted | cs.CL | 12.0K | 8 years ago |
| 5 | Enriching Word Vectors with Subword Information<br>Piotr Bojanowski, Edouard Grave, ... (+2 more) | 👻 Ghosted | cs.CL | 10.5K | 9 years ago |
| 6 | Neural Machine Translation of Rare Words with Subword Units<br>Rico Sennrich, Barry Haddow, Alexandra Birch | 👻 Ghosted | cs.CL | 8.5K | 10 years ago |
| 7 | Unsupervised Cross-lingual Representation Learning at Scale<br>Alexis Conneau, Kartikay Khandelwal, ... (+8 more) | 👻 Ghosted | cs.CL | 7.9K | 6 years ago |
| 8 | Bag of Tricks for Efficient Text Classification<br>Armand Joulin, Edouard Grave, ... (+2 more) | 👻 Ghosted | cs.CL | 4.9K | 9 years ago |
| 9 | A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference<br>Adina Williams, Nikita Nangia, Samuel R. Bowman | 👻 Ghosted | cs.CL | 4.9K | 9 years ago |
| 10 | Get To The Point: Summarization with Pointer-Generator Networks<br>Abigail See, Peter J. Liu, Christopher D. Manning | 👻 Ghosted | cs.CL | 4.3K | 9 years ago |
| 11 | Neural Architectures for Named Entity Recognition<br>Guillaume Lample, Miguel Ballesteros, ... (+3 more) | 👻 Ghosted | cs.CL | 4.2K | 10 years ago |
| 12 | HellaSwag: Can a Machine Really Finish Your Sentence?<br>Rowan Zellers, Ari Holtzman, ... (+3 more) | 🌅 Old Age | cs.CL | 3.7K | 6 years ago |
| 13 | fairseq: A Fast, Extensible Toolkit for Sequence Modeling<br>Myle Ott, Sergey Edunov, ... (+6 more) | 👻 Ghosted | cs.CL | 3.3K | 7 years ago |
| 14 | Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks<br>Kai Sheng Tai, Richard Socher, Christopher D. Manning | 👻 Ghosted | cs.CL | 3.2K | 11 years ago |
| 15 | Know What You Don't Know: Unanswerable Questions for SQuAD<br>Pranav Rajpurkar, Robin Jia, Percy Liang | 👻 Ghosted | cs.CL | 3.2K | 7 years ago |
| 16 | Energy and Policy Considerations for Deep Learning in NLP<br>Emma Strubell, Ananya Ganesh, Andrew McCallum | 👻 Ghosted | cs.CL | 3.1K | 6 years ago |
| 17 | mT5: A massively multilingual pre-trained text-to-text transformer<br>Linting Xue, Noah Constant, ... (+6 more) | 👻 Ghosted | cs.CL | 3.0K | 5 years ago |
| 18 | Improving Neural Machine Translation Models with Monolingual Data<br>Rico Sennrich, Barry Haddow, Alexandra Birch | 👻 Ghosted | cs.CL | 2.9K | 10 years ago |
| 19 | End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF<br>Xuezhe Ma, Eduard Hovy | 👻 Ghosted | cs.LG | 2.8K | 10 years ago |
| 20 | Self-Attention with Relative Position Representations<br>Peter Shaw, Jakob Uszkoreit, Ashish Vaswani | 👻 Ghosted | cs.CL | 2.7K | 8 years ago |
| 21 | A Diversity-Promoting Objective Function for Neural Conversation Models<br>Jiwei Li, Michel Galley, ... (+3 more) | 👻 Ghosted | cs.CL | 2.6K | 10 years ago |
| 22 | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge<br>Alon Talmor, Jonathan Herzig, ... (+2 more) | 👻 Ghosted | cs.CL | 2.2K | 7 years ago |
| 23 | Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation<br>Melvin Johnson, Mike Schuster, ... (+10 more) | 👻 Ghosted | cs.CL | 2.2K | 9 years ago |
| 24 | Reading Wikipedia to Answer Open-Domain Questions<br>Danqi Chen, Adam Fisch, ... (+2 more) | 👻 Ghosted | cs.CL | 2.2K | 9 years ago |
| 25 | BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions<br>Christopher Clark, Kenton Lee, ... (+4 more) | 👻 Ghosted | cs.CL | 2.1K | 6 years ago |
| 26 | SpanBERT: Improving Pre-training by Representing and Predicting Spans<br>Mandar Joshi, Danqi Chen, ... (+4 more) | 👻 Ghosted | cs.CL | 2.1K | 6 years ago |
| 27 | FEVER: a large-scale dataset for Fact Extraction and VERification<br>James Thorne, Andreas Vlachos, ... (+2 more) | 👻 Ghosted | cs.CL | 2.0K | 8 years ago |
| 28 | Named Entity Recognition with Bidirectional LSTM-CNNs<br>Jason P. C. Chiu, Eric Nichols | 👻 Ghosted | cs.CL | 2.0K | 10 years ago |
| 29 | OpenNMT: Open-source Toolkit for Neural Machine Translation<br>Guillaume Klein, Yoon Kim, ... (+4 more) | 👻 Ghosted | cs.CL | 1.9K | 8 years ago |
| 30 | Hierarchical Neural Story Generation<br>Angela Fan, Mike Lewis, Yann Dauphin | 👻 Ghosted | cs.CL | 1.9K | 7 years ago |
| 31 | Multimodal Transformer for Unaligned Multimodal Language Sequences<br>Yao-Hung Hubert Tsai, Shaojie Bai, ... (+4 more) | 👻 Ghosted | cs.CL | 1.9K | 6 years ago |
| 32 | BERT Rediscovers the Classical NLP Pipeline<br>Ian Tenney, Dipanjan Das, Ellie Pavlick | 👻 Ghosted | cs.CL | 1.7K | 6 years ago |
| 33 | How to Fine-Tune BERT for Text Classification?<br>Chi Sun, Xipeng Qiu, ... (+2 more) | 👻 Ghosted | cs.CL | 1.7K | 6 years ago |
| 34 | DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation<br>Yizhe Zhang, Siqi Sun, ... (+7 more) | 👻 Ghosted | cs.CL | 1.7K | 6 years ago |
| 35 | Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them<br>Mirac Suzgun, Nathan Scales, ... (+9 more) | 💤 Eternal Rest | cs.CL | 1.7K | 3 years ago |
| 36 | Personalizing Dialogue Agents: I have a dog, do you have pets too?<br>Saizheng Zhang, Emily Dinan, ... (+4 more) | 👻 Ghosted | cs.AI | 1.6K | 8 years ago |
| 37 | How multilingual is Multilingual BERT?<br>Telmo Pires, Eva Schlinger, Dan Garrette | 👻 Ghosted | cs.CL | 1.6K | 6 years ago |
| 38 | "Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection<br>William Yang Wang | 👻 Ghosted | cs.CL | 1.6K | 8 years ago |
| 39 | Incorporating Copying Mechanism in Sequence-to-Sequence Learning<br>Jiatao Gu, Zhengdong Lu, ... (+2 more) | 👻 Ghosted | cs.CL | 1.6K | 10 years ago |
| 40 | Attention is not Explanation<br>Sarthak Jain, Byron C. Wallace | 🌅 Old Age | cs.CL | 1.6K | 7 years ago |
| 41 | Neural Network Acceptability Judgments<br>Alex Warstadt, Amanpreet Singh, Samuel R. Bowman | 👻 Ghosted | cs.CL | 1.6K | 7 years ago |
| 42 | Language (Technology) is Power: A Critical Survey of "Bias" in NLP<br>Su Lin Blodgett, Solon Barocas, ... (+2 more) | 👻 Ghosted | cs.CL | 1.5K | 5 years ago |
| 43 | Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering<br>Gautier Izacard, Edouard Grave | 👻 Ghosted | cs.CL | 1.5K | 5 years ago |
| 44 | MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations<br>Soujanya Poria, Devamanyu Hazarika, ... (+4 more) | 👻 Ghosted | cs.CL | 1.4K | 7 years ago |
| 45 | Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned<br>Elena Voita, David Talbot, ... (+3 more) | 👻 Ghosted | cs.CL | 1.4K | 6 years ago |
| 46 | CoQA: A Conversational Question Answering Challenge<br>Siva Reddy, Danqi Chen, Christopher D. Manning | 🌅 Old Age | cs.CL | 1.3K | 7 years ago |
| 47 | Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates<br>Taku Kudo | 👻 Ghosted | cs.CL | 1.3K | 7 years ago |
| 48 | Beyond Accuracy: Behavioral Testing of NLP models with CheckList<br>Marco Tulio Ribeiro, Tongshuang Wu, ... (+2 more) | 👻 Ghosted | cs.CL | 1.3K | 5 years ago |
| 49 | End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures<br>Makoto Miwa, Mohit Bansal | 👻 Ghosted | cs.CL | 1.2K | 10 years ago |
| 50 | Annotation Artifacts in Natural Language Inference Data<br>Suchin Gururangan, Swabha Swayamdipta, ... (+4 more) | 👻 Ghosted | cs.CL | 1.2K | 8 years ago |