| 151 | Entropy-SGD: Biasing Gradient Descent Into Wide Valleys<br>Pratik Chaudhari, Anna Choromanska, ... (+7 more) | 👻 Ghosted | cs.LG | 847 | 9 years ago |
| 152 | Mass-Editing Memory in a Transformer<br>Kevin Meng, Arnab Sen Sharma, ... (+3 more) | 💤 Eternal Rest | cs.CL | 844 | 3 years ago |
| 153 | Do Deep Generative Models Know What They Don't Know?<br>Eric Nalisnick, Akihiro Matsukawa, ... (+3 more) | 👻 Ghosted | stat.ML | 834 | 7 years ago |
| 154 | Multi-task Sequence to Sequence Learning<br>Minh-Thang Luong, Quoc V. Le, ... (+3 more) | 👻 Ghosted | cs.LG | 833 | 10 years ago |
| 155 | Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small<br>Kevin Wang, Alexandre Variengien, ... (+3 more) | 👻 Ghosted | cs.LG | 830 | 3 years ago |
| 156 | PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples<br>Yang Song, Taesup Kim, ... (+3 more) | 👻 Ghosted | cs.LG | 827 | 8 years ago |
| 157 | Learning Robust Rewards with Adversarial Inverse Reinforcement Learning<br>Justin Fu, Katie Luo, Sergey Levine | 👻 Ghosted | cs.LG | 824 | 8 years ago |
| 158 | Editing Models with Task Arithmetic<br>Gabriel Ilharco, Marco Tulio Ribeiro, ... (+5 more) | 👻 Ghosted | cs.LG | 821 | 3 years ago |
| 159 | InCoder: A Generative Model for Code Infilling and Synthesis<br>Daniel Fried, Armen Aghajanyan, ... (+8 more) | 👻 Ghosted | cs.SE | 820 | 4 years ago |
| 160 | Model compression via distillation and quantization<br>Antonio Polino, Razvan Pascanu, Dan Alistarh | 👻 Ghosted | cs.NE | 809 | 8 years ago |
| 161 | Unsupervised Neural Machine Translation<br>Mikel Artetxe, Gorka Labaka, ... (+2 more) | 👻 Ghosted | cs.CL | 806 | 8 years ago |
| 162 | Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality<br>Xingjun Ma, Bo Li, ... (+7 more) | 👻 Ghosted | cs.LG | 804 | 8 years ago |
| 163 | Sample Efficient Actor-Critic with Experience Replay<br>Ziyu Wang, Victor Bapst, ... (+5 more) | 👻 Ghosted | cs.LG | 803 | 9 years ago |
| 164 | The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision<br>Jiayuan Mao, Chuang Gan, ... (+3 more) | 🌅 Old Age | cs.CV | 797 | 6 years ago |
| 165 | Compressive Transformers for Long-Range Sequence Modelling<br>Jack W. Rae, Anna Potapenko, ... (+2 more) | 👻 Ghosted | cs.LG | 796 | 6 years ago |
| 166 | SMASH: One-Shot Model Architecture Search through HyperNetworks<br>Andrew Brock, Theodore Lim, ... (+2 more) | 🌅 Old Age | cs.LG | 795 | 8 years ago |
| 167 | Variational Continual Learning<br>Cuong V. Nguyen, Yingzhen Li, ... (+2 more) | 👻 Ghosted | stat.ML | 788 | 8 years ago |
| 168 | Synthetic and Natural Noise Both Break Neural Machine Translation<br>Yonatan Belinkov, Yonatan Bisk | 👻 Ghosted | cs.CL | 786 | 8 years ago |
| 169 | Learning End-to-End Goal-Oriented Dialog<br>Antoine Bordes, Y-Lan Boureau, Jason Weston | 👻 Ghosted | cs.CL | 785 | 9 years ago |
| 170 | Reasoning about Entailment with Neural Attention<br>Tim Rocktäschel, Edward Grefenstette, ... (+3 more) | 👻 Ghosted | cs.CL | 776 | 10 years ago |
| 171 | Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks<br>Jost Tobias Springenberg | 👻 Ghosted | stat.ML | 771 | 10 years ago |
| 172 | Policy Distillation<br>Andrei A. Rusu, Sergio Gomez Colmenarejo, ... (+7 more) | 👻 Ghosted | cs.LG | 771 | 10 years ago |
| 173 | Delving Deeper into Convolutional Networks for Learning Video Representations<br>Nicolas Ballas, Li Yao, ... (+2 more) | 👻 Ghosted | cs.CV | 764 | 10 years ago |
| 174 | Large-Scale Study of Curiosity-Driven Learning<br>Yuri Burda, Harri Edwards, ... (+4 more) | 👻 Ghosted | cs.LG | 748 | 7 years ago |
| 175 | Large Language Models Cannot Self-Correct Reasoning Yet<br>Jie Huang, Xinyun Chen, ... (+5 more) | 👻 Ghosted | cs.CL | 748 | 2 years ago |
| 176 | Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking<br>Aleksandar Bojchevski, Stephan Günnemann | 👻 Ghosted | stat.ML | 739 | 8 years ago |
| 177 | Emergent Tool Use From Multi-Agent Autocurricula<br>Bowen Baker, Ingmar Kanitscheider, ... (+5 more) | 👻 Ghosted | cs.LG | 733 | 6 years ago |
| 178 | Nesterov Accelerated Gradient and Scale Invariance for Adversarial Attacks<br>Jiadong Lin, Chuanbiao Song, ... (+3 more) | 👻 Ghosted | cs.LG | 733 | 6 years ago |
| 179 | Learning to Propagate Labels: Transductive Propagation Network for Few-shot Learning<br>Yanbin Liu, Juho Lee, ... (+5 more) | 🌅 Old Age | cs.LG | 730 | 7 years ago |
| 180 | Visualizing Deep Neural Network Decisions: Prediction Difference Analysis<br>Luisa M Zintgraf, Taco S Cohen, ... (+2 more) | 👻 Ghosted | cs.CV | 730 | 9 years ago |
| 181 | Learning Representations from EEG with Deep Recurrent-Convolutional Neural Networks<br>Pouya Bashivan, Irina Rish, ... (+2 more) | 👻 Ghosted | cs.LG | 723 | 10 years ago |
| 182 | vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations<br>Alexei Baevski, Steffen Schneider, Michael Auli | 👻 Ghosted | cs.CL | 721 | 6 years ago |
| 183 | Net2Net: Accelerating Learning via Knowledge Transfer<br>Tianqi Chen, Ian Goodfellow, Jonathon Shlens | 👻 Ghosted | cs.LG | 713 | 10 years ago |
| 184 | Variational Lossy Autoencoder<br>Xi Chen, Diederik P. Kingma, ... (+6 more) | 👻 Ghosted | cs.LG | 696 | 9 years ago |
| 185 | Dynamic Coattention Networks For Question Answering<br>Caiming Xiong, Victor Zhong, Richard Socher | 👻 Ghosted | cs.CL | 694 | 9 years ago |
| 186 | Fantastic Generalization Measures and Where to Find Them<br>Yiding Jiang, Behnam Neyshabur, ... (+3 more) | 👻 Ghosted | cs.LG | 687 | 6 years ago |
| 187 | Scalable Private Learning with PATE<br>Nicolas Papernot, Shuang Song, ... (+4 more) | 👻 Ghosted | stat.ML | 683 | 8 years ago |
| 188 | DiffEdit: Diffusion-based semantic image editing with mask guidance<br>Guillaume Couairon, Jakob Verbeek, ... (+2 more) | 👻 Ghosted | cs.CV | 674 | 3 years ago |
| 189 | Reducing Transformer Depth on Demand with Structured Dropout<br>Angela Fan, Edouard Grave, Armand Joulin | 👻 Ghosted | cs.LG | 668 | 6 years ago |
| 190 | The Variational Fair Autoencoder<br>Christos Louizos, Kevin Swersky, ... (+3 more) | 👻 Ghosted | stat.ML | 666 | 10 years ago |
| 191 | Neural Text Generation with Unlikelihood Training<br>Sean Welleck, Ilia Kulikov, ... (+4 more) | 👻 Ghosted | cs.LG | 663 | 6 years ago |
| 192 | Parameter Space Noise for Exploration<br>Matthias Plappert, Rein Houthooft, ... (+7 more) | 👻 Ghosted | cs.LG | 662 | 8 years ago |
| 193 | An Actor-Critic Algorithm for Sequence Prediction<br>Dzmitry Bahdanau, Philemon Brakel, ... (+6 more) | 👻 Ghosted | cs.LG | 662 | 9 years ago |
| 194 | Revisiting Batch Normalization For Practical Domain Adaptation<br>Yanghao Li, Naiyan Wang, ... (+3 more) | 👻 Ghosted | cs.CV | 656 | 10 years ago |
| 195 | Generating Natural Adversarial Examples<br>Zhengli Zhao, Dheeru Dua, Sameer Singh | 👻 Ghosted | cs.LG | 646 | 8 years ago |
| 196 | The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations<br>Felix Hill, Antoine Bordes, ... (+2 more) | 👻 Ghosted | cs.CL | 646 | 10 years ago |
| 197 | Central Moment Discrepancy (CMD) for Domain-Invariant Representation Learning<br>Werner Zellinger, Thomas Grubinger, ... (+3 more) | 👻 Ghosted | stat.ML | 645 | 9 years ago |
| 198 | What learning algorithm is in-context learning? Investigations with linear models<br>Ekin Akyürek, Dale Schuurmans, ... (+3 more) | 💤 Eternal Rest | cs.LG | 644 | 3 years ago |
| 199 | A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks<br>Behnam Neyshabur, Srinadh Bhojanapalli, Nathan Srebro | 👻 Ghosted | cs.LG | 643 | 8 years ago |
| 200 | All you need is a good init<br>Dmytro Mishkin, Jiri Matas | 👻 Ghosted | cs.LG | 642 | 10 years ago |