The dynamic interplay between in-context and in-weight learning in humans and neural networks
February 13, 2024 ยท Declared Dead ยท ๐ Proceedings of the National Academy of Sciences of the United States of America
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Jacob Russin, Ellie Pavlick, Michael J. Frank
arXiv ID
2402.08674
Category
cs.NE: Neural & Evolutionary
Cross-listed
cs.LG,
q-bio.NC
Citations
5
Venue
Proceedings of the National Academy of Sciences of the United States of America
Last Checked
4 months ago
Abstract
Human learning embodies a striking duality: sometimes, we appear capable of following logical, compositional rules and benefit from structured curricula (e.g., in formal education), while other times, we rely on an incremental approach or trial-and-error, learning better from curricula that are randomly interleaved. Influential psychological theories explain this seemingly disparate behavioral evidence by positing two qualitatively different learning systems -- one for rapid, rule-based inferences and another for slow, incremental adaptation. It remains unclear how to reconcile such theories with neural networks, which learn via incremental weight updates and are thus a natural model for the latter type of learning, but are not obviously compatible with the former. However, recent evidence suggests that metalearning neural networks and large language models are capable of "in-context learning" (ICL) -- the ability to flexibly grasp the structure of a new task from a few examples. Here, we show that the dynamic interplay between ICL and default in-weight learning (IWL) naturally captures a broad range of learning phenomena observed in humans, reproducing curriculum effects on category-learning and compositional tasks, and recapitulating a tradeoff between flexibility and retention. Our work shows how emergent ICL can equip neural networks with fundamentally different learning properties that can coexist with their native IWL, thus offering a novel perspective on dual-process theories and human cognitive flexibility.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Neural & Evolutionary
๐ฎ
๐ฎ
The Ethereal
R.I.P.
๐ป
Ghosted
Deep Learning using Rectified Linear Units (ReLU)
R.I.P.
๐ป
Ghosted
Generative Adversarial Text to Image Synthesis
R.I.P.
๐ป
Ghosted
Regularized Evolution for Image Classifier Architecture Search
R.I.P.
๐ป
Ghosted
Temporal Ensembling for Semi-Supervised Learning
๐
๐
Old Age
Learning Structured Sparsity in Deep Neural Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted