From Frege to chatGPT: Compositionality in language, cognition, and deep neural networks
May 24, 2024 ยท Declared Dead ยท ๐ arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Jacob Russin, Sam Whitman McGrath, Danielle J. Williams
arXiv ID
2405.15164
Category
cs.NE: Neural & Evolutionary
Cross-listed
cs.AI,
cs.LG
Citations
8
Venue
arXiv.org
Last Checked
4 months ago
Abstract
Compositionality has long been considered a key explanatory property underlying human intelligence: arbitrary concepts can be composed into novel complex combinations, permitting the acquisition of an open ended, potentially infinite expressive capacity from finite learning experiences. Influential arguments have held that neural networks fail to explain this aspect of behavior, leading many to dismiss them as viable models of human cognition. Over the last decade, however, modern deep neural networks (DNNs), which share the same fundamental design principles as their predecessors, have come to dominate artificial intelligence, exhibiting the most advanced cognitive behaviors ever demonstrated in machines. In particular, large language models (LLMs), DNNs trained to predict the next word on a large corpus of text, have proven capable of sophisticated behaviors such as writing syntactically complex sentences without grammatical errors, producing cogent chains of reasoning, and even writing original computer programs -- all behaviors thought to require compositional processing. In this chapter, we survey recent empirical work from machine learning for a broad audience in philosophy, cognitive science, and neuroscience, situating recent breakthroughs within the broader context of philosophical arguments about compositionality. In particular, our review emphasizes two approaches to endowing neural networks with compositional generalization capabilities: (1) architectural inductive biases, and (2) metalearning, or learning to learn. We also present findings suggesting that LLM pretraining can be understood as a kind of metalearning, and can thereby equip DNNs with compositional generalization abilities in a similar way. We conclude by discussing the implications that these findings may have for the study of compositionality in human cognition and by suggesting avenues for future research.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Neural & Evolutionary
๐ฎ
๐ฎ
The Ethereal
R.I.P.
๐ป
Ghosted
Deep Learning using Rectified Linear Units (ReLU)
R.I.P.
๐ป
Ghosted
Generative Adversarial Text to Image Synthesis
R.I.P.
๐ป
Ghosted
Regularized Evolution for Image Classifier Architecture Search
R.I.P.
๐ป
Ghosted
Temporal Ensembling for Semi-Supervised Learning
๐
๐
Old Age
Learning Structured Sparsity in Deep Neural Networks
Died the same way โ ๐ป Ghosted
R.I.P.
๐ป
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
๐ป
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
๐ป
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
๐ป
Ghosted