Folded Context Condensation in Path Integral Formalism for Infinite Context Transformers

May 07, 2024 · Declared Dead · 🏛 IEEE Access

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Won-Gi Paeng, Daesuk Kwon, Kyungwon Jeong, Honggyo Suh
arXiv ID: 2405.04620
Category: hep-ph
Cross-listed: cs.AI, cs.CL, cs.LG, cs.NE
Citations: 0
Venue: IEEE Access
Last Checked: 3 months ago

Abstract

In this work, we present a generalized formulation of the Transformer algorithm by reinterpreting its core mechanisms within the framework of Path Integral formalism. In this perspective, the attention mechanism is recast as a process that integrates all possible transition paths leading to future token states, with temporal evolution governed by the Feed-Forward Network. By systematically mapping each component of the Transformer to its counterpart in the Path Integral formulation, we obtain a more compact and efficient representation, in which the contextual information of a sequence is condensed into memory-like segments. These segments are recurrently processed across Transformer layers, enabling more effective long-term information retention. We validate the effectiveness of this approach through the Passkey retrieval task and a summarization task, demonstrating that the proposed method preserves historical information while exhibiting memory usage that scales linearly with sequence length. This contrasts with the non-linear memory growth typically observed in standard attention mechanisms. We expect that this quantum-inspired generalization of the Transformer architecture will open new avenues for enhancing both the efficiency and expressiveness of future Transformer models.
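
Since no code was found for this paper, the following is only a minimal sketch of the general idea the abstract describes: a sequence processed segment by segment, with a fixed-size memory carried from one segment to the next. It is not the authors' method. The function name `segment_recurrent_attention`, the parameters `seg_len` and `mem_len`, and the memory update rule (keep the most recent `mem_len` output rows) are all assumptions made for illustration, and the sketch omits learned projections, causal masking, multiple layers, and the Feed-Forward Network.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def segment_recurrent_attention(tokens, seg_len, mem_len):
    """Attend over `tokens` (shape (n, d)) one segment at a time.

    Hypothetical illustration only: queries come from the current segment,
    keys and values from [memory; segment]. Returns the outputs and the
    largest attention score matrix size seen during the pass.
    """
    n, d = tokens.shape
    memory = np.zeros((mem_len, d))  # assumed: memory starts empty (zeros)
    outputs = []
    peak_scores = 0
    for start in range(0, n, seg_len):
        seg = tokens[start:start + seg_len]
        context = np.concatenate([memory, seg], axis=0)
        scores = seg @ context.T / np.sqrt(d)  # (seg_len, mem_len + seg_len)
        peak_scores = max(peak_scores, scores.size)
        out = softmax(scores) @ context
        outputs.append(out)
        # Assumed condensation rule: keep the most recent mem_len rows.
        memory = np.concatenate([memory, out], axis=0)[-mem_len:]
    return np.concatenate(outputs, axis=0), peak_scores


rng = np.random.default_rng(0)
short_out, short_peak = segment_recurrent_attention(rng.normal(size=(64, 8)), 16, 4)
long_out, long_peak = segment_recurrent_attention(rng.normal(size=(128, 8)), 16, 4)
```

In this sketch the largest score matrix holds `seg_len * (mem_len + seg_len)` entries regardless of sequence length, while the stored outputs grow linearly with the number of tokens. Full self-attention over `n` tokens would instead build an `n * n` score matrix.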

📜 Similar Papers

In the same crypt — hep-ph

R.I.P. 👻 Ghosted

How to GAN away Detector Effects

Marco Bellagente, Anja Butter, ... (+3 more)

hep-ph 🏛 SciPost Physics 📚 100 cites 6 years ago

R.I.P. 👻 Ghosted

CaloMan: Fast generation of calorimeter showers with density estimation on learned manifolds

Jesse C. Cresswell, Brendan Leigh Ross, ... (+4 more)

hep-ph 🏛 arXiv 📚 53 cites 3 years ago

R.I.P. 👻 Ghosted

An unfolding method based on conditional Invertible Neural Networks (cINN) using iterative training

Mathias Backes, Anja Butter, ... (+2 more)

hep-ph 🏛 SciPost Physics Core 📚 52 cites 3 years ago

R.I.P. 👻 Ghosted

PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics

Alexander Bogatskiy, Timothy Hoffman, ... (+2 more)

hep-ph 🏛 arXiv 📚 39 cites 3 years ago

R.I.P. 👻 Ghosted

Stacking machine learning classifiers to identify Higgs bosons at the LHC

Alexandre Alves

hep-ph 🏛 arXiv 📚 32 cites 9 years ago

R.I.P. 👻 Ghosted

The Power of Genetic Algorithms: what remains of the pMSSM?

Steven Abel, David G. Cerdeno, Sandra Robles

hep-ph 🏛 arXiv 📚 20 cites 8 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 8 years ago