Prompt Compression for Large Language Models: A Survey
October 16, 2024 · The Cartographer · North American Chapter of the Association for Computational Linguistics
"No code URL or promise found in abstract"
"Title-pattern auto-detect: Prompt Compression for Large Language Models: A Survey"
Evidence collected by the PWNC Scanner
Authors
Zongqian Li, Yinhong Liu, Yixuan Su, Nigel Collier
arXiv ID
2410.12388
Category
cs.CL: Computation & Language
Citations
48
Venue
North American Chapter of the Association for Computational Linguistics
Last Checked
23 hours ago
Abstract
Leveraging large language models (LLMs) for complex natural language tasks typically requires long-form prompts to convey detailed requirements and information, which results in increased memory usage and inference costs. To mitigate these challenges, multiple efficient methods have been proposed, with prompt compression gaining significant research interest. This survey provides an overview of prompt compression techniques, categorized into hard prompt methods and soft prompt methods. First, the technical approaches of these methods are compared, followed by an exploration of various ways to understand their mechanisms, including the perspectives of attention optimization, Parameter-Efficient Fine-Tuning (PEFT), modality integration, and new synthetic language. We also examine the downstream adaptations of various prompt compression techniques. Finally, the limitations of current prompt compression methods are analyzed, and several future directions are outlined, such as optimizing the compression encoder, combining hard and soft prompt methods, and leveraging insights from multimodality.
Similar Papers
In the same crypt: Computation & Language
Old Age
Old Age
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Old Age
XLNet: Generalized Autoregressive Pretraining for Language Understanding
Transcended
Effective Approaches to Attention-based Neural Machine Translation
Old Age
A large annotated corpus for learning natural language inference
Old Age