Mutual Enhancement of Large and Small Language Models with Cross-Silo Knowledge Transfer

December 10, 2023 Β· Declared Dead Β· πŸ› arXiv.org

πŸ‘» CAUSE OF DEATH: Ghosted
No code link whatsoever

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors Yongheng Deng, Ziqing Qiao, Ju Ren, Yang Liu, Yaoxue Zhang arXiv ID 2312.05842 Category cs.AI: Artificial Intelligence Cross-listed cs.CL Citations 16 Venue arXiv.org Last Checked 4 months ago
Abstract
While large language models (LLMs) are empowered with broad knowledge, their task-specific performance is often suboptimal. It necessitates fine-tuning LLMs with task-specific data, but such data may be inaccessible due to privacy concerns. In this paper, we propose a novel approach to enhance LLMs with smaller language models (SLMs) that are trained on clients using their private task-specific data. To enable mutual enhancement between LLMs and SLMs, we propose CrossLM, where the SLMs promote the LLM to generate task-specific high-quality data, and both the LLM and SLMs are enhanced with the generated data. We evaluate CrossLM using publicly accessible language models across a range of benchmark tasks. The results demonstrate that CrossLM significantly enhances the task-specific performance of SLMs on clients and the LLM on the cloud server simultaneously while preserving the LLM's generalization capability.
Community shame:
Not yet rated
Community Contributions

Found the code? Know the venue? Think something is wrong? Let us know!

πŸ“œ Similar Papers

In the same crypt β€” Artificial Intelligence

Died the same way β€” πŸ‘» Ghosted