Towards Confidential and Efficient LLM Inference with Dual Privacy Protection

September 11, 2025 · Declared Dead · 🏛 International Conference on Database Systems for Advanced Applications

"No code URL or promise found in abstract"

Evidence collected by the PWNC Scanner

Authors: Honglan Yu, Yibin Wang, Feifei Dai, Dong Liu, Haihui Fan, Xiaoyan Gu
arXiv ID: 2509.09091
Category: cs.CR (Cryptography & Security)
Cross-listed: cs.AI
Citations: 0
Venue: International Conference on Database Systems for Advanced Applications
Last Checked: 4 months ago

Abstract

CPU-based trusted execution environments (TEEs) and differential privacy (DP) are widely used for private inference. Because inference latency inside TEEs is high, researchers use partition-based approaches that offload linear model components to GPUs. However, the dense nonlinear layers of large language models (LLMs) cause significant communication overhead between TEEs and GPUs. DP-based approaches add random noise to protect data privacy, but this degrades LLM performance and semantic understanding. To overcome these drawbacks, this paper proposes CMIF, a Confidential and efficient Model Inference Framework. CMIF confidentially deploys the embedding layer in the client-side TEE and the subsequent layers on GPU servers. It also optimizes the Report-Noisy-Max mechanism to protect sensitive inputs with only a slight decrease in model performance. Extensive experiments on Llama-series models demonstrate that CMIF reduces additional inference overhead in TEEs while preserving user data privacy.
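For readers unfamiliar with the Report-Noisy-Max mechanism named in the abstract, the following is a minimal sketch of the textbook version only. It is not CMIF's optimized variant, which the abstract does not describe. The function names, the `sensitivity` parameter, and the conservative noise scale of 2 * sensitivity / epsilon are assumptions made for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def report_noisy_max(scores, epsilon, sensitivity=1.0, rng=None):
    """Return the index of the largest score after adding Laplace noise.

    Textbook Report-Noisy-Max (illustrative, not the paper's variant):
    only the winning index is released, never the noisy scores.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if not scores:
        raise ValueError("scores must be non-empty")
    rng = rng or random.Random()
    # Assumed calibration: a conservative scale for general queries.
    scale = 2.0 * sensitivity / epsilon
    noisy = [s + laplace(scale, rng) for s in scores]
    return max(range(len(noisy)), key=noisy.__getitem__)
```

A smaller `epsilon` means more noise and stronger privacy, so the returned index matches the true maximum less often; a very large `epsilon` makes the noise negligible and the result approaches a plain argmax.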



📜 Similar Papers

In the same crypt — Cryptography & Security

R.I.P. 👻 Ghosted

Membership Inference Attacks against Machine Learning Models

Reza Shokri, Marco Stronati, ... (+2 more)

cs.CR 🏛 IEEE S&P 📚 4.9K cites 9 years ago

R.I.P. 👻 Ghosted

The Limitations of Deep Learning in Adversarial Settings

Nicolas Papernot, Patrick McDaniel, ... (+4 more)

cs.CR 🏛 IEEE S&P 📚 4.2K cites 10 years ago

R.I.P. 👻 Ghosted

Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks

Nicolas Papernot, Patrick McDaniel, ... (+3 more)

cs.CR 🏛 IEEE S&P 📚 3.2K cites 10 years ago

R.I.P. 👻 Ghosted

Spectre Attacks: Exploiting Speculative Execution

Paul Kocher, Daniel Genkin, ... (+8 more)

cs.CR 🏛 IEEE S&P 📚 2.4K cites 8 years ago

R.I.P. 👻 Ghosted

How To Backdoor Federated Learning

Eugene Bagdasaryan, Andreas Veit, ... (+3 more)

cs.CR 🏛 AISTATS 📚 2.4K cites 8 years ago

R.I.P. 👻 Ghosted

Evasion Attacks against Machine Learning at Test Time

Battista Biggio, Igino Corona, ... (+6 more)

cs.CR 🏛 ECML/PKDD 📚 2.3K cites 8 years ago

Died the same way — 👻 Ghosted

R.I.P. 👻 Ghosted

Federated Learning: Strategies for Improving Communication Efficiency

Jakub Konečný, H. Brendan McMahan, ... (+4 more)

cs.LG 🏛 arXiv 📚 5.2K cites 9 years ago

R.I.P. 👻 Ghosted

In-Datacenter Performance Analysis of a Tensor Processing Unit

Norman P. Jouppi, Cliff Young, ... (+73 more)

cs.AR 🏛 ISCA 📚 5.1K cites 9 years ago

R.I.P. 👻 Ghosted

Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning

Hoo-Chang Shin, Holger R. Roth, ... (+7 more)

cs.CV 🏛 IEEE TMI 📚 4.9K cites 10 years ago

R.I.P. 👻 Ghosted

Explanation in Artificial Intelligence: Insights from the Social Sciences

Tim Miller

cs.AI 🏛 AI 📚 4.9K cites 9 years ago