ProFLingo: A Fingerprinting-based Intellectual Property Protection Scheme for Large Language Models
May 03, 2024 ยท Entered Twilight ยท ๐ IEEE Conference on Communications and Network Security
Repo contents: .gitignore, LICENSE, README.md, attack.py, auditing-llms, copyright_test.py, finetuning.py, generated_llama.txt, generated_llama_arca.txt, generated_llama_gcg.txt, generated_mistral.txt, llm-attacks, proflingo.py, questions.csv, requirements.txt
Authors
Heng Jin, Chaoyu Zhang, Shanghao Shi, Wenjing Lou, Y. Thomas Hou
arXiv ID
2405.02466
Category
cs.CR: Cryptography & Security
Cross-listed
cs.LG
Citations
13
Venue
IEEE Conference on Communications and Network Security
Repository
https://github.com/hengvt/ProFLingo
โญ 6
Last Checked
2 months ago
Abstract
Large language models (LLMs) have attracted significant attention in recent years. Due to their "Large" nature, training LLMs from scratch consumes immense computational resources. Since several major players in the artificial intelligence (AI) field have open-sourced their original LLMs, an increasing number of individuals and smaller companies are able to build derivative LLMs based on these open-sourced models at much lower costs. However, this practice opens up possibilities for unauthorized use or reproduction that may not comply with licensing agreements, and fine-tuning can change the model's behavior, thus complicating the determination of model ownership. Current intellectual property (IP) protection schemes for LLMs are either designed for white-box settings or require additional modifications to the original model, which restricts their use in real-world settings. In this paper, we propose ProFLingo, a black-box fingerprinting-based IP protection scheme for LLMs. ProFLingo generates queries that elicit specific responses from an original model, thereby establishing unique fingerprints. Our scheme assesses the effectiveness of these queries on a suspect model to determine whether it has been derived from the original model. ProFLingo offers a non-invasive approach, which neither requires knowledge of the suspect model nor modifications to the base model or its training process. To the best of our knowledge, our method represents the first black-box fingerprinting technique for IP protection for LLMs. Our source code and generated queries are available at: https://github.com/hengvt/ProFLingo.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Cryptography & Security
R.I.P.
๐ป
Ghosted
R.I.P.
๐ป
Ghosted
Membership Inference Attacks against Machine Learning Models
R.I.P.
๐ป
Ghosted
The Limitations of Deep Learning in Adversarial Settings
R.I.P.
๐ป
Ghosted
Practical Black-Box Attacks against Machine Learning
R.I.P.
๐ป
Ghosted
Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks
R.I.P.
๐ป
Ghosted