Content-Aware Differential Privacy with Conditional Invertible Neural Networks

July 29, 2022 · Entered Twilight · 🏛 DeCaF/FAIR@MICCAI

Repo contents: .gitignore, LICENSE, README.md, categorical.ipynb, classifier.py, configs, create_priv_oct.py, create_priv_xray.py, data.py, distribution.ipynb, eval_classifier.ipynb, eval_cond.ipynb, faces.ipynb, inspect_datasets.ipynb, latent_dp.py, models.py, requirements.txt, train.py, utils.py, viz.py

Authors: Malte Tölle, Ullrich Köthe, Florian André, Benjamin Meder, Sandy Engelhardt
arXiv ID: 2207.14625
Category: cs.CR (Cryptography & Security)
Cross-listed: cs.CV, cs.LG
Citations: 5
Venue: DeCaF/FAIR@MICCAI
Repository: https://github.com/Cardio-AI/CADP
Last Checked: 2 months ago

Abstract

Differential privacy (DP) has arisen as the gold standard for protecting an individual's privacy in datasets by adding calibrated noise to each data sample. While its application to categorical data is straightforward, its usability in the context of images has been limited. In contrast to categorical data, the meaning of an image is inherent in the spatial correlation of neighboring pixels, which makes the simple application of noise infeasible. Invertible Neural Networks (INNs) have shown excellent generative performance while still providing the ability to quantify the exact likelihood. Their principle is based on transforming a complicated distribution into a simple one, e.g., an image into a spherical Gaussian. We hypothesize that adding noise to the latent space of an INN can enable differentially private image modification. Manipulating the latent space yields a modified image while preserving important details. Further, by conditioning the INN on metadata provided with the dataset, we aim to leave dimensions important for downstream tasks such as classification untouched while altering other parts that potentially contain identifying information. We term our method content-aware differential privacy (CADP). We conduct experiments on publicly available benchmarking datasets as well as dedicated medical ones. In addition, we show the generalizability of our method to categorical data. The source code is publicly available at https://github.com/Cardio-AI/CADP.
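
The core idea in the abstract (encode with a conditional invertible map, perturb the latent code, decode) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: `ToyConditionalCoupling`, `privatize`, and `noise_scale` are hypothetical names, the fixed random weights stand in for a trained network, and the choice of Laplace noise is an assumption, since the abstract does not name a noise distribution. The sketch makes no formal privacy claim; a real guarantee would require calibrating the noise to a bounded sensitivity.

```python
import numpy as np


class ToyConditionalCoupling:
    """Hypothetical affine coupling layer conditioned on a one-hot label.

    Illustrative only: fixed random weights replace a trained subnetwork.
    """

    def __init__(self, dim, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        self.half = dim // 2
        in_dim = self.half + n_classes
        out_dim = dim - self.half
        self.w_s = rng.normal(scale=0.1, size=(in_dim, out_dim))
        self.w_t = rng.normal(scale=0.1, size=(in_dim, out_dim))

    def _scale_shift(self, x1, cond):
        # Scale and shift depend on the untouched half and the condition.
        h = np.concatenate([x1, cond], axis=-1)
        return np.tanh(h @ self.w_s), h @ self.w_t

    def forward(self, x, cond):
        # Data -> latent: the first half passes through unchanged.
        x1, x2 = x[..., :self.half], x[..., self.half:]
        s, t = self._scale_shift(x1, cond)
        return np.concatenate([x1, x2 * np.exp(s) + t], axis=-1)

    def inverse(self, z, cond):
        # Latent -> data: exact inverse of forward for the same condition.
        z1, z2 = z[..., :self.half], z[..., self.half:]
        s, t = self._scale_shift(z1, cond)
        return np.concatenate([z1, (z2 - t) * np.exp(-s)], axis=-1)


def privatize(x, cond, layer, noise_scale, rng):
    """Encode, add Laplace noise in latent space, decode under the same condition."""
    z = layer.forward(x, cond)
    z_noisy = z + rng.laplace(scale=noise_scale, size=z.shape)
    return layer.inverse(z_noisy, cond)


# Usage: four 6-dimensional samples with binary class labels as the condition.
rng = np.random.default_rng(1)
layer = ToyConditionalCoupling(dim=6, n_classes=2)
x = rng.normal(size=(4, 6))
cond = np.eye(2)[[0, 1, 0, 1]]
x_private = privatize(x, cond, layer, noise_scale=0.5, rng=rng)
```

Because the label enters only as the condition and is passed unchanged to both `forward` and `inverse`, the noise perturbs the latent code while the class information supplied to the decoder stays fixed, which mirrors the content-aware intent described above.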