One Step to Efficient Synthetic Data
June 03, 2020 · Declared Dead · Statistica Sinica
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Jordan Awan, Zhanrui Cai
arXiv ID
2006.02397
Category
math.ST
Cross-listed
cs.CR, stat.CO
Citations
7
Venue
Statistica Sinica
Last Checked
2 months ago
Abstract
A common approach to synthetic data is to sample from a fitted model. We show that under general assumptions, this approach results in a sample with inefficient estimators and whose joint distribution is inconsistent with the true distribution. Motivated by this, we propose a general method of producing synthetic data, which is widely applicable for parametric models, has asymptotically efficient summary statistics, and is both easily implemented and highly computationally efficient. Our approach allows for the construction of both partially synthetic datasets, which preserve certain summary statistics, as well as fully synthetic data which satisfy the strong guarantee of differential privacy (DP), both with the same asymptotic guarantees. We also provide theoretical and empirical evidence that the distribution from our procedure converges to the true distribution. Besides our focus on synthetic data, our procedure can also be used to perform approximate hypothesis tests in the presence of intractable likelihood functions.
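The inefficiency claim in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' procedure: it assumes a normal location model with known unit variance, and the function name `simulate` and all parameter values are hypothetical. It compares the sample mean of the original data, the sample mean of a naive synthetic sample drawn from the fitted model, and the sample mean of a synthetic sample re-centered so that its mean equals the original estimate. The re-centering trick only works for a location family; it stands in for the general idea of synthetic data that preserve a summary statistic.

```python
import random
import statistics


def simulate(n=50, reps=2000, theta=1.0, seed=0):
    """Compare estimator variance across original, naive synthetic, and
    mean-matched synthetic samples under an assumed N(theta, 1) model."""
    rng = random.Random(seed)
    orig, naive, matched = [], [], []
    for _ in range(reps):
        # Original data and its estimate (the sample mean).
        x = [rng.gauss(theta, 1.0) for _ in range(n)]
        theta_hat = statistics.fmean(x)

        # Naive synthetic data: sample from the fitted model N(theta_hat, 1).
        y = [rng.gauss(theta_hat, 1.0) for _ in range(n)]
        y_bar = statistics.fmean(y)

        # Mean-matched synthetic data: shift y so its mean equals theta_hat.
        z = [yi - y_bar + theta_hat for yi in y]

        orig.append(theta_hat)
        naive.append(y_bar)
        matched.append(statistics.fmean(z))

    return (
        statistics.pvariance(orig),
        statistics.pvariance(naive),
        statistics.pvariance(matched),
    )


var_orig, var_naive, var_matched = simulate()
ratio_naive = var_naive / var_orig      # theory for this toy model: about 2
ratio_matched = var_matched / var_orig  # theory for this toy model: 1
print(f"naive / original variance:   {ratio_naive:.2f}")
print(f"matched / original variance: {ratio_matched:.2f}")
```

In this toy model the naive synthetic mean carries the sampling noise twice (once from the original data, once from the resampling), so its variance is roughly double that of the original estimator, while the mean-matched sample reproduces the original estimate exactly.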
Similar Papers
In the same crypt: math.ST
An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists (R.I.P., Ghosted)
Minimax Optimal Procedures for Locally Private Estimation (R.I.P., Ghosted)
Optimal Best Arm Identification with Fixed Confidence (R.I.P., Ghosted)
Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees (R.I.P., Ghosted)
User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient (R.I.P., Ghosted)
Died the same way: Ghosted
Language Models are Few-Shot Learners (R.I.P., Ghosted)
PyTorch: An Imperative Style, High-Performance Deep Learning Library (R.I.P., Ghosted)
XGBoost: A Scalable Tree Boosting System (R.I.P., Ghosted)