Extrapolating the profile of a finite population
May 21, 2020 Β· Declared Dead Β· π Annual Conference Computational Learning Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Soham Jana, Yury Polyanskiy, Yihong Wu
arXiv ID
2005.10561
Category
math.ST
Cross-listed
cs.IT,
stat.ML
Citations
6
Venue
Annual Conference Computational Learning Theory
Last Checked
2 months ago
Abstract
We study a prototypical problem in empirical Bayes. Namely, consider a population consisting of $k$ individuals each belonging to one of $k$ types (some types can be empty). Without any structural restrictions, it is impossible to learn the composition of the full population having observed only a small (random) subsample of size $m = o(k)$. Nevertheless, we show that in the sublinear regime of $m =Ο(k/\log k)$, it is possible to consistently estimate in total variation the \emph{profile} of the population, defined as the empirical distribution of the sizes of each type, which determines many symmetric properties of the population. We also prove that in the linear regime of $m=c k$ for any constant $c$ the optimal rate is $Ξ(1/\log k)$. Our estimator is based on Wolfowitz's minimum distance method, which entails solving a linear program (LP) of size $k$. We show that there is a single infinite-dimensional LP whose value simultaneously characterizes the risk of the minimum distance estimator and certifies its minimax optimality. The sharp convergence rate is obtained by evaluating this LP using complex-analytic techniques.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β math.ST
R.I.P.
π»
Ghosted
R.I.P.
π»
Ghosted
An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists
R.I.P.
π»
Ghosted
Minimax Optimal Procedures for Locally Private Estimation
R.I.P.
π»
Ghosted
Optimal Best Arm Identification with Fixed Confidence
R.I.P.
π»
Ghosted
Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees
R.I.P.
π»
Ghosted
User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Language Models are Few-Shot Learners
R.I.P.
π»
Ghosted
PyTorch: An Imperative Style, High-Performance Deep Learning Library
R.I.P.
π»
Ghosted
XGBoost: A Scalable Tree Boosting System
R.I.P.
π»
Ghosted