๐ฎ
๐ฎ
The Ethereal
LOTTERY: Learning from Reference-Only Samples in Two-Sample Testing under Size Asymmetry
June 07, 2026 ยท Grace Period ยท ๐ ICML 2026
Authors
Xunye Tian, Zhijian Zhou, Liuhua Peng, Feng Liu
arXiv ID
2606.08460
Category
stat.ML: Machine Learning (Stat)
Cross-listed
cs.LG
Citations
0
Venue
ICML 2026
Abstract
Data-adaptive two-sample testing assesses if two samples come from the same distribution, using a discrepancy learned from the data (e.g., via kernel-based feature representations). Such methods typically rely on data splitting to decouple learning from testing and control type I error. However, this paradigm is ill-suited to few-shot settings with severe sample-size imbalance: abundant reference samples are available, while only a handful of query samples arrive. In this paper, we show how this imbalance can be leveraged constructively. Using abundant reference data, we learn reference-dependent representations that summarize salient structure of the reference distribution and provide informative signals for detecting departures. We incorporate a collection of representation families that capture both global and local structure, and adaptively weight them using only reference samples via an uncertainty-guided principle. Theoretically, we establish permutation-based type I error control and show consistency of the aggregated test: as the sample sizes grow, the test power converges to one whenever the representation set contains at least one consistent representation. Empirically, our aggregation achieves strong performance across a range of benchmarks while retaining type I error control.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
๐ Similar Papers
In the same crypt โ Machine Learning (Stat)
๐ฎ
๐ฎ
The Ethereal
Layer Normalization
๐ฎ
๐ฎ
The Ethereal
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles
R.I.P.
๐ป
Ghosted
Variational Inference with Normalizing Flows
๐
๐
The Cartographer
Towards A Rigorous Science of Interpretable Machine Learning
R.I.P.
๐ป
Ghosted