Tail-Aware Information-Theoretic Generalization for RLHF and SGLD

April 12, 2026 · Grace Period

โณ Grace Period
This paper is less than 90 days old. We give authors time to release their code before passing judgment.
Authors: Huiming Zhang, Binghan Li, Wan Tian, Qiang Sun
arXiv ID: 2604.10727
Category: stat.ML: Machine Learning (Stat)
Cross-listed: cs.AI, cs.LG, math.PR, math.ST
Citations: 0
Abstract
Classical information-theoretic generalization bounds typically control the generalization gap through KL-based mutual information and therefore rely on boundedness or sub-Gaussian tails via the moment generating function (MGF). In many modern pipelines, such as robust learning, RLHF, and stochastic optimization, losses and rewards can be heavy-tailed, and MGFs may not exist, rendering KL-based tools ineffective. We develop a tail-dependent information-theoretic framework for sub-Weibull data, where the tail parameter $\theta$ controls the tail heaviness: $\theta=2$ corresponds to sub-Gaussian, $\theta=1$ to sub-exponential, and $0<\theta<1$ to genuinely heavy tails. Our key technical ingredient is a decorrelation lemma that bounds change-of-measure expectations using a shifted-log $f_\theta$-divergence, which admits explicit comparisons to Rényi divergence without MGF arguments. On the empirical-process side, we establish sharp maximal inequalities and a Dudley-type chaining bound for sub-Weibull processes with tail index $\theta$, with complexity scaling as $\log^{1/\theta}$ and entropy$^{1/\theta}$. These tools yield expected and high-probability PAC-Bayes generalization bounds, as well as an information-theoretic chaining inequality based on multiscale Rényi mutual information. We illustrate the consequences in Rényi-regularized RLHF under heavy-tailed rewards and in stochastic gradient Langevin dynamics with heavy-tailed gradient noise.
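To make the role of the tail parameter concrete, here is a minimal sketch, not taken from the paper. It assumes the common textbook tail condition $P(|X| \ge t) \le 2\exp(-(t/K)^{\theta})$ for a sub-Weibull variable with scale $K$; the paper's own definition may differ, for example by using an Orlicz norm. The function names and the scale parameter are hypothetical. A plain union bound over $n$ such variables already shows a $\log^{1/\theta}$ scaling for the maximum, which is the kind of dependence the abstract describes.

```python
import math


def subweibull_tail_bound(t, theta, scale=1.0):
    """Assumed tail bound 2*exp(-(t/scale)**theta) on P(|X| >= t), capped at 1."""
    if t < 0 or theta <= 0 or scale <= 0:
        raise ValueError("need t >= 0, theta > 0, scale > 0")
    return min(1.0, 2.0 * math.exp(-((t / scale) ** theta)))


def max_quantile_bound(n, delta, theta, scale=1.0):
    """Level t such that the union bound gives P(max_i |X_i| >= t) <= delta.

    Solving n * 2*exp(-(t/scale)**theta) = delta for t yields
    t = scale * log(2n/delta)**(1/theta).
    """
    if n < 1 or not (0 < delta < 1) or theta <= 0 or scale <= 0:
        raise ValueError("need n >= 1, 0 < delta < 1, theta > 0, scale > 0")
    return scale * math.log(2.0 * n / delta) ** (1.0 / theta)


if __name__ == "__main__":
    # Compare sub-Gaussian (2), sub-exponential (1), and heavy-tailed (0.5) cases.
    for theta in (2.0, 1.0, 0.5):
        t = max_quantile_bound(n=1000, delta=0.05, theta=theta)
        print(f"theta={theta}: max of 1000 variables below {t:.2f} w.p. >= 0.95")
```

Under this assumed tail condition, the threshold grows as $\theta$ shrinks, so heavier tails pay a larger price for the same number of variables and confidence level.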
Community shame:
Not yet rated

📜 Similar Papers

In the same crypt: Machine Learning (Stat)

🔮 The Ethereal

Layer Normalization

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

stat.ML · 🏛 arXiv · 📚 12.0K cites · 9 years ago