Avoiding Overfitting in Variable-Order Markov Models: a Cross-Validation Approach
January 24, 2025 · Declared Dead · arXiv.org
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Valeria Secchini, Javier Garcia-Bernardo, Petr Janský
arXiv ID
2501.14476
Category
physics.soc-ph
Cross-listed
cs.SI, econ.GN
Citations
0
Venue
arXiv.org
Last Checked
4 months ago
Abstract
Higher-order Markov chain models are widely used to represent agent transitions in dynamic systems, such as passengers in transport networks. They capture transitions in complex systems by considering not only the current state but also the path of previously visited states. For example, the likelihood of train passengers traveling from Paris (current state) to Rome could increase significantly if their journey originated in Italy (prior state). Although this approach provides a more faithful representation of the system than first-order models, we find that commonly used methods, relying on Kullback-Leibler divergence, frequently overfit the data, mistaking fluctuations for higher-order dependencies and undermining forecasts and resource allocation. Here, we introduce DIVOP (Detection of Informative Variable-Order Paths), an algorithm that employs cross-validation to robustly distinguish meaningful higher-order dependencies from noise. In both synthetic and real-world datasets, DIVOP outperforms two state-of-the-art algorithms by achieving higher precision, recall, and sparser representations of the underlying dynamics. When applied to global corporate ownership data, DIVOP reveals that tax havens appear in 82% of all significant higher-order dependencies, underscoring their outsized influence in corporate networks. By mitigating overfitting, DIVOP enables more reliable multi-step predictions and decision-making, paving the way toward deeper insights into the hidden structures that drive modern interconnected systems.
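The general idea in the abstract, using held-out data to decide whether a longer history is informative, can be illustrated with a small sketch. This is not DIVOP and is not taken from the paper: the function names, the additive smoothing constant `alpha`, the modulo-based fold split, and the toy paths are all assumptions made for illustration. It only compares a first-order and a second-order model by their cross-validated log-likelihood on the same transitions.

```python
import math
from collections import Counter, defaultdict


def transition_counts(paths, order):
    """Count next-state occurrences for every context of length `order`."""
    counts = defaultdict(Counter)
    for path in paths:
        for i in range(order, len(path)):
            counts[tuple(path[i - order:i])][path[i]] += 1
    return counts


def held_out_log_likelihood(train, test, order, n_states, start, alpha=1.0):
    """Log-likelihood of `test` transitions (from position `start` on) under a
    fixed-order model fit on `train`. Additive smoothing (alpha, an assumed
    choice) keeps unseen transitions from having zero probability."""
    counts = transition_counts(train, order)
    total = 0.0
    for path in test:
        for i in range(start, len(path)):
            seen = counts.get(tuple(path[i - order:i]), Counter())
            prob = (seen[path[i]] + alpha) / (sum(seen.values()) + alpha * n_states)
            total += math.log(prob)
    return total


def cross_validated_gain(paths, k=5):
    """Mean held-out log-likelihood gain of order 2 over order 1 across k folds.
    Both models are scored on the same transitions (positions >= 2)."""
    n_states = len({state for path in paths for state in path})
    gains = []
    for fold in range(k):
        test = [p for j, p in enumerate(paths) if j % k == fold]
        train = [p for j, p in enumerate(paths) if j % k != fold]
        ll1 = held_out_log_likelihood(train, test, 1, n_states, start=2)
        ll2 = held_out_log_likelihood(train, test, 2, n_states, start=2)
        gains.append(ll2 - ll1)
    return sum(gains) / k


# Toy data (hypothetical): in `dependent`, the state after C is determined by
# the state before C; in `memoryless`, it is not.
dependent = [["A", "C", "X"], ["B", "C", "Y"]] * 20
memoryless = [["A", "C", "X"], ["A", "C", "Y"],
              ["B", "C", "X"], ["B", "C", "Y"]] * 10

print("gain with a real second-order dependency:", cross_validated_gain(dependent))
print("gain without one:", cross_validated_gain(memoryless))
```

On this toy data the gain is positive when the second-order dependency is real and negative when it is absent, because the extra contexts then only dilute the counts. A variable-order method would make this kind of decision per path rather than for the whole model.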
Similar Papers
In the same crypt: physics.soc-ph
The Cartographer
R.I.P.
Ghosted
Networks beyond pairwise interactions: structure and dynamics
R.I.P.
Ghosted
Statistical physics of human cooperation
R.I.P.
Ghosted
Vital nodes identification in complex networks
R.I.P.
Ghosted
Influence maximization in complex networks through optimal percolation
R.I.P.
Ghosted
Scale-free networks are rare
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted