Join Sampling under Acyclic Degree Constraints and (Cyclic) Subgraph Sampling
December 20, 2023 · Declared Dead · International Conference on Database Theory
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Ru Wang, Yufei Tao
arXiv ID
2312.12797
Category
cs.DB: Databases
Cross-listed
cs.DS
Citations
3
Venue
International Conference on Database Theory
Last Checked
4 months ago
Abstract
Given a join with an acyclic set of degree constraints, we show how to draw a uniformly random sample from the join result in $O(\mathit{polymat}/ \max \{1, \mathrm{OUT} \})$ expected time after a preprocessing of $O(\mathrm{IN})$ expected time, where $\mathrm{IN}$, $\mathrm{OUT}$, and $\mathit{polymat}$ are the join's input size, output size, and polymatroid bound, respectively. This compares favorably with the state of the art (Deng et al. and Kim et al., both in PODS'23), which states that a uniformly random sample can be drawn in $\tilde{O}(\mathrm{AGM} / \max \{1, \mathrm{OUT}\})$ expected time after a preprocessing phase of $\tilde{O}(\mathrm{IN})$ expected time, where $\mathrm{AGM}$ is the join's AGM bound. We then utilize our techniques to tackle directed subgraph sampling. Let $G = (V, E)$ be a directed data graph where each vertex has an out-degree at most $\lambda$, and let $P$ be a directed pattern graph with $O(1)$ vertices. The objective is to uniformly sample an occurrence of $P$ in $G$. The problem can be modeled as join sampling with input size $\mathrm{IN} = \Theta(|E|)$ but, whenever $P$ contains cycles, the converted join has cyclic degree constraints. We show that it is always possible to throw away certain degree constraints such that (i) the remaining constraints are acyclic and (ii) the new join has asymptotically the same polymatroid bound $\mathit{polymat}$ as the old one. Combining this finding with our new join sampling solution yields an algorithm to sample from the original (cyclic) join (thereby yielding a uniformly random occurrence of $P$) in $O(\mathit{polymat}/ \max \{1, \mathrm{OUT}\})$ expected time after $O(|E|)$ expected-time preprocessing. We also prove similar results for undirected subgraph sampling and demonstrate how our techniques can be significantly simplified in that scenario.
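To make the "bound divided by output size" shape of these running times concrete, here is a minimal sketch of classic rejection sampling on a two-relation join R(A, B) ⋈ S(B, C) under a single degree constraint: every B value matches at most `max_deg` tuples of S. This is not the paper's algorithm. It is only an illustration of the general idea, and the names `build_index`, `sample_join`, `max_deg`, and `max_trials` are hypothetical. It assumes R is a duplicate-free list and that `max_deg` really is an upper bound on the degree.

```python
import random
from collections import defaultdict


def build_index(S):
    """Group S(B, C) tuples by their B value; one pass over S."""
    index = defaultdict(list)
    for b, c in S:
        index[b].append(c)
    return index


def sample_join(R, index, max_deg, rng=random, max_trials=100_000):
    """Return one uniform sample (a, b, c) of R join S, or None.

    Assumes R has no duplicates and every B value has at most
    max_deg matches in the index. Each trial picks a uniform R tuple
    and a uniform slot in [0, max_deg); it succeeds only when that
    slot holds a matching S tuple. Every join result is therefore hit
    with the same probability 1 / (len(R) * max_deg) per trial.
    """
    if not R:
        return None
    for _ in range(max_trials):
        a, b = rng.choice(R)
        slot = rng.randrange(max_deg)
        matches = index.get(b, [])  # .get avoids creating empty entries
        if slot < len(matches):
            return (a, b, matches[slot])
    return None  # empty join, or an unlucky run that exhausted max_trials
```

In this toy setting a trial succeeds with probability OUT / (|R| · max_deg), so the expected number of trials is |R| · max_deg / OUT when the join is non-empty. The product |R| · max_deg plays the role of the degree-aware upper bound on the output size here. The `max_trials` cap is a practical guard added for the sketch so that an empty join terminates.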
Similar Papers
In the same crypt: Databases
Untangling Blockchain: A Data Processing View of Blockchain Systems
Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades
BLOCKBENCH: A Framework for Analyzing Private Blockchains
Data Synthesis based on Generative Adversarial Networks
HoloClean: Holistic Data Repairs with Probabilistic Inference
Died the same way: 👻 Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
In-Datacenter Performance Analysis of a Tensor Processing Unit
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning