Fair Diversity Maximization with Few Representatives
June 09, 2025 · Declared Dead · Knowledge Discovery and Data Mining
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Florian Adriaens, Nikolaj Tatti
arXiv ID
2506.08110
Category
cs.DS: Data Structures & Algorithms
Citations
0
Venue
Knowledge Discovery and Data Mining
Last Checked
4 months ago
Abstract
Diversity maximization is a well-studied problem where the goal is to find $k$ diverse items. Fair diversity maximization aims to select a diverse subset of $k$ items from a large dataset, while requiring that each group of items be well represented in the output. More formally, given a set of items with labels, our goal is to find $k$ items that maximize the minimum pairwise distance in the set, while maintaining that each label is represented within some budget. In many cases, one is only interested in selecting a handful (say, a constant number) of items from each group. In such a scenario, we show that a randomized algorithm based on padded decompositions improves the state-of-the-art approximation ratio to $\sqrt{\log(m)}/(3m)$, where $m$ is the number of labels. The algorithm works in several stages: ($i$) a preprocessing pruning phase, which ensures that points with the same label are far away from each other; ($ii$) a decomposition phase, where points are randomly placed in clusters such that there is a feasible solution with at most one point per cluster and that any feasible solution will be diverse; and ($iii$) an assignment phase, where clusters are assigned to labels and a representative point with the corresponding label is selected from each cluster. We experimentally verify the effectiveness of our algorithm on large datasets.
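To make the objective in the abstract concrete, here is a minimal Python sketch of the problem statement only. It is not the authors' padded-decomposition algorithm: it is an exhaustive search that is usable only on tiny inputs. The function names, the use of Euclidean distance, and the reading of "budget" as a per-label (low, high) count range are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import dist, inf


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points (inf if fewer than two)."""
    return min((dist(p, q) for p, q in combinations(points, 2)), default=inf)


def is_feasible(chosen_labels, k, budgets):
    """True if exactly k items are chosen and every label count is within budget.

    budgets maps each label to an assumed (low, high) range of allowed counts.
    """
    if len(chosen_labels) != k:
        return False
    counts = Counter(chosen_labels)
    if any(label not in budgets for label in counts):
        return False
    return all(low <= counts.get(label, 0) <= high
               for label, (low, high) in budgets.items())


def brute_force_fair_diverse(points, labels, k, budgets):
    """Try every k-subset and keep the feasible one with the largest minimum distance.

    Exponential in k; a reference baseline for tiny instances, not a practical method.
    Returns (indices, value), or (None, -inf) if no feasible subset exists.
    """
    best_value, best_subset = -inf, None
    for subset in combinations(range(len(points)), k):
        if not is_feasible([labels[i] for i in subset], k, budgets):
            continue
        value = min_pairwise_distance([points[i] for i in subset])
        if value > best_value:
            best_value, best_subset = value, subset
    return best_subset, best_value
```

For example, calling `brute_force_fair_diverse(points, labels, 3, {"a": (1, 2), "b": (1, 2)})` returns the indices of three points that contain one or two items of each label and are as spread out as possible under that constraint.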
Similar Papers
In the same crypt: Data Structures & Algorithms
The Cartographer
R.I.P.
Ghosted
Route Planning in Transportation Networks
R.I.P.
Ghosted
Near-linear time approximation algorithms for optimal transport via Sinkhorn iteration
R.I.P.
Ghosted
Hierarchical Clustering: Objective Functions and Algorithms
R.I.P.
Ghosted
Graph Isomorphism in Quasipolynomial Time
The Cartographer
Simulation optimization: A review of algorithms and applications
Died the same way: Ghosted
R.I.P.
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
Ghosted