Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems
October 21, 2015 Β· Declared Dead Β· π Journal of Artificial Intelligence Research
"No code URL or promise found in abstract"
Evidence collected by the PWNC Scanner
Authors
Matteo Venanzi, John Guiver, Pushmeet Kohli, Nick Jennings
arXiv ID
1510.06335
Category
cs.AI: Artificial Intelligence
Cross-listed
cs.LG
Citations
32
Venue
Journal of Artificial Intelligence Research
Last Checked
4 months ago
Abstract
Crowdsourcing systems commonly face the problem of aggregating multiple judgments provided by potentially unreliable workers. In addition, several aspects of the design of efficient crowdsourcing processes, such as defining worker's bonuses, fair prices and time limits of the tasks, involve knowledge of the likely duration of the task at hand. Bringing this together, in this work we introduce a new time--sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, builds on the key insight that the time taken by a worker to perform a task is an important indicator of the likely quality of the produced judgment. To capture this, BCCTime uses latent variables to represent the uncertainty about the workers' completion time, the tasks' duration and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labeling, such as spammers, bots or lazy labelers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labeling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.
Community Contributions
Found the code? Know the venue? Think something is wrong? Let us know!
π Similar Papers
In the same crypt β Artificial Intelligence
π
π
The Cartographer
R.I.P.
π»
Ghosted
Explanation in Artificial Intelligence: Insights from the Social Sciences
R.I.P.
π»
Ghosted
Federated Machine Learning: Concept and Applications
R.I.P.
π»
Ghosted
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR
R.I.P.
π»
Ghosted
DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
R.I.P.
π»
Ghosted
Rainbow: Combining Improvements in Deep Reinforcement Learning
Died the same way β π» Ghosted
R.I.P.
π»
Ghosted
Federated Learning: Strategies for Improving Communication Efficiency
R.I.P.
π»
Ghosted
In-Datacenter Performance Analysis of a Tensor Processing Unit
R.I.P.
π»
Ghosted
Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning
R.I.P.
π»
Ghosted