Skip to Main content Skip to Navigation
Preprints, Working Papers, ...

Dense Limit of the Dawid-Skene Model for Crowdsourcing and Regions of Sub-optimality of Message Passing Algorithms

Abstract : Crowdsourcing is a strategy to categorize data through the contribution of many individuals. A wide range of theoretical and algorithmic contributions are based on the model of Dawid and Skene [1]. Recently it was shown in [2, 3] that, in certain regimes, belief propagation is asymptotically optimal for data generated from the Dawid-Skene model. This paper is motivated by this recent progress. We analyze the dense limit of the Dawid-Skene model. It is shown that it belongs to a larger class of low-rank matrix estimation problems for which it is possible to express the asymptotic, Bayes-optimal, performance in a simple closed form. In the dense limit the mapping to a low-rank matrix estimation problem provides an approximate message passing algorithm that solves the problem algorithmically. We identify the regions where the algorithm efficiently computes the Bayes-optimal estimates. Our analysis refines the results of [2, 3] about optimality of message passing algorithms by characterizing regions of parameters where these algorithms do not match the Bayes-optimal performance. We further study numerically the performance of approximate message passing, derived in the dense limit, on sparse instances and carry out experiments on a real world dataset.
Document type :
Preprints, Working Papers, ...
Complete list of metadata

Cited literature [19 references]  Display  Hide  Download
Contributor : Emmanuelle de Laborderie Connect in order to contact the contributor
Submitted on : Thursday, November 22, 2018 - 11:01:37 AM
Last modification on : Tuesday, January 4, 2022 - 4:41:11 AM
Long-term archiving on: : Saturday, February 23, 2019 - 1:34:44 PM


Files produced by the author(s)


  • HAL Id : cea-01930667, version 1


Christian Schmidt, Lenka Zdeborová. Dense Limit of the Dawid-Skene Model for Crowdsourcing and Regions of Sub-optimality of Message Passing Algorithms. 2018. ⟨cea-01930667⟩



Les métriques sont temporairement indisponibles