Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

Abstract : We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m \rightarrow \infty$ and $\alpha = m/n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $\alpha$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, $r > 4 + 2 \sqrt{\alpha}$, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed.
Type de document :
Pré-publication, Document de travail
t16/176. 8 pages, 3 figures, conference. 2016
Liste complète des métadonnées

Littérature citée [24 références]  Voir  Masquer  Télécharger

https://hal-cea.archives-ouvertes.fr/cea-01448112
Contributeur : Emmanuelle De Laborderie <>
Soumis le : vendredi 27 janvier 2017 - 15:13:49
Dernière modification le : mardi 24 avril 2018 - 17:20:04
Document(s) archivé(s) le : samedi 29 avril 2017 - 03:53:59

Fichier

1610.02918v1.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

Citation

Thibault Lesieur, Caterina De Bacco, Jess Banks, Florent Krzakala, Cris Moore, et al.. Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering. t16/176. 8 pages, 3 figures, conference. 2016. 〈cea-01448112〉

Partager

Métriques

Consultations de la notice

196

Téléchargements de fichiers

53