Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering - Archive ouverte HAL Access content directly
Preprints, Working Papers, ... Year :

## Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

(1) , (2) , (2) , (3) , (2) , (1)
1
2
3
Thibault Lesieur
• Function : Author
Caterina de Bacco
• Function : Author
Jess Banks
• Function : Author
Florent Krzakala
• Function : Author
Cris Moore
• Function : Author
Lenka Zdeborová
• Function : Author

#### Abstract

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m \rightarrow \infty$ and $\alpha = m/n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $\alpha$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, $r > 4 + 2 \sqrt{\alpha}$, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed.

### Dates and versions

cea-01448112 , version 1 (27-01-2017)

### Identifiers

• HAL Id : cea-01448112 , version 1
• ARXIV :

### Cite

Thibault Lesieur, Caterina de Bacco, Jess Banks, Florent Krzakala, Cris Moore, et al.. Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering. 2016. ⟨cea-01448112⟩

### Export

BibTeX TEI Dublin Core DC Terms EndNote Datacite

212 View