Supplementary materials: Iterative Smoothing Proximal Gradient for Regression with Structured Sparsity

Abstract: Learning predictive models from high-dimensional brain imaging data opens up possibilities to decode cognitive states or the diagnosis/prognosis of a clinical condition. Structured sparsity is a promising approach that alleviates the risk of overfitting while forcing the solution to adhere to domain-specific constraints, thus offering new means to identify biomarkers. We consider the problem of optimizing the sum of a smooth convex loss, a non-smooth convex penalty whose proximal operator is known, and a wide range of complex, non-smooth convex structured penalties such as, e.g., total variation or overlapping group lasso. Although many solvers have already been proposed, their practical use in the context of high-dimensional imaging data ($\geq 10^5$ features) remains an open issue. We propose to smooth the structured penalty, since this approach yields a generic framework in which a large range of non-smooth convex structured penalties can be minimized without computing their proximal operators. This is beneficial since those proximal operators are either not known or expensive to compute. The problem can then be minimized with an accelerated proximal gradient method, taking advantage of the potential (non-smoothed) sparsity-inducing penalties. As a first contribution, we propose an expression of the duality gap to control the convergence of the global non-smooth problem. This expression is applicable to a large range of structured penalties. However, smoothing methods have limitations that the proposed solver aims to overcome. Therefore, as a second contribution, we propose a continuation algorithm, called \textit{CONESTA}, that dynamically generates a decreasing sequence of smoothing parameters in order to maintain the optimal convergence speed towards any globally desired precision. At each continuation step, the aforementioned duality gap provides the current error and hence the next, smaller, prescribed precision. Given this precision, we propose an expression to compute the optimal smoothing parameter, i.e., the one that minimizes the number of iterations needed to reach that precision. We demonstrate that CONESTA achieves an improved convergence rate compared to classical (without continuation) proximal gradient smoothing. Moreover, experiments conducted on both simulated and high-dimensional neuroimaging (MRI) data show that CONESTA significantly outperforms the excessive gap method, ADMM, classical proximal gradient smoothing and inexact FISTA in terms of convergence speed and/or precision of the solution.
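To make the smoothing-plus-continuation idea concrete, below is a minimal, self-contained NumPy sketch: a least-squares loss, an l1 penalty handled through its proximal operator, and a 1D total-variation penalty that is Nesterov-smoothed and minimized with FISTA, while an outer loop shrinks the smoothing parameter mu. This is not the authors' CONESTA implementation: the paper derives a duality-gap stopping criterion and an explicit expression for the optimal mu at each continuation step, neither of which is reproduced here; this sketch simply halves mu and runs a fixed number of warm-started inner iterations, and all function names are hypothetical.

import numpy as np

def smoothed_tv_grad(beta, mu):
    """Gradient of the Nesterov-smoothed 1D total variation at beta."""
    d = np.diff(beta)                      # A @ beta, A = first-difference operator
    alpha = np.clip(d / mu, -1.0, 1.0)     # maximizing dual variable of the smoothed TV
    grad = np.zeros_like(beta)
    grad[:-1] -= alpha                     # accumulate A.T @ alpha
    grad[1:] += alpha
    return grad

def prox_l1(beta, t):
    """Soft-thresholding: proximal operator of t * ||beta||_1."""
    return np.sign(beta) * np.maximum(np.abs(beta) - t, 0.0)

def conesta_like(X, y, lam_l1=0.1, lam_tv=0.1, mu0=1.0, n_cont=8, n_inner=200):
    """Continuation over mu, FISTA on the smoothed problem at each step."""
    n, p = X.shape
    beta = np.zeros(p)
    L_X = np.linalg.norm(X, 2) ** 2        # Lipschitz constant of the data-fit gradient
    mu = mu0
    for _ in range(n_cont):                # continuation: decrease the smoothing parameter
        L = L_X + lam_tv * 4.0 / mu        # ||A||_2^2 <= 4 for the 1D difference operator
        z, beta_old, t = beta.copy(), beta.copy(), 1.0
        for _ in range(n_inner):           # FISTA, warm-started from the previous step
            grad = X.T @ (X @ z - y) + lam_tv * smoothed_tv_grad(z, mu)
            beta = prox_l1(z - grad / L, lam_l1 / L)
            t_new = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
            z = beta + ((t - 1.0) / t_new) * (beta - beta_old)
            beta_old, t = beta.copy(), t_new
        mu *= 0.5                          # the paper instead derives mu from the duality gap
    return beta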
Document type: Preprint, working paper
2016

https://hal-cea.archives-ouvertes.fr/cea-01324021
Contributor: Edouard Duchesnay
Submitted on: Monday, November 21, 2016 - 17:48:53
Last modified on: Saturday, November 25, 2017 - 01:20:50
Document(s) archived on: Tuesday, March 21, 2017 - 04:20:17

Files

hal_ols_nestv_supp.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: cea-01324021, version 3

Citation

Fouad Hadj-Selem, Tommy Löfstedt, Elvis Dohmatob, Vincent Frouin, Mathieu Dubois, et al. Supplementary materials: Iterative Smoothing Proximal Gradient for Regression with Structured Sparsity. 2016. 〈cea-01324021v3〉

Metrics

Record views: 451
File downloads: 280