Grenoble Optimization Days 2018

        Optimization algorithms and applications in statistical learning

Grenoble Optimization Days is a series of small international workshops co-organized by the DAO team of LJK. The broad topic of the 2018 session is ``optimization methods and applications in statistical learning'', and its goal is to foster exchanges between scientific communities working on high-dimensional data, and to report and discuss recent advances in optimization algorithms for machine learning and optimal transport. This session also celebrates the Grenoble-Moscow axis and is co-organized with Alexander Gasnikov (MIPT, Moscow). The workshop will gather researchers, PhD students, and master students working on mathematical optimization and statistical learning from Grenoble and Russian partners from MIPT and HSE in Moscow. The presentations target researchers in optimization and learning, as well as PhD students of the Grenoble campus and master students from the MSIAM Master track "Data Science".


Tentative program

Thursday, June 28th, 2018

13:00 - 15:00   Get-together picnic at LJK
15:00 - 15:40   Jerome Malick, CNRS (LJK, Grenoble): Mirror-stratifiable regularizers. Abstract [Slides]
Low-complexity non-smooth convex regularizers are routinely used to impose some structure (such as sparsity or low rank) on solutions to optimization problems arising in machine learning and image processing. In this talk, I present a set of sensitivity analysis and activity identification results for a class of regularizers with a strong geometric structure, called ``mirror-stratifiable''. This notion brings into play a pair of primal-dual models, which in turn allows one to locate the structure of the solution using a specific dual certificate. This pairing is crucial to track the substructures that are identifiable by solutions of parametrized optimization problems or by iterates of first-order proximal algorithms. We also briefly discuss its consequences for model consistency in supervised learning.
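As an illustrative sketch (my example, not from the talk): the prototypical low-complexity regularizer is the L1 norm, whose proximal operator (soft-thresholding) sets small entries exactly to zero, so its output lands on a low-dimensional stratum defined by a sparse support:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrinks each entry toward zero,
    # and zeroes out entries with magnitude below t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

v = np.array([1.5, -0.3, 0.05, -2.0])
x = soft_threshold(v, 0.5)
# Entries with |v_i| <= 0.5 become exactly zero, so x lies on the
# low-complexity stratum {x : supp(x) = {0, 3}}.
```

This "exact zeros" behavior is what makes the structure of the solution identifiable, in contrast with smooth penalties that only shrink entries toward zero.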
15:40 - 16:00   Marina Danilova (Moscow): The non-monotonicity effect of accelerated optimization methods. Abstract [Slides]
16:00 - 16:20   Alexander Titov (Moscow): Universal Proximal Method for Variational Inequalities. Abstract [Slides]
16:20 - 16:50   coffee break
16:50 - 17:30   Alexander Gasnikov, MIPT (Moscow): Accelerated method with the model function concept. Abstract [Slides]
17:30 - 17:50   Dmitry Grishchenko, LJK (Grenoble): Automatic Dimension Reduction Proximal Algorithm. Abstract [Slides]
Optimization problems in machine learning and signal processing applications often involve objective functions made of a smooth data-fidelity term plus a nonsmooth convex regularization term. Proximal algorithms are popular methods for solving such composite optimization problems. Most of the popular regularizers have in fact a strong geometric structure, which implies specific identification properties of proximal algorithms; more precisely, the convergence towards an optimal solution is such that, after a finite number of iterations, all iterates lie in the low-complexity subspace of the optimal solution. In this work, we present a random proximal-gradient algorithm that uses this identification property to automatically reduce the numerical cost of solving special composite problems, for instance when L1 and TV regularizers are used. Our algorithm naturally extends to a distributed optimization set-up, where a master machine combines all the results computed in parallel by slave machines. In this case, our proximal algorithm leverages identification to gain both faster gradient computations and lower communication costs between machines.
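A minimal sketch of the identification phenomenon the abstract relies on (problem data and names are hypothetical, and this is plain proximal gradient, not the paper's randomized/distributed method): on a small LASSO instance, the support of the iterates stops changing after finitely many iterations, at which point gradient computations could be restricted to the identified coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))          # data matrix (hypothetical problem)
x_true = np.zeros(8)
x_true[[1, 5]] = [2.0, -1.5]              # sparse ground truth
b = A @ x_true
lam = 0.5                                 # L1 regularization weight
L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the smooth gradient
x = np.zeros(8)
supports = []
for _ in range(400):
    g = A.T @ (A @ x - b)                 # gradient of the data-fidelity term
    z = x - g / L                         # forward (gradient) step
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # prox of (lam/L)*||.||_1
    supports.append(frozenset(np.flatnonzero(x)))
# After finitely many iterations the support no longer changes:
stable = len(set(supports[-100:])) == 1
```

Once `stable` holds, the matrix-vector products `A @ x` only need the columns of `A` indexed by the final support, which is the kind of dimension reduction (and communication saving, in the distributed setting) the talk describes.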
from 20:00   workshop dinner

Friday, June 29th, 2018

09:00 - 09:40   Roland Hildebrand (Grenoble): Scaling points and reach for non-self-scaled barriers. Abstract [Slides]
The theory of interior-point algorithms is most developed for the class of symmetric cones, which include the orthant, the Lorentz cone, and the semi-definite matrix cone, leading to the classes of linear programs, second-order cone programs, and semi-definite programs, respectively. Its success relies on the property of these cones to possess a self-scaled barrier. In particular, for every primal-dual pair of points in the product of the cone with its dual, there exists a so-called scaling point, which is used by primal-dual interior-point algorithms on symmetric cones to compute a descent direction for the next iteration. In this talk, we give a geometric interpretation of the scaling point in terms of an orthogonal projection onto a Lagrangian submanifold in the primal-dual product space. This approach allows us to extend the notion of scaling point to arbitrary self-concordant barriers and convex cones. We show that there exists a tube of primal-dual pairs of points of positive thickness around the submanifold for which scaling points exist. In geometric terms, this means that the submanifold has positive reach. The thickness of the tube is a monotone function of the barrier parameter.
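A concrete special case (my illustration, not from the talk): for the nonnegative orthant with the self-scaled barrier F(x) = -sum(log x_i), whose Hessian is F''(w) = diag(1/w_i^2), the scaling point of a primal-dual pair (x, s) is given coordinate-wise by w_i = sqrt(x_i / s_i) and satisfies F''(w) x = s:

```python
import numpy as np

x = np.array([4.0, 1.0, 9.0])    # primal point, interior of the orthant
s = np.array([1.0, 4.0, 1.0])    # dual point, also interior
w = np.sqrt(x / s)               # coordinate-wise scaling point
# The defining property of the scaling point: F''(w) x = s,
# i.e. x_i / w_i^2 = s_i for every coordinate.
assert np.allclose(x / w**2, s)
```

For non-symmetric cones no such closed form is available, which is why the existence result in the talk (a tube of positive reach around the Lagrangian submanifold) is what guarantees scaling points for nearby primal-dual pairs.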
09:40 - 10:00   Sergey Guminov (Moscow): Universal Line-Search Method. Abstract [Slides]
10:00 - 10:20   Darina Dvinskikh (Moscow): Distributed computation of Wasserstein barycenters. Abstract [Slides]
10:20 - 10:50   coffee break
10:50 - 11:30   Pavel Dvurechensky, WIAS (Berlin): Faster algorithms for (regularized) optimal transport. Abstract [Slides]
11:30 - 11:50   Alberto Bietti, Inria (Grenoble): Invariance, Stability, and Complexity of Deep Convolutional Representations. Abstract [Slides]
The success of deep convolutional architectures is often attributed in part to their ability to learn multiscale and invariant representations of natural signals. However, a precise study of these properties and how they affect learning guarantees is still missing. In this work, we consider deep convolutional representations of signals; we study their invariance to translations and to more general groups of transformations, their stability to the action of diffeomorphisms, and their ability to preserve signal information. This analysis is carried out by introducing a multilayer kernel based on convolutional kernel networks and by studying the geometry induced by the kernel mapping. We then characterize the corresponding reproducing kernel Hilbert space (RKHS), showing that it contains a large class of convolutional neural networks with homogeneous activation functions. This analysis allows us to separate data representation from learning, and to provide a canonical measure of model complexity, the RKHS norm, which controls both stability and generalization of any learned model. In addition to models in the constructed RKHS, our stability analysis also applies to convolutional networks with generic activations such as rectified linear units, and we discuss its relationship with recent generalization bounds based on spectral norms.
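A toy illustration of the spectral-norm bounds mentioned at the end of the abstract (my sketch, with made-up layer sizes, and linear layers only): the product of per-layer spectral norms bounds how much a network composition can amplify an input perturbation, one simple proxy for the stability/complexity measures discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
# Two hypothetical linear layers standing in for convolutions.
layers = [rng.standard_normal((16, 8)), rng.standard_normal((4, 16))]

lipschitz_bound = 1.0
for W in layers:
    lipschitz_bound *= np.linalg.norm(W, 2)   # largest singular value

def net(v):
    # Composition W2 @ W1 @ v; 1-Lipschitz activations (e.g. ReLU)
    # would not increase the bound below.
    return layers[1] @ (layers[0] @ v)

x = rng.standard_normal(8)
dx = 1e-3 * rng.standard_normal(8)
out_gap = np.linalg.norm(net(x + dx) - net(x))
# Output perturbation is controlled by the product of spectral norms:
assert out_gap <= lipschitz_bound * np.linalg.norm(dx) + 1e-12
```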

Venue: Salle 2 (first floor), IMAG Building on the campus

Sponsors: this workshop is funded by the Grenoble Data Institute, MaiMoSiNE, the Persyval project "persyvact 2", the PGMO project "optimization methods for stochastic programming", the PEPS CNRS "Intelligence artificielle et apprentissage automatique" Optimyst, and a gift from the CNRS GdR MOA.

Organizers: Anatoli Iouditski, Jerome Malick, Dmitry Grishchenko (LJK, Grenoble) and Alexander Gasnikov (MIPT, Moscow), with the help of Suzan Tayar and Juana Dos Santos (LJK)