13:00 – 15:00  Get-together at LJK picnic  
15:00 – 15:40  Jerome Malick, CNRS (LJK, Grenoble) – Mirror-stratifiable regularizers Abstract [Slides] 
Low-complexity nonsmooth convex regularizers are routinely used to impose structure (such as sparsity or low rank)
on solutions to optimization problems arising in machine learning and image processing.
In this talk, I present a set of sensitivity analysis and activity identification results
for a class of regularizers with a strong geometric structure, called ``mirror-stratifiable''.
This notion brings into play a pair of primal-dual models, which in turn allows one to locate the structure of
the solution using a specific dual certificate. This pairing is crucial to track the substructures
that are identifiable by solutions of parametrized optimization problems or by iterates of first-order proximal algorithms.
We also briefly discuss its consequences in model consistency in supervised learning.
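The l1 norm is the simplest example of a mirror-stratifiable regularizer, and the identification phenomenon discussed above can be seen in a few lines. The following is our own illustrative sketch (not the speaker's code): plain proximal gradient (ISTA) on a toy lasso problem, where the iterates' support — the active stratum for l1 — freezes after finitely many iterations.

```python
import numpy as np

# Toy sketch (ours): proximal gradient (ISTA) on
#   0.5*||Ax - b||^2 + lam*||x||_1.
# For the l1 regularizer, the "stratum" of a point is its support;
# identification means the support freezes after finitely many steps.

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(0)
n, d = 40, 10
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true
lam = 0.1
step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L with L = ||A||_2^2

x = np.zeros(d)
supports = []
for _ in range(500):
    x = soft_threshold(x - step * A.T @ (A @ x - b), step * lam)
    supports.append(tuple(np.flatnonzero(x)))

# the support is eventually constant: the low-complexity stratum
# of the solution has been identified by the iterates
print(supports[-1])
```

Tracking the dual certificate mentioned in the abstract would additionally predict *which* stratum is found; the sketch only observes that identification happens.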
 
15:40 – 16:00  Marina Danilova (Moscow) – The non-monotonicity effect of accelerated optimization methods Abstract [Slides] 
TBA
 
16:00 – 16:20  Alexander Titov (Moscow) – Universal Proximal Method for Variational Inequalities Abstract [Slides] 
TBA
 
16:20 – 16:50  coffee break  
16:50 – 17:30  Alexander Gasnikov, MIPT (Moscow) – Accelerated method with the model function concept Abstract [Slides] 
TBA
 
17:30 – 17:50  Dmitry Grishchenko, LJK (Grenoble) – Automatic Dimension Reduction Proximal Algorithm Abstract [Slides] 
Optimization problems in machine learning and signal processing applications often involve objective functions made of a smooth data-fidelity term plus a nonsmooth convex regularization term.
Proximal algorithms are popular methods for solving such composite optimization problems.
Most of the popular regularizers in fact have a strong geometric structure, which implies specific identification properties of proximal algorithms;
more precisely, the convergence towards an optimal solution is such that, after a finite number of iterations,
all iterates lie in the low-complexity subspace associated with the optimal solution.
In this work, we present a random proximal-gradient algorithm that uses this identification property to automatically reduce the numerical cost of solving such composite problems,
for instance when L1 and TV regularizers are used.
Our algorithm naturally extends to a distributed optimization setup, where a master machine combines results computed in parallel by slave machines.
In this case, our proximal algorithm leverages identification to gain both faster gradient computations and a lower communication cost between machines.
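As a toy illustration of the dimension-reduction idea (our own sketch under simplifying assumptions, not the authors' algorithm): run plain proximal gradient until the l1 support settles, then iterate only on the identified columns, shrinking the per-iteration cost from O(nd) to O(n|S|). This restriction is only safe once identification has actually occurred.

```python
import numpy as np

# Toy sketch (ours, not the speaker's algorithm): once the l1 support
# S has been identified, the gradient only needs the columns A[:, S],
# which is where the numerical savings come from.

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(1)
n, d = 60, 200
A = rng.standard_normal((n, d))
x_true = np.zeros(d)
x_true[:4] = [2.0, -1.0, 1.5, -0.5]
b = A @ x_true
lam = 2.0
step = 1.0 / np.linalg.norm(A, 2) ** 2   # also valid for A[:, S]

x = np.zeros(d)
for _ in range(500):                     # plain ISTA until the support settles
    x = soft_threshold(x - step * A.T @ (A @ x - b), step * lam)
S = np.flatnonzero(x)                    # identified low-complexity subspace

x_S, A_S = x[S], A[:, S]                 # reduced problem: |S| << d columns
for _ in range(500):
    x_S = soft_threshold(x_S - step * A_S.T @ (A_S @ x_S - b), step * lam)
x[S] = x_S
print(len(S), "of", d, "coordinates active")
```

The paper's algorithm instead exploits identification adaptively and with randomness, without a fixed switch point; the sketch only shows why a frozen support makes the iterations cheaper.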
 
from 20:00 – workshop dinner 
09:00 – 09:40  Roland Hildebrand (Grenoble) – Scaling points and reach for non-self-scaled barriers Abstract [Slides] 
The theory of interior-point algorithms is most developed for the class of symmetric cones, which includes the orthant, the Lorentz cone, and the semidefinite matrix cone, leading to the classes of linear programs, second-order cone programs, and semidefinite programs, respectively. Its success relies on the fact that these cones possess a self-scaled barrier. In particular, for every primal-dual pair of points in the product of the cone with its dual, there exists a so-called scaling point, which is used by primal-dual interior-point algorithms on symmetric cones to compute a descent direction for the next iteration.
In this talk, we give a geometric interpretation of the scaling point in terms of an orthogonal projection onto a Lagrangian submanifold in the primal-dual product space. This approach allows us to extend the notion of scaling point to arbitrary self-concordant barriers and convex cones. We show that there exists a tube of primal-dual pairs of points of positive thickness around the submanifold for which scaling points exist. In geometric terms, this means that the submanifold has positive reach. The thickness of the tube is a monotone function of the barrier parameter.
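As background for the symmetric-cone case the abstract starts from (standard material, not part of the talk's new results): for a self-scaled barrier $F$ on a symmetric cone $K$, the scaling point $w$ of a primal-dual pair $(x, s) \in \operatorname{int} K \times \operatorname{int} K^*$ is the unique point satisfying
\[
\nabla^2 F(w)\, x = s .
\]
For the semidefinite cone with $F(X) = -\log\det X$, one has $\nabla^2 F(W)[X] = W^{-1} X W^{-1}$, so the condition reads $W^{-1} X W^{-1} = S$, recovering the Nesterov-Todd scaling used in semidefinite programming solvers. The talk's contribution is a geometric characterization of $w$ that still makes sense when no self-scaled barrier is available.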
 
09:40 – 10:00  Sergey Guminov (Moscow) – Universal Line-Search Method Abstract [Slides] 
TBA
 
10:00 – 10:20  Darina Dvinskikh (Moscow) – Distributed computation of Wasserstein barycenter Abstract [Slides] 
TBA
 
10:20 – 10:50  coffee break  
10:50 – 11:30  Pavel Dvurechensky, WIAS (Berlin) – Faster algorithms for (regularized) optimal transport Abstract [Slides] 
TBA
 
11:30 – 11:50  Alberto Bietti, Inria (Grenoble) – Invariance, Stability, and Complexity of Deep Convolutional Representations Abstract [Slides] 
The success of deep convolutional architectures is often attributed in part to their ability to learn multi-scale and invariant representations of natural signals. However, a precise study of these properties and how they affect learning guarantees is still missing. In this work, we consider deep convolutional representations of signals; we study their invariance to translations and to more general groups of transformations, their stability to the action of diffeomorphisms, and their ability to preserve signal information. This analysis is carried out by introducing a multilayer kernel based on convolutional kernel networks and by studying the geometry induced by the kernel mapping. We then characterize the corresponding reproducing kernel Hilbert space (RKHS), showing that it contains a large class of convolutional neural networks with homogeneous activation functions. This analysis allows us to separate data representation from learning, and to provide a canonical measure of model complexity, the RKHS norm, which controls both stability and generalization of any learned model. In addition to models in the constructed RKHS, our stability analysis also applies to convolutional networks with generic activations such as rectified linear units, and we discuss its relationship with recent generalization bounds based on spectral norms.
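One concrete ingredient behind the multilayer kernel is the patch-level homogeneous dot-product kernel of convolutional kernel networks. The sketch below is our own illustration (the particular choice of the scalar function kappa is an assumption; the paper's full multilayer construction with pooling is not reproduced here): homogeneity in the norms is what connects the kernel to networks with homogeneous activations.

```python
import numpy as np

# Illustrative sketch (ours): a homogeneous dot-product kernel on
# image patches, of the form used in convolutional kernel networks,
#   k(z, z') = ||z|| ||z'|| * kappa(<z, z'> / (||z|| ||z'||)),
# here with the assumed choice kappa(u) = exp(beta * (u - 1)).

def ckn_patch_kernel(z, zp, beta=1.0):
    nz, nzp = np.linalg.norm(z), np.linalg.norm(zp)
    if nz == 0.0 or nzp == 0.0:
        return 0.0
    u = np.dot(z, zp) / (nz * nzp)   # cosine similarity, in [-1, 1]
    return nz * nzp * np.exp(beta * (u - 1.0))

z = np.array([1.0, 2.0, 0.0])
print(ckn_patch_kernel(z, z))        # k(z, z) equals ||z||^2
```

Note that k(a*z, z') = a * k(z, z') for a > 0: this positive homogeneity mirrors that of ReLU-type activations, which is one reason the resulting RKHS can contain such networks.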
