Statistical Physics of Sparse and Dense Models in Optimization and Inference

Christian Schmidt

IPhT

Thu, Oct. 18th 2018, 14:00

Amphi Claude Bloch, Bât. 774, Orme des Merisiers

Datasets come in a variety of forms and from a broad range of different applications. Typically, the observed data is noisy or in some other way subject to randomness. The recent developments in machine learning have revived the need for exact theoretical limits of probabilistic methods that recover information from noisy data. \par In this thesis we are concerned with the following two questions. \\ What is the asymptotically best achievable performance? \\ And how can this performance be achieved, i.e., what is the optimal algorithmic strategy? \par The above questions can be studied in a probabilistic framework, which leads to an average (i.e. typical) case answer. Such a probabilistic formulation is natural to statistical physics and leads to a formal analogy with problems in disordered systems. In turn, this permits to harvest the methods developed in the study of disordered systems, to attack constraint satisfaction and statistical inference problems. \par I will present four contributions. First, a statistical physics investigation of the circular coloring problem is carried out that reveals several distinct features. Second, new rigorous upper bounds on the size of minimal contagious sets in random graphs, with bounded maximum degree, are obtained. Third, the phase diagram of the dense Dawid-Skene model is derived by mapping the problem onto low-rank matrix factorization. The associated approximate message passing algorithm is evaluated on synthetic and real-world data. Finally, we propose an approach to derive the Bayes optimal denoising mean square error for a restricted class of extensive rank matrix estimation problems.

Contact : Loic
BERVAS