# Bayesian multimodeling (lectures, O.Yu. Bakhteev, V.V. Strijov) / Fall 2021

## Bayesian model selection and multimodeling

The lecture course addresses the main problem of machine learning: model selection. One can set a heuristic model and optimise its parameters, select a model from a class, make a teacher model transfer its knowledge to a student model, or even build an ensemble of models. Behind all these strategies lies one fundamental technique: Bayesian inference. It makes hypotheses about the measured data set, about the model parameters, and even about the model structure, and it deduces the error function to optimise. This is known as the Minimum Description Length principle: it selects simple, stable, and precise models. This course joins the theory with practical lab works on model selection and multimodeling.
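The inference behind these strategies can be sketched in two standard formulas (notation assumed here: data set $\mathfrak{D}$, parameters $\mathbf{w}$, model $f$): the posterior over parameters and the model evidence, whose negative logarithm plays the role of a description length:

$$p(\mathbf{w} \mid \mathfrak{D}, f) = \frac{p(\mathfrak{D} \mid \mathbf{w}, f)\,p(\mathbf{w} \mid f)}{p(\mathfrak{D} \mid f)}, \qquad p(\mathfrak{D} \mid f) = \int p(\mathfrak{D} \mid \mathbf{w}, f)\,p(\mathbf{w} \mid f)\,\mathrm{d}\mathbf{w}.$$

Maximising the evidence $p(\mathfrak{D} \mid f)$ over models is equivalent to minimising the description length $-\log p(\mathfrak{D} \mid f)$, which penalises both poor fit and excessive model complexity.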

Grading: active participation 2 points, two lab works 3 + 3 points, questions during lectures 2 points, final exam 1 point.

## Syllabus

1. 8.09 Intro
2. 15.09 Distributions, expectation, likelihood
3. 22.09 Bayesian inference
4. 29.09 MDL, Minimum description length principle
5. 6.10 Probabilistic metric spaces
6. 13.10 Generative and discriminative models
7. 20.10 Data generation, VAE, GAN
8. 27.10 Probabilistic graphical models
9. 3.11 Variational inference
10. 10.11 Variational inference 2
11. 17.11 Hyperparameter optimization
12. 24.11 Meta-optimization
13. 1.12 Bayesian PCA, GLM and NN
14. 8.12 Gaussian processes

## Lab works

The parameter space $\mathbb{R}^2 \ni \mathbf{w} = [w_1, w_2]^{\mathsf{T}}$ is shown by the $x,y$-axes. A function of the parameters, for example $p(\mathbf{w})$ or $\mathcal{L}(\mathbf{w})$, is shown by the $z$-axis. The variance of some functions is shown as an opaque surface along the $z$-axis.
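As a minimal sketch of this plotting convention (the Gaussian prior $p(\mathbf{w}) = \mathcal{N}(\mathbf{w} \mid \mathbf{0}, \mathbf{I})$ here is an assumed example, not a lab requirement), one can evaluate a function of the parameters on a 2D grid and draw it as a surface:

```python
import numpy as np

# Hypothetical example: a standard normal prior p(w) = N(w | 0, I)
# on a grid over the parameter plane, w = [w1, w2] on the x,y-axes,
# p(w) on the z-axis.
w1, w2 = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
log_p = -0.5 * (w1**2 + w2**2) - np.log(2 * np.pi)
p = np.exp(log_p)  # z-values of the surface

# To draw the surface (requires matplotlib):
# import matplotlib.pyplot as plt
# ax = plt.figure().add_subplot(projection="3d")
# ax.plot_surface(w1, w2, p, alpha=0.5)
# plt.show()
```

The same grid-then-surface pattern applies to $\mathcal{L}(\mathbf{w})$ or to an empirical variance estimate over the grid.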

### Lab work 0

Plot the stochastic gradient descent vectors and the averaged result. The link to the code is here.
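A minimal sketch of the setup (an assumed toy quadratic loss, not the official lab code): run SGD with noisy gradients, keep the per-step gradient vectors and the trajectory, and form the averaged result:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy loss L(w) = E ||x - w||^2 with samples x drawn around w_true;
# its stochastic gradient at w for one sample x is 2 (w - x).
w_true = np.array([1.0, -2.0])
w = np.zeros(2)
lr = 0.1
steps, grads = [w.copy()], []
for t in range(500):
    x = w_true + rng.normal(scale=1.0, size=2)  # noisy sample around the optimum
    g = 2 * (w - x)                             # stochastic gradient
    w = w - lr * g
    grads.append(g)
    steps.append(w.copy())
w_avg = np.mean(steps, axis=0)  # iterate averaging smooths out gradient noise
```

The vectors in `grads` and the trajectory in `steps` can be drawn with matplotlib's `quiver` and `plot`; `w_avg` marks the averaged result, which lies closer to the optimum than the last noisy iterate typically does.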

### Lab work 19

Investigate the data space, plot the data distribution, the source and the target variable.
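A possible starting point (the generation scheme below is an assumed example, not the lab's actual data): generate paired source and target variables and summarise their empirical distribution before plotting:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data space: source variable x, target variable y
# generated as a noisy linear response.
x = rng.normal(size=500)                       # source variable
y = 2.0 * x + rng.normal(scale=0.5, size=500)  # target variable

# Summary of the empirical data distribution:
stats = {
    "x_mean": x.mean(), "x_std": x.std(),
    "y_mean": y.mean(), "y_std": y.std(),
    "corr": np.corrcoef(x, y)[0, 1],
}
# A scatter of (x, y) plus histograms of x and y (matplotlib) visualises
# the data distribution and the source-target dependence.
```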

### Lab work 20

Plot the empirical distribution of the model parameters for various data generation hypotheses and various regions of the data space.
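One way to sketch this (a hypothetical linear generation hypothesis and a bootstrap over the data, assumed for illustration): refit the model on resampled subsets and collect the fitted parameters as an empirical distribution:

```python
import numpy as np

rng = np.random.default_rng(1)

# One data generation hypothesis: y = X w_true + Gaussian noise.
w_true = np.array([0.5, -1.0])
X = rng.normal(size=(200, 2))
y = X @ w_true + rng.normal(scale=0.5, size=200)

samples = []
for _ in range(1000):
    idx = rng.integers(0, 200, size=200)  # bootstrap resample of the data
    w_hat, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    samples.append(w_hat)
samples = np.array(samples)  # empirical distribution of w = [w1, w2]
```

A scatter or 2D histogram of `samples` in the $w_1, w_2$ plane shows the parameter distribution; repeating the procedure under another generation hypothesis, or restricted to a region of the data space, shifts and reshapes the cloud.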

## References

### Books

1. Bishop C.M. Pattern Recognition and Machine Learning
2. Barber D. Bayesian Reasoning and Machine Learning
3. Murphy K.P. Machine Learning: A Probabilistic Perspective
4. Rasmussen C.E., Williams C.K.I. Gaussian Processes for Machine Learning, of course!
5. Taboga M. Lectures on Probability Theory and Mathematical Statistics (to catch up)

### Theses

1. Grabovoy A.V. PhD thesis.
2. Bakhteev O.Yu. Deep learning model selection of suboptimal complexity: git, extended abstract, slides (PDF), video. 2020. MIPT.
3. Aduenko A.A. Multimodel selection in classification problems: slides (PDF), video. 2017. MIPT.
4. Kuzmin A.A. Construction of hierarchical topic models for collections of short texts: slides (PDF), video. 2017. MIPT.

### Papers

1. Kuznetsov M.P., Tokmakova A.A., Strijov V.V. Analytic and stochastic methods of structure parameter estimation // Informatica, 2016, 27(3) : 607-624, PDF.
2. Bakhteev O.Y., Strijov V.V. Deep learning model selection of suboptimal complexity // Automation and Remote Control, 2018, 79(8) : 1474–1488, PDF.
3. Bakhteev O.Y., Strijov V.V. Comprehensive analysis of gradient-based hyperparameter optimization algorithms // Annals of Operations Research, 2020 : 1-15, PDF.