# Bayesian multimodeling (lectures, O.Yu. Bakhteev, V.V. Strizhov) / Fall 2021


## Bayesian model selection and multimodeling

The lecture course addresses the main problem of machine learning: the problem of model selection. One can set a heuristic model and optimise its parameters, select a model from a class, make a teacher model transfer its knowledge to a student model, or even build an ensemble of models. Behind all these strategies there is a fundamental technique: Bayesian inference. It assumes hypotheses about the measured data set, about the model parameters, and even about the model structure, and it deduces the error function to optimise. This is called the Minimum Description Length principle. It selects simple, stable, and precise models. This course joins the theory and the practical lab works of model selection and multimodeling.

Grading: active participation 2 points, two lab works 3 + 3 points, questions during lectures 2 points, final exam 1 point.

## Syllabus

1. 8.09 Intro
2. 15.09 Distributions, expectation, likelihood
3. 22.09 Bayesian inference
4. 29.09 MDL, Minimum description length principle
5. 6.10 Probabilistic metric spaces
6. 13.10 Generative and discriminative models
7. 20.10 Data generation, VAE, GAN
8. 27.10 Probabilistic graphical models
9. 3.11 Variational inference
10. 10.11 Variational inference 2
11. 17.11 Hyperparameter optimization
12. 24.11 Meta-optimization
13. 1.12 Bayesian PCA, GLM and NN
14. 8.12 Gaussian processes

## Lab works

The parameter space $\mathbb{R}^2 \ni \mathbf{w} = [w_1, w_2]^{\mathsf{T}}$ is shown by the $x,y$-axes. A function of the parameters, for example $p(\mathbf{w})$ or $\mathcal{L}(\mathbf{w})$, is shown by the $z$-axis. The variance of some functions is shown by an opaque surface over the $z$-axis.
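Such a plot can be sketched as follows; the standard normal prior $p(\mathbf{w})$ used as the $z$-axis function is an illustrative assumption, and the plotting part is guarded so the sketch also runs where matplotlib is unavailable.

```python
import numpy as np

# Grid over the parameter space R^2, w = [w1, w2]^T, shown by the x,y-axes
w1, w2 = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))

# Illustrative z-axis function: standard normal density p(w)
p_w = np.exp(-0.5 * (w1**2 + w2**2)) / (2 * np.pi)

try:
    import matplotlib
    matplotlib.use("Agg")  # headless backend
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    # Semi-transparent surface over the z-axis
    ax.plot_surface(w1, w2, p_w, alpha=0.5)
    ax.set_xlabel("$w_1$")
    ax.set_ylabel("$w_2$")
    ax.set_zlabel("$p(\\mathbf{w})$")
    fig.savefig("prior_surface.png")
except ImportError:
    pass
```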

### Lab work 0

Plot the stochastic gradient descent vectors and their average. The link to the code is here.
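A minimal sketch of this lab: the quadratic toy loss, learning rate, and noise level below are illustrative assumptions, not part of the assignment. The averaged iterate (Polyak averaging) smooths the noisy descent vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy loss L(w) = 0.5 * ||w - w_true||^2 with a noisy (minibatch-like) gradient
w_true = np.array([1.0, -2.0])

def stochastic_grad(w):
    # Exact gradient plus zero-mean noise, mimicking a stochastic estimate
    return (w - w_true) + rng.normal(scale=0.5, size=2)

w = np.zeros(2)
lr = 0.1
trajectory = [w.copy()]
for _ in range(200):
    w = w - lr * stochastic_grad(w)
    trajectory.append(w.copy())
trajectory = np.array(trajectory)

# Average of the late iterates approximates the optimum much better
w_avg = trajectory[100:].mean(axis=0)

try:
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    # Each SGD step as a vector in the parameter plane
    steps = np.diff(trajectory, axis=0)
    plt.quiver(trajectory[:-1, 0], trajectory[:-1, 1], steps[:, 0], steps[:, 1],
               angles="xy", scale_units="xy", scale=1, width=0.002)
    plt.scatter(*w_avg, color="red", label="averaged iterate")
    plt.legend()
    plt.savefig("sgd_vectors.png")
except ImportError:
    pass
```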

### Lab work 19

Investigate the data space: plot the data distribution, the source variable, and the target variable.
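A sketch of such an investigation; the sine data-generation process, sample size, and noise level are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical generation: source variable x, target y = sin(x) + noise
x = rng.uniform(-3, 3, size=500)
y = np.sin(x) + rng.normal(scale=0.2, size=500)

# Empirical distribution of the source variable
x_hist, x_edges = np.histogram(x, bins=20, density=True)

try:
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(8, 3))
    ax0.hist(x, bins=20, density=True)          # data distribution
    ax0.set_xlabel("source $x$")
    ax1.scatter(x, y, s=5)                      # source vs. target
    ax1.set_xlabel("source $x$")
    ax1.set_ylabel("target $y$")
    fig.savefig("data_space.png")
except ImportError:
    pass
```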

### Lab work 20

Plot the empirical distribution of the model parameters for various data generation hypotheses and various regions of the data space.
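One way to sketch this lab: repeatedly generate data under a hypothesis, fit a model, and collect the parameter estimates. The linear model, the two noise-level hypotheses, and all sizes below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_params(noise_scale):
    # One draw: generate data under a hypothesis, fit w by least squares
    x = rng.uniform(-1, 1, size=50)
    X = np.stack([x, np.ones_like(x)], axis=1)   # design matrix [x, 1]
    y = 2.0 * x + 1.0 + rng.normal(scale=noise_scale, size=50)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Empirical parameter distributions under two data-generation hypotheses
samples_low = np.array([fit_params(0.1) for _ in range(300)])
samples_high = np.array([fit_params(1.0) for _ in range(300)])

try:
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt

    # Scatter of fitted (w1, w2): the spread widens with the noise hypothesis
    plt.scatter(samples_low[:, 0], samples_low[:, 1], s=5, label="low noise")
    plt.scatter(samples_high[:, 0], samples_high[:, 1], s=5, label="high noise")
    plt.xlabel("$w_1$ (slope)")
    plt.ylabel("$w_2$ (intercept)")
    plt.legend()
    plt.savefig("param_distributions.png")
except ImportError:
    pass
```

The noisier hypothesis yields a visibly wider empirical distribution around the same mean, which is exactly the effect the lab asks to visualise.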