Forrest Koch, UNSW
Dr Matias Quiroz, University of Technology Sydney
The course aims to give a solid introduction to the Bayesian approach to statistical inference. The course contains a mix of theoretical and methodological concepts, with an emphasis on computer implementation of modern Bayesian methods.
The course consists of four modules (see details below). The first module introduces the Bayesian paradigm and develops inferential tools for some simple models. The second module considers more advanced models, such as linear regression, spline regression and classification models. The third module focuses on computation and presents state-of-the-art algorithms to carry out Bayesian inference. Finally, the fourth module presents model comparison techniques and advanced topics such as Bayesian variable selection and hierarchical models.
Module 1: The basics.
The Bayesian paradigm. Single-parameter models. Conjugate priors. Prior elicitation. Noninformative priors. Jeffreys’ prior. Multi-parameter models. Bayesian computation via simulation. Analytic marginalisation. Marginalisation via simulation.
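As a small taste of the conjugacy and simulation topics above, here is a sketch of the Beta-Binomial model in Python (the labs recommend R, but any language is allowed; the numbers below are made up), where the posterior is available in closed form and can also be explored by simulation:

```python
import numpy as np
from scipy import stats

# Beta-Binomial conjugacy: with a Beta(a, b) prior on theta and
# y successes in n Bernoulli trials, the posterior is
# Beta(a + y, b + n - y).  (Hyperparameters and data are illustrative.)
a, b = 2.0, 2.0      # prior hyperparameters
n, y = 20, 14        # observed data: 14 successes in 20 trials

post = stats.beta(a + y, b + n - y)
print(post.mean())   # analytic posterior mean: (a + y) / (a + b + n) = 16/24

# "Bayesian computation via simulation": estimate the same quantity
# from posterior draws instead of the closed form.
rng = np.random.default_rng(0)
draws = post.rvs(size=100_000, random_state=rng)
print(draws.mean())  # close to the analytic value 2/3
```

The simulated posterior mean agrees with the analytic one, which is the point of the "marginalisation via simulation" topic: quantities that are awkward to obtain analytically can be estimated from posterior draws.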
Module 2: Regression models.
Bayesian prediction. The Bayesian approach to making decisions. Choosing a Bayesian point estimator as a decision theory problem. Bayesian linear regression. Shrinkage priors. Bayesian spline regression. Asymptotics. Normal approximation. Bayesian classification. Generative models (naïve Bayes). Discriminative models (logistic regression).
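To make the linear regression and shrinkage topics above concrete, here is a minimal Python sketch (illustrative numbers only) of conjugate Bayesian linear regression with known noise variance, where a N(0, tau^2 I) prior on the coefficients yields a closed-form Gaussian posterior and acts as a ridge-type shrinkage prior:

```python
import numpy as np

# Conjugate Bayesian linear regression with known noise variance:
# prior beta ~ N(0, tau^2 I), likelihood y ~ N(X beta, sigma^2 I).
# The posterior for beta is Gaussian with the closed forms below.
rng = np.random.default_rng(1)
n, p = 100, 3
X = rng.standard_normal((n, p))
beta_true = np.array([1.5, 0.0, -2.0])   # made-up coefficients
sigma, tau = 1.0, 10.0
y = X @ beta_true + sigma * rng.standard_normal(n)

prec = X.T @ X / sigma**2 + np.eye(p) / tau**2   # posterior precision
cov = np.linalg.inv(prec)                        # posterior covariance
mean = cov @ (X.T @ y) / sigma**2                # posterior mean

print(mean)  # shrunk towards 0 relative to the least-squares estimate
```

Making tau smaller tightens the prior around zero and increases the shrinkage, which is the mechanism behind the shrinkage priors discussed in this module.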
Module 3: Bayesian computation.
More on Bayesian computation. Monte Carlo integration. Importance sampling. Inverse CDF method. Rejection sampling. Markov processes. The Gibbs sampler. Data augmentation. The Metropolis and Metropolis-Hastings samplers. Hamiltonian Monte Carlo proposals. Efficiency of simulation. Assessing convergence.
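As a concrete example of one algorithm from this module, here is a minimal random-walk Metropolis sampler in Python targeting a standard normal distribution (the target and the proposal scale are arbitrary choices for the sketch):

```python
import numpy as np

# Random-walk Metropolis targeting N(0, 1).
rng = np.random.default_rng(2)

def log_target(x):
    return -0.5 * x**2   # log density of N(0, 1), up to a constant

x, draws = 0.0, []
for _ in range(50_000):
    prop = x + rng.normal(scale=1.0)   # symmetric random-walk proposal
    # Accept with probability min(1, target(prop) / target(x)).
    if np.log(rng.uniform()) < log_target(prop) - log_target(x):
        x = prop
    draws.append(x)

draws = np.array(draws)
print(draws.mean(), draws.std())  # close to 0 and 1
```

Because the proposal is symmetric, the Metropolis acceptance probability reduces to a ratio of target densities, computed here on the log scale for numerical stability.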
Module 4: Model inference and hierarchical models.
Bayesian model comparison. Marginal likelihoods. Bayesian model averaging. Bayesian variable selection. Posterior predictive model evaluation. Hierarchical models. Pooling estimates. MCMC sampling with RStan.
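To make the marginal-likelihood topic concrete, here is a Python sketch comparing two Beta-Binomial models via their marginal likelihoods (the priors and data below are made up for illustration):

```python
import numpy as np
from scipy.special import betaln, comb

# Marginal likelihood of the Beta-Binomial model:
# p(y) = C(n, y) * B(a + y, b + n - y) / B(a, b),
# computed on the log scale for numerical stability.
def log_marglik(y, n, a, b):
    return np.log(comb(n, y)) + betaln(a + y, b + n - y) - betaln(a, b)

n, y = 20, 14
m1 = log_marglik(y, n, 1.0, 1.0)    # model 1: flat Beta(1, 1) prior
m2 = log_marglik(y, n, 30.0, 30.0)  # model 2: prior tight around 0.5

print(np.exp(m1 - m2))  # Bayes factor of model 1 over model 2
```

The ratio of marginal likelihoods is the Bayes factor, the basic quantity behind the Bayesian model comparison and model averaging techniques in this module.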
(may be subject to change)
A textbook that contains most of the material we will cover is
Bayesian Data Analysis (3rd edition) by Andrew Gelman, John Carlin, Hal Stern, David Dunson, Aki Vehtari, and Donald Rubin.
The book is freely available via the first author’s webpage:
Mattias Villani has a book in progress that covers some of the material in this subject: https://github.com/mattiasvillani/BayesianLearningBook. I recommend saving this link, as I am confident it will be one of the best gentle introductions to Bayesian statistics once it is completed.
A good and compact text on some of the mathematical tools used in this subject can be found here: https://gwthomas.github.io/docs/math4ml.pdf. Note, however, that we will only use a subset of this material, so students are not expected to know all of it.
Each module has a computer lab in which students implement Bayesian procedures. The recommended programming language for the computer labs is R, because the presented material will use it and the provided datasets will be in R format. However, students may use any software they prefer (e.g. Python or Julia).
It is recommended that students install R and an editor for writing R code. RStudio is an excellent choice and is free for academic use; it can be obtained from https://www.rstudio.com/. Moreover, one of the computer labs uses Stan, a platform for efficient statistical computation using Markov chain Monte Carlo. Stan is available in R via the RStan package, which is free. Installation requires a few steps but is well documented; see e.g. https://mc-stan.org/users/interfaces/rstan. It is highly recommended that students install R, RStudio and RStan before the course starts.
Take this quiz to review some of the expected foundational skills for this subject.