# Akaike Information Criterion

The information-theoretic motivation for these criteria is explained well in Kullback's book *Information Theory and Statistics*.

Suppose the true model is $y_i = 1 + 0.1\,x_i - 0.2\,x_i^2 + \ldots$

In practice we fit many candidate models (hundreds, even thousands) to the same data.

Then choose the model with the best (smallest) AIC/BIC/DIC/WAIC.

$\textrm{AIC} = D_{\textrm{train}} + 2p$, where $D_{\textrm{train}}$ is the deviance on the training data and $p$ is the number of free parameters.

AIC is an approximation that is reliable only when: (1) the priors are flat or overwhelmed by the likelihood; (2) the posterior distribution is approximately multivariate Gaussian; (3) the sample size $N$ is much greater than the number of parameters $p$.
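As a concrete sketch of the recipe above: simulate data from the quadratic model, fit polynomials of several degrees by least squares (a Gaussian likelihood, so the flat-prior assumption holds trivially), and compare AIC values. The noise level and sample size here are illustrative assumptions, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data from the quadratic "true" model; noise sd 0.5 is an assumption
n = 100
x = rng.uniform(-2, 2, n)
y = 1 + 0.1 * x - 0.2 * x**2 + rng.normal(0, 0.5, n)

def aic_for_poly(degree):
    """Fit a polynomial of the given degree by least squares, return its AIC."""
    X = np.vander(x, degree + 1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = np.mean(resid**2)                 # MLE of the noise variance
    # Gaussian log-likelihood evaluated at the MLE
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    p = degree + 2                             # coefficients plus noise variance
    deviance = -2 * loglik                     # D_train
    return deviance + 2 * p                    # AIC = D_train + 2p

aics = {d: aic_for_poly(d) for d in range(5)}
best = min(aics, key=aics.get)
```

With enough data, AIC should favour the quadratic (degree 2) over under- and over-fitting alternatives, since the penalty term $2p$ offsets the deviance gains of extra parameters.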

## Watanabe-Akaike Information Criterion

Like AIC, WAIC lets you rank models. But a more interpretable measure is the Akaike weight. The weight for model $i$ in a set of $m$ models is given by

$w_i = \frac{ \exp\left(-\frac12 \textrm{dWAIC}_i\right) }{ \sum^m_{j=1} \exp\left(-\frac12 \textrm{dWAIC}_j\right) }$

where dWAIC is the difference between each model's WAIC and the lowest WAIC, i.e. $\textrm{dWAIC}_i = \textrm{WAIC}_i - \textrm{WAIC}_{\min}$.
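The weight formula is a few lines of numpy. The WAIC values below are made-up numbers for illustration; note that the best model (lowest WAIC) always receives the largest weight, and the weights sum to 1.

```python
import numpy as np

def akaike_weights(waic):
    """Convert WAIC scores into Akaike weights.

    Lower WAIC is better; the weights sum to 1 and can be read as the
    relative support for each model within this candidate set.
    """
    waic = np.asarray(waic, dtype=float)
    dwaic = waic - waic.min()          # dWAIC_i = WAIC_i - WAIC_min
    w = np.exp(-0.5 * dwaic)           # note the minus sign in the exponent
    return w / w.sum()

# Hypothetical WAIC values for three candidate models
weights = akaike_weights([200.0, 202.0, 210.0])
```

Here the first model has dWAIC = 0, so its unnormalized weight is $\exp(0) = 1$; the second, with dWAIC = 2, gets $\exp(-1)$, i.e. about $e \approx 2.72$ times less support.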

## Leave-one-out cross-validation (LOO-CV)

LOO-CV is the newest of these approaches; as of around 2020 it was widely recommended as the default choice (for which situations exactly?).
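The idea can be shown with a brute-force sketch on the least-squares version of the polynomial example: refit the model once per held-out point and average the out-of-sample errors. (A Bayesian treatment would instead score held-out log predictive density, usually via the PSIS-LOO approximation rather than literal refitting; this squared-error version is just an assumption-light illustration.)

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
x = rng.uniform(-2, 2, n)
y = 1 + 0.1 * x - 0.2 * x**2 + rng.normal(0, 0.5, n)

def loo_mse(degree):
    """Brute-force leave-one-out CV: refit once per held-out point."""
    errs = []
    for i in range(n):
        mask = np.arange(n) != i
        coef = np.polyfit(x[mask], y[mask], degree)   # fit without point i
        pred = np.polyval(coef, x[i])                 # predict the held-out point
        errs.append((y[i] - pred) ** 2)
    return np.mean(errs)

scores = {d: loo_mse(d) for d in range(4)}
```

As with AIC, the model with the smallest LOO score is preferred; brute-force LOO costs one refit per data point, which is why the importance-sampling approximation matters for expensive Bayesian models.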