
Ensemble learning

Bayesian methods

Ensemble learning, of which Bayesian model averaging is one special case, is the art of fitting multiple models and then using all of them to form a better prediction. It's not as dumb as just taking the average prediction.

Evaluating the prediction of an ensemble typically requires more computation than evaluating the prediction of a single model. In one sense, ensemble learning can be thought of as a way to compensate for poor learning algorithms by performing a lot of extra computation. The alternative is to do a lot more learning on one non-ensemble system. For the same increase in compute, storage, or communication resources, spreading the increase across two or more methods can improve overall accuracy more than spending it all on a single method.

Empirically, ensembles tend to yield better results when there is a significant diversity among the models.[5][6] Many ensemble methods, therefore, seek to promote diversity among the models they combine.[7][8] Although perhaps non-intuitive, more random algorithms (like random decision trees) can be used to produce a stronger ensemble than very deliberate algorithms (like entropy-reducing decision trees).[9] Using a variety of strong learning algorithms, however, has been shown to be more effective than using techniques that attempt to dumb down the models in order to promote diversity.[10]
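
A minimal sketch of the "fit several diverse models and combine them" idea, assuming scikit-learn is available; the toy data, the three model choices, and the probability-averaging (soft-voting) combination rule are illustrative, not from the text above.

    # Toy soft-voting ensemble: fit a few diverse models and average
    # their predicted class probabilities, then take the argmax class.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.naive_bayes import GaussianNB
    from sklearn.metrics import accuracy_score

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    models = [LogisticRegression(max_iter=1000),
              DecisionTreeClassifier(max_depth=3),
              GaussianNB()]
    for m in models:
        m.fit(X_train, y_train)

    # Average the class probabilities across models (one simple combination rule).
    avg_proba = np.mean([m.predict_proba(X_test) for m in models], axis=0)
    ensemble_pred = avg_proba.argmax(axis=1)

    for m in models:
        print(type(m).__name__, accuracy_score(y_test, m.predict(X_test)))
    print("ensemble", accuracy_score(y_test, ensemble_pred))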

Created (3 years ago)

Log-posterior

Bayesian methods, #statistics

What's a log-posterior (the lp__ output from Stan)? It's the log of the (unnormalized) posterior density: every variable in the model piles its own term onto it, and the higher it is for a given draw, the better those parameter values agree with the data and the priors.

Break it down. First, it's similar to the frequentist log-likelihood (Likelihood function), and the log transformation is a convenience thing:

The log likelihood tells you nothing you can't get from the likelihood, but if observations are independent, it is additive. That's often an advantage when you want to differentiate to find a maximum. The reason for logging a posterior is that it is derived partly from the likelihood, which, as mentioned above, is additive.
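
Written out (a standard identity, not stated in the quote above): Bayes' theorem splits the log-posterior into log-likelihood plus log-prior minus a constant, and with independent observations the log-likelihood is itself a sum over data points:

    \log p(\theta \mid y) = \log p(y \mid \theta) + \log p(\theta) - \log p(y),
    \qquad
    \log p(y \mid \theta) = \sum_{i=1}^{n} \log p(y_i \mid \theta)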

When one compares two models in a Bayesian setup, one can take the ratio of the posteriors. This can be interpreted as the odds for one of the models over the other. If we take the log of the ratio, we get the difference of the log-posteriors. Thus, the log-posterior can be used in model comparison.
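
In symbols, for two models M1 and M2 (again a standard identity, added here for concreteness):

    \log \frac{p(M_1 \mid y)}{p(M_2 \mid y)} \;=\; \log p(M_1 \mid y) \,-\, \log p(M_2 \mid y)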

Compared to frequentist methods, this is not unlike the likelihood ratio test for model comparison. The advantages in the Bayesian setting are twofold: 1) our models do not have to be nested, as they do in the likelihood ratio test; 2) the distribution of the likelihood ratio test statistic is only known asymptotically, whereas the difference in the log-posteriors gives us a distribution for whatever sample size we have.
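
As a concrete illustration of the kind of quantity Stan accumulates in lp__, here is a hand-rolled unnormalized log-posterior for a made-up toy model, y_i ~ Normal(mu, 1) with prior mu ~ Normal(0, 5); the data and numbers are purely illustrative.

    # Each observation and the prior each add one term to the total;
    # higher values mean parameter values that fit the data and prior better.
    import numpy as np
    from scipy import stats

    y = np.array([1.2, 0.4, 2.1, 1.7, 0.9])   # illustrative data

    def log_posterior(mu):
        log_lik = stats.norm.logpdf(y, loc=mu, scale=1.0).sum()   # additive over observations
        log_prior = stats.norm.logpdf(mu, loc=0.0, scale=5.0)
        return log_lik + log_prior

    print(log_posterior(1.0))   # close to the data mean: relatively high
    print(log_posterior(5.0))   # far from the data: noticeably lower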

Created (3 years ago)

Bayes classifier

Bayesian methods

A machine learning thing. Related to Naive Bayes classifier.

It's one way to approach a classification problem.
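
The textbook definition (not spelled out in the original note): assign each input to the class with the highest posterior probability,

    \hat{y}(x) \;=\; \operatorname*{arg\,max}_{k} P(Y = k \mid X = x)
               \;=\; \operatorname*{arg\,max}_{k} P(Y = k)\, p(x \mid Y = k),

where the second equality uses Bayes' theorem and drops the denominator p(x), which doesn't depend on k.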

BTW, terminology varies. The term classifier can refer to:

  1. a mathematical function that maps input data to a category
  2. an algorithm that implements classification, esp. in a concrete implementation

In stats, classification is often done via logistic regression; there the inputs are called explanatory or independent variables, and the possible categories are termed outcomes. In ML, observations are often known as instances, the explanatory variables as features, and the possible categories as classes.

Created (3 years ago)