Introduction to Boosting: A (small) introduction to Boosting

A (small) introduction to Boosting
https://codesachin.wordpress.com/tag/adaboost/

This is a translation of the blog post linked above, which gives a very good introduction to boosting.

What is Boosting?
Boosting is a machine learning meta-algorithm that aims to iteratively build an ensemble of weak learners, in an attempt to generate a strong overall model.
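The terms in this definition are explained one by one below.
1) Weak learners
A ‘weak learner’ is any ML algorithm (for regression/classification) that provides an accuracy slightly better than random guessing. For a balanced binary classification problem, random guessing has an accuracy of 50%, so any algorithm whose accuracy exceeds 50% counts as a weak learner.
Commonly used weak learners are Decision Stumps or smaller Decision Trees.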
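
To make the idea of a weak learner concrete, here is a minimal sketch of a decision stump (a single threshold rule) in plain Python. The function names and the toy data are assumptions made for illustration; they do not come from the original post.

```python
def fit_stump(xs, ys):
    """Fit a one-split decision stump on 1-D inputs with labels in {-1, +1}.

    Returns (threshold, polarity). The stump predicts `polarity` when
    x > threshold and `-polarity` otherwise.
    """
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (+1, -1):
            errors = sum(
                1 for x, y in zip(xs, ys)
                if (polarity if x > threshold else -polarity) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    return best[1], best[2]


def stump_predict(stump, x):
    threshold, polarity = stump
    return polarity if x > threshold else -polarity


def accuracy(stump, xs, ys):
    hits = sum(1 for x, y in zip(xs, ys) if stump_predict(stump, x) == y)
    return hits / len(xs)


# Toy data (hypothetical): the label is +1 only inside the interval [30, 70),
# so no single threshold can separate the two classes perfectly.
xs = list(range(100))
ys = [1 if 30 <= x < 70 else -1 for x in xs]

stump = fit_stump(xs, ys)
stump_accuracy = accuracy(stump, xs, ys)
print(f"stump accuracy: {stump_accuracy:.2f}")  # prints "stump accuracy: 0.70"
```

On this data the best single split is right on 70% of the points: better than guessing, but far from a strong model, which is exactly the kind of learner Boosting starts from.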

2) Ensemble
The overall model built by Boosting is a weighted sum of all of the weak learners. The weights and training given to each ensure that the overall model yields a pretty high accuracy (sometimes state-of-the-art).
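
The weighted sum can be sketched in a few lines of Python. The three learners and their weights below are hypothetical and picked by hand; in Boosting they would come out of training.

```python
def ensemble_predict(learners, weights, x):
    """Weighted sum of the individual learners' outputs at input x."""
    return sum(w * h(x) for h, w in zip(learners, weights))


# Three hypothetical weak classifiers on a 1-D input, each returning -1 or +1.
learners = [
    lambda x: 1 if x > 2 else -1,   # threshold rule at 2
    lambda x: -1 if x > 8 else 1,   # threshold rule at 8, opposite direction
    lambda x: -1,                   # constant rule (a stump whose split is out of range)
]
weights = [1.0, 1.0, 0.5]           # hand-picked for illustration

scores = {x: ensemble_predict(learners, weights, x) for x in (1, 4, 9)}
labels = {x: (1 if s >= 0 else -1) for x, s in scores.items()}
print(scores)  # {1: -0.5, 4: 1.5, 9: -0.5}
print(labels)  # {1: -1, 4: 1, 9: -1}
```

The sign of the weighted sum is positive only between the two thresholds, a decision region that none of the three rules can express alone.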

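3) Iteratively build
Many ensemble methods, such as bagging and random forests, can train their weak learners independently and in parallel, because the learners do not depend on one another. Boosting is different. At each step, Boosting evaluates the shortcomings of the model built so far, generates a weak learner to address those shortcomings, and adds that learner to the overall model. The training process is therefore sequential.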

4) Meta-algorithm
Since Boosting isn’t necessarily an ML algorithm by itself, but rather uses other (basic) algorithms to build a stronger one, it is said to be a ‘meta’ algorithm.

How does Boosting work?

A regression algorithm built on the Boosting framework typically works as follows:
Each of the iterations in Boosting essentially tries to ‘improve’ the current model by introducing another weak learner into the ensemble; the new learner is mainly responsible for the samples that the current model handles poorly. Having such an ensemble not only reduces the bias (which is generally pretty high for weak learners), but can also reduce the variance (since multiple learners contribute to the overall output, each with their own unique training).
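
The iterative loop described here can be sketched for regression as follows. This is a minimal illustration under stated assumptions, not the exact procedure from the original post: the weak learner is a one-split regression stump, the "shortcomings" of the current model are measured as residuals, and every learner gets the same fixed weight of 0.5. All function names are hypothetical.

```python
import math


def fit_regression_stump(xs, targets):
    """Best single-split regressor under squared error: one constant per side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def boost(xs, ys, n_rounds=20, weight=0.5):
    base = sum(ys) / len(ys)            # initial model: predict the mean
    learners = []

    def model(x):                        # current ensemble: weighted sum of learners
        return base + weight * sum(h(x) for h in learners)

    history = [mse(model, xs, ys)]
    for _ in range(n_rounds):
        # Where does the current model fall short? Here: its residuals.
        residuals = [y - model(x) for x, y in zip(xs, ys)]
        # Train a weak learner on those shortcomings and add it to the ensemble.
        learners.append(fit_regression_stump(xs, residuals))
        history.append(mse(model, xs, ys))
    return model, history


# Toy data (hypothetical): samples of sin(x) on [0, 4.9].
xs = [i / 10 for i in range(50)]
ys = [math.sin(x) for x in xs]

model, history = boost(xs, ys)
print(f"MSE before boosting: {history[0]:.4f}, after: {history[-1]:.4f}")
```

Each round can only be fitted after the previous one has been added, which is why this loop cannot be parallelized the way bagging can.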

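There are many variants of Boosting; they differ mainly in the details of the algorithm steps above.
For example, Gradient Boosting computes the gradient of the Loss function with respect to the current model's output at each data point, and then trains a new weak learner to predict this gradient (in practice the negative gradient, so that adding the learner moves the model in the direction that lowers the loss). The weight of this weak learner is then optimized so as to minimize the total Loss value.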
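
The three steps of Gradient Boosting (compute the gradient, fit a weak learner to it, optimize the weight) can be sketched as below. The choices here are assumptions made for illustration: an absolute-error loss (so that the gradient is visibly different from the plain residual), a one-split regression stump as the weak learner, and a coarse grid search standing in for a proper line search. The function names are hypothetical.

```python
import math


def fit_stump(xs, targets):
    """Least-squares regression stump: one constant on each side of a split."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so both sides are non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, lm, rm)
    _, threshold, lm, rm = best
    return lambda x: lm if x <= threshold else rm


def total_loss(ys, preds):
    """Absolute-error loss summed over all data points."""
    return sum(abs(y - p) for y, p in zip(ys, preds))


def loss_gradient(y, pred):
    """Derivative of |y - pred| w.r.t. pred: -1 if pred < y, +1 if pred > y, else 0."""
    return (pred > y) - (pred < y)


def gradient_boost(xs, ys, n_rounds=30):
    preds = [0.0] * len(xs)                              # start from the zero model
    ensemble = []                                        # (weight, learner) pairs
    losses = [total_loss(ys, preds)]
    candidate_weights = [k * 0.05 for k in range(41)]    # 0.0 .. 2.0, includes 0
    for _ in range(n_rounds):
        # 1. Gradient of the loss at each data point for the current model.
        grads = [loss_gradient(y, p) for y, p in zip(ys, preds)]
        # 2. Weak learner trained to predict the negative gradient.
        h = fit_stump(xs, [-g for g in grads])
        h_vals = [h(x) for x in xs]
        # 3. Weight chosen to minimize the total loss after adding the learner.
        w = min(candidate_weights,
                key=lambda c: total_loss(ys, [p + c * hv for p, hv in zip(preds, h_vals)]))
        ensemble.append((w, h))
        preds = [p + w * hv for p, hv in zip(preds, h_vals)]
        losses.append(total_loss(ys, preds))
    return ensemble, losses


def predict(ensemble, x):
    return sum(w * h(x) for w, h in ensemble)


# Toy data (hypothetical): samples of sin(x) on [0, 4.9].
xs = [i / 10 for i in range(50)]
ys = [math.sin(x) for x in xs]

ensemble, losses = gradient_boost(xs, ys)
print(f"total loss before: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

Because a weight of 0 is always among the candidates, a round can never increase the total loss in this sketch; a production implementation would normally use a real line search and a learning rate instead of the grid.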
