Machine Learning Series 2: Several Common Boosting Methods


Boosting

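A single model is often not stable enough, or its results are not good enough, so models are combined into an ensemble, that is, several models are used together to make predictions. Ensemble methods fall into two main families, bagging and boosting. This article covers boosting.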

Boosting can be viewed as a linear combination of many models:

$F_m(X) = a_0 f_0(X) + a_1 f_1(X) + \dots + a_m f_m(X)$

It is a stage-wise optimization algorithm:

Learn $F_0$ first, then $F_1$, $F_2$, …

Each iteration emphasizes the errors left by the previous one:

$L(F_m(X), Y) < L(F_{m-1}(X), Y)$, where $L$ is the loss function.

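AdaBoost

Emphasizes errors by changing the distribution of the samples. Each sample is reweighted according to its error: samples with larger errors receive larger weights in the next round.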
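
The reweighting step can be made concrete with a minimal sketch. It assumes the classic discrete AdaBoost update for binary labels in {-1, +1}, which the text above does not spell out; the function name `adaboost_reweight` and the toy data are hypothetical.

```python
import math

def adaboost_reweight(weights, y_true, y_pred):
    """One round of discrete AdaBoost sample reweighting (illustrative sketch).

    weights: current sample weights, assumed to sum to 1
    y_true, y_pred: true labels and weak-learner predictions in {-1, +1}
    Returns (alpha, new_weights).
    """
    # Weighted error rate of the current weak learner.
    eps = sum(w for w, y, h in zip(weights, y_true, y_pred) if y != h)
    if not 0.0 < eps < 0.5:
        raise ValueError("sketch assumes a weak learner with 0 < error < 0.5")
    # Coefficient of this learner in the final linear combination.
    alpha = 0.5 * math.log((1.0 - eps) / eps)
    # Misclassified samples (y * h == -1) are scaled up, correct ones down.
    scaled = [w * math.exp(-alpha * y * h)
              for w, y, h in zip(weights, y_true, y_pred)]
    total = sum(scaled)
    # Renormalize so the weights form a distribution again.
    return alpha, [w / total for w in scaled]

y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]   # only the last sample is misclassified
weights = [0.2] * 5
alpha, new_weights = adaboost_reweight(weights, y_true, y_pred)
print(alpha, new_weights)
```

In this toy case the single misclassified sample ends up with about half of the total weight, so the next weak learner is pushed to get it right.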

Gradient Boosting

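Emphasizes errors by changing the training target. The new label is the residual of the previous round's prediction (more generally, the negative gradient of the loss).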
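
A short worked case shows why the residual is the natural new target. It assumes squared loss with a factor of 1/2, a convenience chosen here rather than taken from the original text:

$$
l(F_{m-1}(x_i), y_i) = \tfrac{1}{2}\bigl(y_i - F_{m-1}(x_i)\bigr)^2
\quad\Longrightarrow\quad
-\frac{\partial\, l(F_{m-1}(x_i), y_i)}{\partial F_{m-1}(x_i)} = y_i - F_{m-1}(x_i)
$$

Under this loss the negative gradient at each sample is exactly the residual. For other loss functions the new target is still the negative gradient, but it no longer equals the plain residual.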

Gradient Boosting

Basic Function:

$F(X) = \sum_{m=0}^{M} f_m(X)$

$f_m(X)$ is the base learner; GBDT uses decision trees as base learners.

How to learn?

·Greedy way:

·$F_m(X) = F_{m-1}(X) + f_m(X)$

·Choose $f_m$ so that $L(F_m(X), Y) < L(F_{m-1}(X), Y)$

·Gradient descent

·Get the negative gradient first

·$\hat{y}_i = -\dfrac{\partial\, l(F_{m-1}(x_i), y_i)}{\partial F_{m-1}(x_i)}$

·Learn $f_m(X)$ to fit $\hat{Y}$ using L2 loss

·$f_m(X) = \arg\min_{f(X)} \sum_{i=1}^{n} \bigl(f(x_i) - \hat{y}_i\bigr)^2$
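
The greedy and gradient-descent steps above can be put together in a minimal sketch. It assumes squared loss, one-dimensional inputs, and depth-1 regression trees (stumps) as base learners. The `learning_rate` shrinkage factor is an addition not mentioned in the text, and the names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(xs, targets):
    """Least-squares regression stump on 1-D inputs: pick the best threshold."""
    best = None
    # Skip the largest value so the right-hand side is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    f0 = sum(ys) / len(ys)            # F_0: constant model
    learners = []
    preds = [f0] * len(xs)
    losses = []
    for _ in range(n_rounds):
        # Negative gradient of 1/2 * (y - F)^2 with respect to F: the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        f_m = fit_stump(xs, residuals)    # fit f_m to the targets with L2 loss
        learners.append(f_m)
        # F_m = F_{m-1} + learning_rate * f_m (shrinkage is an assumption here).
        preds = [p + learning_rate * f_m(x) for p, x in zip(preds, xs)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

    def model(x):
        return f0 + learning_rate * sum(f(x) for f in learners)

    return model, losses

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [x * x for x in xs]
model, losses = gradient_boost(xs, ys)
print(losses[0], losses[-1])
```

The recorded training loss never increases from one round to the next, which is the stage-wise condition $L(F_m(X), Y) < L(F_{m-1}(X), Y)$ in practice. Real GBDT implementations use deeper trees and multi-dimensional features, but the loop has the same shape.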

GBDT

GBDT = Gradient Boosting + Decision Tree

Supported Tasks: Regression, Classification, Ranking

LightGBM

LightGBM is a decision-tree-based machine learning framework open-sourced by Microsoft; the paper describing it was published in 2017. It is a gradient boosting framework, designed to be distributed and efficient, with the following advantages:

~Fast training speed and high efficiency

~Lower memory usage

~Better accuracy

~Parallel learning supported

~Capable of handling large-scale data

~Supports categorical features directly
