Statistical Learning Methods Study Notes (1): Classification and Overview of Statistical Learning Methods, and the Three Elements (Model, Strategy, Algorithm)

Classification of Statistical Learning

Basic Classification

Supervised learning, unsupervised learning, and reinforcement learning; sometimes semi-supervised learning and active learning are also included.
---
Supervised Learning
From an existing dataset we know the relationship between inputs and outputs. Based on this known relationship, we train an optimal model.
Put simply, supervised learning can be understood as us teaching the machine how to do things.

Supervised learning consists of two processes, learning and prediction, carried out by a learning system and a prediction system.
Types of supervised learning problems: regression and classification.

Examples of commonly used algorithms: linear regression, logistic regression, k-nearest neighbors, decision trees, support vector machines, naive Bayes, and neural networks.
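
As a minimal illustration of the supervised setting (a sketch, assuming NumPy is available; the labeled data below is synthetic, not from the book), the learning system fits a least-squares linear model from (x, y) pairs and the prediction system applies it to new inputs:

```python
import numpy as np

# Synthetic labeled data: inputs X and known outputs y (the "supervision").
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 1))
y = 3.0 * X[:, 0] + 0.5 + rng.normal(scale=0.1, size=100)

# Learning system: fit a linear decision function f(x) = w*x + b by least squares.
A = np.hstack([X, np.ones((X.shape[0], 1))])       # add a bias column
w, b = np.linalg.lstsq(A, y, rcond=None)[0]

# Prediction system: apply the learned model to new inputs.
x_new = np.array([0.2, 0.7])
print(w * x_new + b)                                # predictions close to 3*x + 0.5
```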

---

Unsupervised Learning
We do not know in advance the relationships among the data and features in the dataset; instead, these relationships are discovered through clustering or some model.
Compared with supervised learning, unsupervised learning is more like self-study: the machine learns to do things on its own, without labels.
In unsupervised learning, there is no predefined target variable like the one in classification or regression.
The question to answer is: "What can be discovered from the data X?"

During learning, the learning system learns from the training dataset and obtains an optimal model, represented as a function z = g(x), a conditional probability distribution P(z|x), or a conditional probability distribution P(x|z). During prediction, for a given input x_{N+1}, the prediction system produces the corresponding output z_{N+1} = g(x_{N+1}) or z_{N+1} = arg max_z P(z|x_{N+1}) for clustering or dimensionality reduction; alternatively, the model P(x|z) gives the probability P(x_{N+1}|z_{N+1}) for probability estimation.

Examples of commonly used algorithms: k-means clustering, hierarchical clustering, principal component analysis (PCA), and latent semantic analysis.
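
A minimal sketch of the unsupervised setting (synthetic, unlabeled data; a hand-rolled k-means rather than any library implementation): clustering discovers an assignment z = g(x) without ever seeing labels.

```python
import numpy as np

# Unlabeled data: two blobs, but no labels are given to the algorithm.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(3.0, 0.3, size=(50, 2))])

# A tiny k-means: alternate between assigning points to the nearest center
# (z = g(x)) and re-estimating the centers from the assignments.
k = 2
centers = X[rng.choice(len(X), size=k, replace=False)]
for _ in range(10):
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    z = dists.argmin(axis=1)                  # cluster assignment for each x
    centers = np.array([X[z == j].mean(axis=0) for j in range(k)])

print(centers)   # two centers near (0, 0) and (3, 3)
```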

---

Reinforcement Learning
Reinforcement learning is the machine learning problem in which an intelligent system learns an optimal behavior policy through continuous interaction with its environment. The interaction between the system and the environment is assumed to follow a Markov decision process; what the system observes is the sequence of data obtained from this interaction. The essence of reinforcement learning is learning optimal sequential decisions.

There are no supervision labels. The environment only rewards or penalizes (scores) the current state; the algorithm itself does not know in advance which action is best. It acts first: if the direction is right it keeps going, and if it is wrong it starts over.
Knowledge is acquired by continuously learning from the system's own past experience, so no large amount of labeled data is needed; a reward mechanism that evaluates how good or bad a behavior is provides the feedback, and through this feedback reinforcement learning "learns" by itself. (If the current behavior is "good", the system does more of it in the future; if it is "bad", the system tries to avoid it. In other words, no label is given directly; the system draws its own conclusions from experience.)

During reinforcement learning, the system keeps performing trial and error in order to learn the optimal policy.
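
As a minimal, self-contained illustration of trial and error with only reward feedback (a two-armed bandit sketch, not from the book; the arm reward probabilities are made up), an epsilon-greedy agent gradually prefers the better action:

```python
import numpy as np

rng = np.random.default_rng(0)
true_reward_prob = [0.3, 0.8]        # hidden from the agent; arm 1 is better
q = np.zeros(2)                      # estimated value of each action
counts = np.zeros(2)
epsilon = 0.1                        # exploration rate

for t in range(1000):
    # Trial: explore with probability epsilon, otherwise exploit the best estimate.
    a = rng.integers(2) if rng.random() < epsilon else int(q.argmax())
    # Feedback: the environment returns only a reward, never a label.
    r = float(rng.random() < true_reward_prob[a])
    counts[a] += 1
    q[a] += (r - q[a]) / counts[a]   # incremental average of observed rewards

print(q)   # estimates approach [0.3, 0.8]; the agent mostly picks arm 1
```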
The Markov decision process of reinforcement learning is a stochastic process over sequences of states, rewards, and actions, defined by the five-tuple <S, A, P, r, γ>:
1. S, finite state set: the set of states; s denotes a particular state.
2. A, finite action set: the set of actions; a denotes a particular action.
3. P, transition probability: predicts the next state s' from the current state s and action a; P(s'|s, a) is the probability of transitioning to s' after taking action a in state s.
4. r, reward function: the reward received after taking an action; it also appears as R(s, a, s'), R(s), etc., and the meaning differs slightly depending on the form (the reward is not always immediate).
5. γ, discount (decay) factor: a coefficient with 0 ≤ γ ≤ 1 that discounts future rewards when accumulating the long-term return.

A policy π is defined as a function a = f(s) giving the action in a given state, or as a conditional probability distribution P(a|s). Given a policy π, the behavior of the intelligent system interacting with the environment is determined (either deterministically or stochastically).

The value function, or state value function, is defined as the expected long-term cumulative reward obtained by following policy π starting from a state s:

v_π(s) = E_π[ R_{t+1} + γR_{t+2} + γ²R_{t+3} + ⋯ | s_t = s ]

The action value function is defined as the expected long-term cumulative reward obtained by following policy π starting from a state s and an action a:

q_π(s, a) = E_π[ R_{t+1} + γR_{t+2} + γ²R_{t+3} + ⋯ | s_t = s, a_t = a ]
There are two families of reinforcement learning methods: model-free methods and model-based methods.
Model-free methods: these can be policy-based or value-based. Learning usually starts from a concrete policy (an action function or a conditional probability distribution) and proceeds by searching for better policies.
Model-based methods: these learn the model of the Markov decision process directly, including the transition probability function P(s'|s, a) and the reward function r(s, a). The model can then be used to predict the environment's feedback and to find the policy π* that maximizes the value function.
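
A hedged sketch of the model-based view (a made-up two-state, two-action MDP; the transition probabilities and rewards below are arbitrary): value iteration uses the known P(s'|s, a) and r(s, a) to compute the optimal value function and a greedy policy.

```python
import numpy as np

# Toy MDP: 2 states, 2 actions. P[s, a, s'] is the transition probability,
# R[s, a] the expected immediate reward; both are assumed known (model-based).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.8, 0.2], [0.1, 0.9]]])
R = np.array([[0.0, 1.0],
              [0.5, 2.0]])
gamma = 0.9

# Value iteration: repeatedly apply the Bellman optimality update
# V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s'|s, a) * V(s') ].
V = np.zeros(2)
for _ in range(200):
    Q = R + gamma * P @ V          # Q[s, a]
    V = Q.max(axis=1)

policy = Q.argmax(axis=1)          # greedy policy pi*(s)
print(V, policy)
```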
---

Classification by Model

1. Probabilistic Models and Non-probabilistic Models
In supervised learning, probabilistic models are generative models, and non-probabilistic models are discriminative models.
A probabilistic model takes the form of a conditional probability distribution, while a non-probabilistic model takes the form of a function. The two forms can be converted into each other: maximizing a conditional probability distribution yields a function, and normalizing a function yields a conditional probability distribution (see the sketch after the examples below). The distinction between probabilistic and non-probabilistic models lies not in the mapping between input and output, but in the internal structure of the model.
Probabilistic models: decision trees, naive Bayes, hidden Markov models, conditional random fields, probabilistic latent semantic analysis, latent Dirichlet allocation, Gaussian mixture models.
Non-probabilistic models: the perceptron, support vector machines, k-nearest neighbors, AdaBoost, k-means, latent semantic analysis, and neural networks.
Logistic regression can be viewed either as a probabilistic model or as a non-probabilistic model.
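
A minimal sketch of that conversion (the class scores and probabilities below are made-up numbers): taking the argmax of a conditional distribution gives a decision function, and applying a softmax normalization to raw scores gives back a distribution.

```python
import numpy as np

# A conditional distribution P(y|x) over 3 classes for one input x (made-up numbers).
p_y_given_x = np.array([0.2, 0.7, 0.1])
decision = p_y_given_x.argmax()            # "maximize the distribution" -> a function f(x) = y

# Raw scores from a non-probabilistic model (e.g. a linear function of x).
scores = np.array([1.0, 2.5, 0.3])
p_normalized = np.exp(scores) / np.exp(scores).sum()   # "normalize the function" -> a distribution

print(decision, p_normalized.round(3))
```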

2. Linear Models and Nonlinear Models
A linear model assumes the function to be learned is a linear function of the input; a model without this assumption is a nonlinear model.

3. Parametric Models and Non-parametric Models
A parametric model assumes the model parameters have a fixed dimension, so the model can be completely described by a finite number of parameters. A non-parametric model assumes the parameter dimension is not fixed (or is infinite) and grows as the amount of training data increases.
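
A small sketch of the contrast, using the usual textbook examples as assumptions (linear regression as a parametric model, 1-nearest-neighbor as a non-parametric one; synthetic data): the linear model always has 2 parameters, while the nearest-neighbor "model" is the stored training set itself and grows with it.

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (10, 1000):
    X = rng.uniform(-1, 1, size=(n, 1))
    y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=n)

    # Parametric: a linear model is fully described by (w, b), whatever n is.
    A = np.hstack([X, np.ones((n, 1))])
    w, b = np.linalg.lstsq(A, y, rcond=None)[0]

    # Non-parametric: 1-nearest-neighbor keeps every training point as its "parameters".
    knn_model = (X, y)

    print(n, "linear params:", 2, "| 1-NN stored values:", knn_model[0].size + knn_model[1].size)
```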
---

Classification by Algorithm

Online Learning and Batch Learning
Online learning is machine learning in which one sample is received at a time: the model makes a prediction, then learns from that sample, and the process repeats. Batch learning, by contrast, receives all the data at once, learns the model, and then makes predictions. Some practical applications require learning to be online.

Online learning is usually harder than batch learning, and it is difficult to learn a model with higher predictive accuracy, because only limited data is available at each model update.
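
A minimal sketch of the two modes on the same synthetic data: batch learning solves for the parameters from the full dataset at once, while online learning updates the parameters with a stochastic gradient step after each single sample.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(scale=0.1, size=200)

# Batch learning: see all data at once, solve least squares in one shot.
A = np.hstack([X, np.ones((200, 1))])
w_batch = np.linalg.lstsq(A, y, rcond=None)[0]

# Online learning: receive one sample at a time, predict, then update (SGD).
w_online = np.zeros(2)
lr = 0.1
for x_i, y_i in zip(A, y):
    y_hat = w_online @ x_i                  # predict first
    w_online += lr * (y_i - y_hat) * x_i    # then learn from this one sample

print(w_batch, w_online)   # both approach (2.0, 1.0)
```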
---

Classification by Technique

1. Bayesian Learning
Bayesian learning uses Bayes' theorem to compute the posterior distribution of the model given the data and uses this posterior for model estimation and prediction.
2. Kernel Methods
To extend a linear model to a nonlinear one, the direct approach is to explicitly define a mapping from the input space (a low-dimensional space) to a feature space (a high-dimensional space) and compute inner products in the feature space; support vector machines, for example, turn a linearly inseparable problem in the input space into a linearly separable one in the feature space. The trick of kernel methods is not to define this mapping explicitly, but to define a kernel function directly on the input space, equal to the inner product in the feature space after the mapping. This simplifies the computation while achieving the same effect.
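
A minimal numerical check of the kernel trick (2-D inputs, with the polynomial kernel k(x, z) = (x · z)² chosen as an assumed example): the kernel computed directly in the input space matches the inner product under an explicit feature map into the higher-dimensional feature space.

```python
import numpy as np

def phi(x):
    # Explicit feature map into the higher-dimensional feature space.
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2) * x1 * x2, x2 * x2])

def kernel(x, z):
    # Kernel function defined directly on the input space: k(x, z) = (x . z)^2.
    return float(np.dot(x, z)) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Same value, but the kernel never constructs the feature vectors explicitly.
print(np.dot(phi(x), phi(z)), kernel(x, z))   # both equal (1*3 + 2*(-1))^2 = 1.0
```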
---

The Three Elements of Statistical Learning Methods

The hypothesis space of the model, the criterion for model selection, and the algorithm for learning the model are referred to, in short, as the model, the strategy, and the algorithm.

The steps for carrying out a statistical learning method are as follows:

(1) Obtain a finite training dataset;
(2) Determine the hypothesis space containing all possible models, i.e., the set of candidate models;
(3) Determine the criterion for model selection, i.e., the learning strategy;
(4) Implement the algorithm for finding the optimal model, i.e., the learning algorithm;
(5) Select the optimal model by the learning method;
(6) Use the learned optimal model to predict or analyze new data.
---

Model

In supervised learning, the model is the conditional probability distribution or decision function to be learned. The model's hypothesis space contains all possible conditional probability distributions or decision functions.
For example, if the decision function is assumed to be a linear function of the input variables, then the hypothesis space is the set of all such linear functions.
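
For instance, the hypothesis space of linear decision functions can be written as follows (a standard formulation; here w denotes the weight vector and b the bias):

```latex
\mathcal{F} = \{\, f \mid f(x) = w \cdot x + b,\; w \in \mathbf{R}^n,\; b \in \mathbf{R} \,\}
```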
---

Strategy

Loss Function and Risk Function
In supervised learning, the predicted value f(X) output by the model sometimes disagrees with the true value Y. A loss function is used to measure the degree of prediction error.

Several commonly used loss functions:

0-1 loss: L(Y, f(X)) = 1 if Y ≠ f(X), and 0 if Y = f(X)
Quadratic loss: L(Y, f(X)) = (Y − f(X))²
Absolute loss: L(Y, f(X)) = |Y − f(X)|
Logarithmic (log-likelihood) loss: L(Y, P(Y|X)) = −log P(Y|X)
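
A small sketch computing these losses for one prediction (the values below are made up), just to make the definitions concrete:

```python
import numpy as np

y_true, y_pred = 3.0, 2.5          # regression-style true value and prediction
p_of_true_class = 0.7              # model's probability P(Y|X) for the true class

zero_one = float(y_true != y_pred)           # 0-1 loss
quadratic = (y_true - y_pred) ** 2           # quadratic (squared) loss
absolute = abs(y_true - y_pred)              # absolute loss
log_loss = -np.log(p_of_true_class)          # logarithmic loss

print(zero_one, quadratic, absolute, round(log_loss, 3))
```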
Since the model's input and output (X, Y) are random variables following a joint distribution P(X, Y), the risk function (expected loss) is the expectation of the loss function (the joint distribution is generally unknown, so it cannot be computed directly):

R_exp(f) = E_P[L(Y, f(X))] = ∫ L(y, f(x)) P(x, y) dx dy

Given a training dataset T = {(x1, y1), (x2, y2), …, (xN, yN)}, the average loss of the model f(X) over the training dataset is called the empirical risk or empirical loss:

R_emp(f) = (1/N) Σ_{i=1}^{N} L(y_i, f(x_i))

The expected risk R_exp(f) is the model's expected loss with respect to the joint distribution, and the empirical risk R_emp(f) is the model's average loss over the training sample. By the law of large numbers, as the sample size N tends to infinity, the empirical risk R_emp(f) converges to the expected risk R_exp(f).
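
A small simulation of this convergence (a sketch with a fixed model f(x) = 2x, squared loss, and a known synthetic joint distribution, so the expected risk can be estimated from a very large sample): the empirical risk over N samples approaches the expected risk as N grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n):
    # Synthetic joint distribution P(X, Y): Y = 2X + Gaussian noise (scale 0.5).
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(scale=0.5, size=n)
    return x, y

f = lambda x: 2.0 * x                      # a fixed model
loss = lambda y, fx: (y - fx) ** 2         # squared loss

# Expected risk estimated with a very large sample (true value is 0.25 here).
x, y = sample(1_000_000)
print("expected risk ~", loss(y, f(x)).mean())

# Empirical risk for increasing training-set sizes N.
for n in (10, 100, 10_000):
    x, y = sample(n)
    print("N =", n, "empirical risk =", loss(y, f(x)).mean())
```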
---
Empirical Risk Minimization and Structural Risk Minimization
Once the hypothesis space, the loss function, and the training dataset are fixed, the empirical risk functional is determined. The strategy of empirical risk minimization holds that the model with the smallest empirical risk is the optimal model. Under this strategy, finding the optimal model by empirical risk minimization amounts to solving the optimization problem:

min_{f∈F} (1/N) Σ_{i=1}^{N} L(y_i, f(x_i))
When the sample size is small, however, empirical risk minimization may not work well and can produce **overfitting**.

Structural risk minimization is a strategy proposed to prevent overfitting; it is equivalent to regularization. The structural risk adds to the empirical risk a regularization term (penalty term) that represents model complexity.
The structural risk is defined as:

R_srm(f) = (1/N) Σ_{i=1}^{N} L(y_i, f(x_i)) + λ J(f)

Here J(f) denotes the complexity of the model: the more complex the model, the larger J(f), so this term penalizes complex models, and the coefficient λ ≥ 0 trades off the empirical risk against the model complexity.
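
A minimal sketch of structural risk minimization (using ridge regression as the standard example, with J(f) = ||w||² and synthetic data): increasing λ shrinks the fitted coefficients, i.e., penalizes more complex (larger-weight) models.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = X @ np.array([3.0, 0.0, 0.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=30)

# Structural risk for a linear model: (1/N) * sum of squared errors + lam * ||w||^2.
# Its minimizer has the closed form w = (X^T X + N * lam * I)^(-1) X^T y.
n, d = X.shape
for lam in (0.0, 0.1, 10.0):
    w = np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y)
    print("lambda =", lam, "||w|| =", round(float(np.linalg.norm(w)), 3))
```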

Algorithm

If the optimization problem has an explicit analytical solution, the algorithm is relatively simple.

Usually, however, no analytical solution exists, and numerical methods are needed.
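
A small sketch of the two cases on the same least-squares problem (synthetic data): the analytical solution via the normal equations versus a numerical solution via gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 2))
y = X @ np.array([1.5, -2.0]) + rng.normal(scale=0.1, size=100)

# Analytical algorithm: least squares has a closed form (normal equations).
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Numerical algorithm: gradient descent on the same empirical risk.
w = np.zeros(2)
lr = 0.1
for _ in range(2000):
    grad = 2.0 / len(y) * X.T @ (X @ w - y)
    w -= lr * grad

print(w_closed, w)   # the two solutions agree closely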

---
Differences among statistical learning methods stem mainly from differences in their models, strategies, and algorithms. Once the model, the strategy, and the algorithm are determined, the statistical learning method is determined.