Machine Learning Series: Coursera Week 2 - Linear Regression with Multiple Variables

目录

1. Multiple Features

1.1 Multiple features

1.2 Gradient descent for multiple variables

1.3 Gradient descent in practice I: Feature scaling

1.4 Gradient descent in practice II: Learning rate

1.5 Summary

1.6 Features and polynomial regression

2. Computing Parameters Analytically

2.1 Normal Equation

2.2 Normal Equation Noninvertibility


1. Multiple Features

1.1 Multiple features

Size (feet^2) x1 | Number of bedrooms x2 | Number of floors x3 | Age of home (years) x4 | Price ($1000) y
2104 | 5 | 1 | 45 | 460
1416 | 3 | 2 | 40 | 232
1534 | 3 | 2 | 30 | 315
852 | 2 | 1 | 36 | 178

Notation:

n = number of features

x^(i) = ith training example

x_j^(i) = value of feature j in the ith training example

E.g:

x^(2) = [1416; 3; 2; 40]

Hypothesis:

h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + ... + θ_n x_n

Vectorization: let x_0 = 1, so that x = [x_0; x_1; ...; x_n] and θ = [θ_0; θ_1; ...; θ_n]; then

h_θ(x) = θ^T x

This is called multivariate linear regression.
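As a quick sketch of the vectorization, computing h_θ(x) = θ^T x for all training examples at once is a single matrix-vector product (illustrative NumPy; the θ values below are made up for demonstration):

```python
import numpy as np

# Rows of X are training examples with x0 = 1 prepended, as above.
# Feature values are taken from the housing table.
X = np.array([[1.0, 2104, 5, 1, 45],
              [1.0, 1416, 3, 2, 40]])
theta = np.array([0.0, 0.2, 10, 5, -1])   # hypothetical parameters

# h_theta(x) = theta^T x for every example at once:
predictions = X @ theta
```

Each entry of `predictions` is the hypothesis evaluated on one row of X, with no explicit loop over features.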

 

1.2 Gradient descent for multiple variables

Hypothesis: h_θ(x) = θ^T x = θ_0 x_0 + θ_1 x_1 + ... + θ_n x_n

Parameters: θ = [θ_0; θ_1; ...; θ_n], an (n+1)-dimensional vector

Cost function:

J(θ) = (1/2m) · Σ_{i=1}^{m} (h_θ(x^(i)) - y^(i))^2

Gradient descent:

Repeat {
    θ_j := θ_j - α · ∂J(θ)/∂θ_j
}

simultaneously update for every j = 0, 1, ..., n

Substituting the partial derivative gives the concrete update:

θ_j := θ_j - α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) - y^(i)) · x_j^(i)
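The update rule above can be sketched in NumPy (a minimal illustration, not the course's reference implementation; the design matrix X is assumed to already carry the leading column of ones):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for linear regression.

    X: (m, n+1) design matrix with a leading column of ones (x0 = 1).
    y: (m,) targets. Returns the learned theta of shape (n+1,).
    """
    m = len(y)
    theta = np.zeros(X.shape[1])
    for _ in range(num_iters):
        errors = X @ theta - y            # h_theta(x^(i)) - y^(i) for all i
        gradient = (X.T @ errors) / m     # (1/m) * sum of errors * x_j^(i)
        theta -= alpha * gradient         # update all theta_j at once
    return theta
```

Because every θ_j is computed from the same `errors` vector before any component is overwritten, the update is simultaneous, as required.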

 

1.3 Gradient descent in practice I: Feature scaling

Feature scaling:

Idea: make sure features are on a similar scale, so that gradient descent converges faster.

E.g. x1 = size (0~2000)

        x2 = number of bedrooms (1~5)

(Figure: with unscaled features the contours of J(θ) are tall, thin ellipses and gradient descent zig-zags; after scaling the contours are close to circles and descent takes a much more direct path. From Coursera Machine Learning, Week 2, "Gradient descent in practice I: Feature scaling".)

Feature scaling here means dividing each feature by its range:

x1 = size (feet^2) / 2000
x2 = number of bedrooms / 5

so that 0 ≤ x1 ≤ 1 and 0 ≤ x2 ≤ 1.

More generally:

feature scaling:

get every feature into approximately a

-1 ≤ x_i ≤ 1 range.

x0 = 1: fine.

0 ≤ x1 ≤ 3: very close, fine.

-100 ≤ x2 ≤ 100: needs scaling.

-0.0001 ≤ x3 ≤ 0.0001: needs scaling.

In general, a feature ranging within roughly

-3 to 3, or

-1/3 to 1/3, is acceptable.

Another kind of scaling is mean normalization:

x_i := (x_i - μ_i) / s_i

where μ_i is the mean of feature i over the training set and s_i is its range (max - min) or its standard deviation.
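A minimal sketch of mean normalization, assuming s_i is taken as the range (the helper name is my own):

```python
import numpy as np

def mean_normalize(X):
    """x_j := (x_j - mu_j) / s_j per feature column, where mu_j is the
    training-set mean and s_j the range (max - min). Returns mu and s
    too, so new inputs can be scaled identically at prediction time."""
    mu = X.mean(axis=0)
    s = X.max(axis=0) - X.min(axis=0)   # or X.std(axis=0)
    return (X - mu) / s, mu, s

# Illustrative: the size and bedroom columns from the housing table.
X = np.array([[2104.0, 5], [1416, 3], [1534, 3], [852, 2]])
X_scaled, mu, s = mean_normalize(X)
```

After scaling, every column has mean zero and values within roughly [-0.5, 0.5], comfortably inside the -1 to 1 guideline above.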

 

1.4 Gradient descent in practice II: Learning rate

Gradient descent:

θ_j := θ_j - α · ∂J(θ)/∂θ_j

- "Debugging": how to make sure gradient descent is working correctly

- How to choose the learning rate α

 

Making sure gradient descent is working correctly:

Plot J(θ) against the number of iterations.

(Figure: J(θ) decreasing with the number of iterations and flattening out as gradient descent converges. From Coursera Machine Learning, Week 2, "Gradient descent in practice II: Learning rate".)

J(θ) should decrease after every iteration.

The plot also shows whether gradient descent has converged.

Example automatic convergence test:

Declare convergence if J(θ) decreases by less than 10^(-3) in one iteration.

In practice it is better to judge from the plot, because the threshold is hard to choose.

If the convergence plot instead shows J(θ) rising, or repeatedly going down and back up:

(Figure: J(θ) increasing, or oscillating, with the number of iterations: a sign that α is too large and a smaller α should be used. From Coursera Machine Learning, Week 2, "Gradient descent in practice II: Learning rate".)

- For sufficiently small α, J(θ) should decrease on every iteration (this holds true for linear regression).

- But if α is too small, gradient descent can be slow to converge.
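One way to sketch this debugging idea: record J(θ) at every iteration, then plot the returned history against the iteration number (illustrative NumPy; the function name is my own):

```python
import numpy as np

def gradient_descent_with_history(X, y, alpha, num_iters):
    """Gradient descent that also records J(theta) per iteration,
    so J can be plotted against the iteration number for debugging."""
    m = len(y)
    theta = np.zeros(X.shape[1])
    J_history = []
    for _ in range(num_iters):
        errors = X @ theta - y
        J_history.append((errors @ errors) / (2 * m))  # J(theta)
        theta -= alpha * (X.T @ errors) / m
    return theta, J_history
```

With a well-chosen α the recorded values decrease every iteration; an automatic test could stop once successive values differ by less than 10^(-3), though, as noted above, eyeballing the plot is usually more reliable.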

 

1.5 Summary

- If α is too small: slow convergence.

- If α is too large: J(θ) may not decrease on every iteration and may not converge (slow convergence is also possible).

To choose α, try a sequence of values roughly 3× apart, e.g.

..., 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, ...

plot J(θ) for each, and pick the largest α for which J(θ) still decreases rapidly (from Coursera Machine Learning, Week 2, "Gradient descent in practice II: Learning rate").

 

1.6 Features and polynomial regression

2. Computing Parameters Analytically

2.1 Normal Equation

A method to solve for θ analytically (i.e., in closed form).

E.g.

J(θ) = aθ^2 + bθ + c, θ ∈ R

Setting dJ/dθ = 2aθ + b = 0 gives the minimizer θ = -b/(2a), so min J = J(-b/(2a)).

(Figure: a one-dimensional quadratic J(θ) whose minimum sits where the derivative is zero; in the general case, set ∂J(θ)/∂θ_j = 0 for every j and solve for θ. From Coursera Machine Learning, Week 2: Normal Equation.)

 

Normal Equation:

θ = (X^T X)^(-1) X^T y

where, for m training examples and n features, X is the m × (n+1) design matrix (each row is one training example with x0 = 1 prepended) and y is the m-vector of targets.

Note: feature scaling is not needed when using the normal equation.
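A short NumPy sketch of the closed-form solve (illustrative; np.linalg.pinv is used rather than a plain inverse so the computation stays well behaved even when X^T X is noninvertible):

```python
import numpy as np

def normal_equation(X, y):
    """theta = (X^T X)^(-1) X^T y. pinv (the pseudoinverse) is used
    instead of inv, so a sensible theta comes back even when X^T X
    happens to be noninvertible."""
    return np.linalg.pinv(X.T @ X) @ X.T @ y

# Illustrative: one feature, exact line y = 2x, so theta -> [0, 2].
X = np.array([[1.0, 1], [1, 2], [1, 3]])
y = np.array([2.0, 4, 6])
theta = normal_equation(X, y)
```

No α and no iteration is needed, matching the comparison below.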

 

Gradient descent | Normal equation
Need to choose α | No need to choose α
Needs many iterations | No iterations needed
Works well even when n is large, O(n^2) | Must compute (X^T X)^(-1), O(n^3); slow if n is very large

As a rule of thumb, once n > 10000, start using gradient descent.

 

2.2 Normal Equation Noninvertibility

What if X^T X is noninvertible?

This happens when:

1. There are redundant features (linearly dependent).

E.g.

x1 = size in feet^2

x2 = size in m^2

Since 1 m = 3.28 feet, x1 = 3.28^2 · x2, so the two columns are linearly dependent.

2. There are too many features (e.g. m ≤ n).

In these cases X^T X is noninvertible; in fact, X^T X is invertible if and only if the columns of X are linearly independent.

The fix is to delete redundant features, drop some features when there are too many, or use regularization.
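A small sketch of cause 1: a linearly dependent column makes X^T X singular, yet the pseudoinverse still yields parameters whose predictions fit (toy data, NumPy assumed):

```python
import numpy as np

# x2 duplicates x1 in different units (1 m = 3.28 ft), so the columns
# of X are linearly dependent and X^T X is singular.
x1 = np.array([1.0, 2.0, 3.0, 4.0])                   # size in feet^2 (toy)
X = np.column_stack([np.ones(4), x1, x1 / 3.28**2])   # same size in m^2
y = 5 * x1                                            # target depends only on size

# A plain inverse of the singular X.T @ X would fail or be meaningless;
# the pseudoinverse returns the minimum-norm least-squares solution.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
```

The redundant feature leaves θ non-unique, but the fitted values X @ θ still reproduce y, which is why dropping the duplicate column (or regularizing) is the cleaner fix.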

That wraps up the Week 2 review. Section 1.6 (features and polynomial regression) will be covered in a dedicated later post. To learn more about the normal equation, see the MIT Linear Algebra lectures:

https://www.bilibili.com/video/av6951511/?p=16