Andrew Ng Machine Learning Notes, Chapter 2: Linear Regression with One Variable
Main Procedure
Hypothesis
A function hθ(x) that maps inputs x to predicted outputs.
Univariate linear regression: hθ(x) = θ₀ + θ₁x
Cost function
Idea
Choose θ₀, θ₁ so that hθ(x) is close to y for our training examples (x, y).
To fit the function to the training data => to minimize the cost function,
i.e. to minimize the squared difference between the hypothesis and the actual values.
Cost function for univariate linear regression: J(θ₀, θ₁) = (1/2m) Σᵢ₌₁ᵐ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
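The cost function above can be sketched in NumPy (an illustrative sketch; `compute_cost` is a hypothetical helper name, not from the course):

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """J(theta0, theta1) = (1/2m) * sum_i (h(x_i) - y_i)^2."""
    m = len(x)
    predictions = theta0 + theta1 * x      # hypothesis h_theta(x) for all examples
    return np.sum((predictions - y) ** 2) / (2 * m)
```

For a perfect fit the cost is exactly 0, which is a handy sanity check.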
Gradient Descent
An algorithm for minimizing the cost function.
Intuition: contour plot of J(θ₀, θ₁)
To reduce J(θ₀, θ₁) as quickly as possible, take steps in the direction of steepest descent
Outline
- Start with some θ₀, θ₁ (random initialization)
- Keep changing θ₀, θ₁ to reduce J(θ₀, θ₁) until we end up at a minimum (gradient descent in general is sensitive to local minima, but it usually finds the global minimum here)
Algorithm: to minimize J(θ₀, θ₁) as quickly as possible
Repeat until convergence {
    θⱼ := θⱼ − α · (∂/∂θⱼ) J(θ₀, θ₁)   (for j = 0 and j = 1)
}
NOTE: All the parameters should be updated simultaneously.
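The simultaneous-update requirement can be sketched as follows (a sketch assuming NumPy arrays; `gradient_step` is a hypothetical helper name). The key point is that both gradients are computed from the old parameter values before either parameter is overwritten:

```python
import numpy as np

def gradient_step(x, y, theta0, theta1, alpha):
    """One gradient-descent step with a SIMULTANEOUS update of theta0, theta1."""
    m = len(x)
    errors = (theta0 + theta1 * x) - y     # uses the old theta0, theta1
    grad0 = errors.sum() / m               # dJ/dtheta0
    grad1 = (errors * x).sum() / m         # dJ/dtheta1
    # only now are the parameters updated, both from the old values
    return theta0 - alpha * grad0, theta1 - alpha * grad1
```

Updating θ₀ in place and then computing θ₁'s gradient from the already-updated θ₀ would be the incorrect, non-simultaneous variant.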
α: the learning rate
α needs to be chosen carefully.
- Too small: gradient descent can be slow
- Too large: gradient descent can overshoot the minimum; it may fail to converge, or even diverge
Gradient descent can converge to a local minimum even with the learning rate α held fixed.
As we approach a local minimum, gradient descent automatically takes smaller steps (because the partial derivative becomes smaller), so there is no need to decrease α over time.
Gradient Descent for Univariate Linear Regression
Plugging the partial derivatives of J into the update rule gives:
    θ₀ := θ₀ − α · (1/m) Σᵢ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾)
    θ₁ := θ₁ − α · (1/m) Σᵢ (hθ(x⁽ⁱ⁾) − y⁽ⁱ⁾) · x⁽ⁱ⁾
Potential problem: gradient descent is susceptible to local optima.
It's okay here because the cost function for linear regression is convex (bowl-shaped), so it has a single global minimum.
Batch Gradient Descent
Each step uses all the training examples.
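Putting the pieces together, batch gradient descent for univariate linear regression can be sketched as below (a sketch; `batch_gradient_descent` and the default hyperparameters are illustrative choices, not from the course):

```python
import numpy as np

def batch_gradient_descent(x, y, alpha=0.01, iterations=1000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent.
    Each iteration uses ALL m training examples."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = (theta0 + theta1 * x) - y
        grad0 = errors.sum() / m             # dJ/dtheta0
        grad1 = (errors * x).sum() / m       # dJ/dtheta1
        # simultaneous update of both parameters
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1
```

On data generated from y = 1 + 2x, this converges toward θ₀ ≈ 1, θ₁ ≈ 2, since the convex cost has that single global minimum.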