Machine Learning week 4 ex3 program: error notes (to be continued)
On my first attempt, I filled in lrCostFunction.m as follows:
A = X * theta;
J = -1/m * (y' * log(sigmoid(A)) + (1 - y')* log(1-sigmoid(A)));
grad = 1/m * X' * (log(sigmoid(A) - y));
temp = theta;
temp(1) = 0;
J = J + lambda / (2*m) * (temp.^2);
grad = grad + lambda /(2*m) * temp;
% Alternatively, this might also work:
% J = J + lambda/(2*m) * sum(theta(2:end).^2);
% B = zeros(length(theta),1);
% theta(1,:) = B (1,:);
% grad = grad + lambda /(2*m) * temp;
Running ex3.m raised no error, but the result was clearly far from the expected cost.
After the fix:
A = X * theta;
J = -1/m * (y' * log(sigmoid(A)) + (1 - y')* log(1-sigmoid(A)));
grad = 1/m * X' * (sigmoid(A) - y);
% The first version computed h(z) incorrectly in grad: it used log(sigmoid(A) - y) instead of sigmoid(A) - y.
temp = theta;
temp(1) = 0;
J = J + lambda / (2*m) * sum(temp.^2);
grad = grad + lambda /(2*m) * temp;
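% The first version also omitted sum when adding the regularization term to J; in theory temp' * temp should work here as well.
Comparing the computed gradients with the expected gradients, every value except the one for j = 0 was off by a factor of 2,
so the gradient line should be changed to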
grad = grad + lambda /m * temp;
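The factor of 2 follows from differentiating the regularization term. As a short derivation, using the course's usual notation (h_theta for the sigmoid hypothesis, n for the number of features, theta_0 for the unregularized bias term, which corresponds to theta(1) in the code):

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log h_\theta(x^{(i)}) + (1-y^{(i)})\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2

\frac{\partial}{\partial\theta_j}\left(\frac{\lambda}{2m}\sum_{k=1}^{n}\theta_k^2\right) = \frac{\lambda}{2m}\cdot 2\theta_j = \frac{\lambda}{m}\theta_j \qquad (j \ge 1)

\frac{\partial J}{\partial\theta_0} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_0^{(i)}, \qquad
\frac{\partial J}{\partial\theta_j} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)x_j^{(i)} + \frac{\lambda}{m}\theta_j \quad (j \ge 1)
```

The 1/2 in the cost cancels against the 2 from the square, so the cost uses lambda/(2*m) while the gradient uses lambda/m. The j = 0 entry has no regularization term, which is why it was the only value that already matched.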
Testing the second approach:
A = X * theta;
J = -1/m * (y' * log(sigmoid(A)) + (1 - y')* log(1-sigmoid(A)));
grad = 1/m * X' * (sigmoid(A) - y);
% temp = theta;
% temp(1) = 0;
% J = J + lambda / (2*m) * sum(temp.^2);
% grad = grad + lambda /m * temp;
% Alternative approach:
J = J + lambda/(2*m) * sum(theta(2:end).^2);
B = zeros(length(theta),1);
theta(1,:) = B (1,:);
grad = grad + lambda /m * theta;
The computed values match the expected values, as required.
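A finite-difference check is one way to catch mistakes like the factor of 2 without relying on the expected values printed by ex3.m. The following is a minimal sketch, assuming Octave (where functions may be defined at the top of a script after a leading `1;`). The name lrCost, the step size e, and the small data set are illustrative choices rather than part of the exercise files, and sigmoid is redefined here only so the script is self-contained. It uses the temp' * temp form of the regularization term mentioned in the comments above.

```octave
% Sketch: regularized logistic regression cost and gradient,
% compared against a central finite-difference approximation.
1;  % tells Octave this is a script file, so functions can be defined inline

function g = sigmoid(z)
  g = 1 ./ (1 + exp(-z));
end

function [J, grad] = lrCost(theta, X, y, lambda)
  m = length(y);
  h = sigmoid(X * theta);
  temp = theta;
  temp(1) = 0;                        % do not regularize the bias term
  J = -1/m * (y' * log(h) + (1 - y)' * log(1 - h)) ...
      + lambda / (2*m) * (temp' * temp);
  grad = 1/m * X' * (h - y) + lambda / m * temp;
end

% Small illustrative data set (first column of X is the bias column)
X = [ones(5,1), reshape(1:15, 5, 3) / 10];
y = [1; 0; 1; 0; 1];
theta = [-2; -1; 1; 2];
lambda = 3;

[J, grad] = lrCost(theta, X, y, lambda);

% Approximate each partial derivative with a central difference
numgrad = zeros(size(theta));
e = 1e-4;                             % illustrative step size
for j = 1:numel(theta)
  d = zeros(size(theta));
  d(j) = e;
  numgrad(j) = (lrCost(theta + d, X, y, lambda) ...
                - lrCost(theta - d, X, y, lambda)) / (2*e);
end

disp([grad, numgrad]);                % the two columns should agree closely
```

If the gradient line used lambda/(2*m) instead of lambda/m, every row except the first would differ noticeably between the two columns, which is the same pattern observed against the expected gradients.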