Babysitting the Learning Process

After preprocessing the data and choosing a suitable network architecture, we can start training:

1. Double check that the initial loss is reasonable: with regularization disabled, the loss should be a value determined by the number of classes, e.g. about -log(1/#classes) for a softmax classifier (≈ 2.3 for 10 classes).

2. Make sure that you can overfit a very small portion of the training data (e.g. ~20 examples; training accuracy should reach ~100%).

3. Start with small regularization and find a learning rate that makes the loss go down. (A minimal sketch of these three checks follows below.)
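
As a concrete illustration of these three checks, here is a minimal sketch assuming a softmax classifier built in PyTorch on CIFAR-10-shaped inputs (3072 features, 10 classes); the model, the stand-in random batch, and every hyperparameter value are placeholders, not part of the original notes.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes = 10
model = nn.Sequential(nn.Linear(3072, 100), nn.ReLU(), nn.Linear(100, num_classes))

# Stand-in batch; substitute a batch of real preprocessed training data.
x = torch.randn(64, 3072)
y = torch.randint(0, num_classes, (64,))

# (1) With regularization (weight decay) disabled, the initial softmax loss
#     should sit near -log(1/num_classes), i.e. about 2.3 for 10 classes.
init_loss = F.cross_entropy(model(x), y).item()
print(f"initial loss {init_loss:.3f}, expected ~{math.log(num_classes):.3f}")

# (2) Overfit a tiny subset (~20 examples): with regularization still off and a
#     workable learning rate, training accuracy should get close to 100%.
x_small, y_small = x[:20], y[:20]
opt = torch.optim.SGD(model.parameters(), lr=1e-1, weight_decay=0.0)
for step in range(500):
    loss = F.cross_entropy(model(x_small), y_small)
    opt.zero_grad()
    loss.backward()
    opt.step()
acc = (model(x_small).argmax(dim=1) == y_small).float().mean().item()
print(f"tiny-subset training accuracy: {acc:.2f}")  # should be close to 1.0

# (3) With a fresh model, turn on small regularization and look for a learning
#     rate that makes the full-data loss go down (too low: loss barely moves;
#     too high: loss explodes or becomes NaN).
model = nn.Sequential(nn.Linear(3072, 100), nn.ReLU(), nn.Linear(100, num_classes))
opt = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=1e-6)
```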


If the cost becomes NaN, it almost always means the learning rate is too high.
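
A tiny self-contained illustration (not from the notes): even on a toy quadratic objective, an overly large step size makes the loss blow up, so a cheap safeguard is to stop the run as soon as the loss stops being finite.

```python
import math

def run(lr, steps=100):
    w = 1.0
    loss = w * w
    for step in range(steps):
        grad = 2.0 * w            # gradient of loss = w**2
        w -= lr * grad
        loss = w * w
        if not math.isfinite(loss):
            print(f"lr={lr}: loss diverged at step {step}; lower the learning rate")
            return
    print(f"lr={lr}: final loss {loss:.3e}")

run(0.1)   # converges
run(1e4)   # blows up within a few dozen steps
```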


The learning rate is the most important hyperparameter; it is usually tuned with coarse -> fine cross-validation in stages.
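
The staged search can be sketched as random sampling on a log scale: a coarse stage over wide ranges with only a few epochs per trial, then a fine stage zoomed in around the best settings. This is a sketch only: `train_and_eval` is a placeholder to be replaced by a real short training run that returns validation accuracy, and the sampling ranges are illustrative.

```python
import math
import random

def train_and_eval(lr, reg, num_epochs=5):
    """Placeholder: train briefly with (lr, reg) and return validation accuracy."""
    # Fake score so the sketch runs; substitute a real few-epoch training run.
    return random.random()

def random_search(lr_range, reg_range, num_trials):
    results = []
    for _ in range(num_trials):
        # Sample on a log scale: 10 ** uniform(low, high).
        lr = 10 ** random.uniform(*lr_range)
        reg = 10 ** random.uniform(*reg_range)
        acc = train_and_eval(lr, reg)
        results.append((acc, lr, reg))
    results.sort(reverse=True)   # best validation accuracy first
    return results

# Stage 1 (coarse): wide ranges, few epochs per trial.
coarse = random_search(lr_range=(-6, -3), reg_range=(-5, 5), num_trials=20)

# Stage 2 (fine): zoom in around the best coarse result and train longer.
best_lr = math.log10(coarse[0][1])
best_reg = math.log10(coarse[0][2])
fine = random_search(lr_range=(best_lr - 0.5, best_lr + 0.5),
                     reg_range=(best_reg - 1, best_reg + 1),
                     num_trials=20)
print("best (val_acc, lr, reg):", fine[0])
```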



4. During training, monitor and visualize the loss curve and the training/validation accuracy.
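
For example (a sketch: it assumes the loss has been recorded every iteration and the accuracies every epoch, and the variable names are illustrative), the two diagnostics can be plotted side by side with matplotlib:

```python
import matplotlib.pyplot as plt

def plot_training(loss_history, train_acc_history, val_acc_history):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # Loss per iteration: the shape hints at whether the learning rate is too
    # low (slow, nearly linear decrease) or too high (plateau at a bad value,
    # or outright divergence).
    ax1.plot(loss_history)
    ax1.set_xlabel("iteration")
    ax1.set_ylabel("loss")

    # Train vs. val accuracy per epoch: a large gap suggests overfitting
    # (increase regularization or get more data); little to no gap with low
    # accuracy suggests underfitting (increase model capacity).
    ax2.plot(train_acc_history, label="train")
    ax2.plot(val_acc_history, label="val")
    ax2.set_xlabel("epoch")
    ax2.set_ylabel("accuracy")
    ax2.legend()

    plt.tight_layout()
    plt.show()
```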
