CV / FRec / Loss: An Introduction to Common Loss Functions in Face Recognition and a Detailed Guide to Using Them
T1、Triplet Loss
《FaceNet: A Unified Embedding for Face Recognition and Clustering》
https://arxiv.org/pdf/1503.03832.pdf
http://www.goodtimesweb.org/surveillance/2015/1503.03832v1.pdf
1、Original paper excerpt
Triplet Loss. The embedding is represented by $f(x) \in \mathbb{R}^d$. It embeds an image $x$ into a $d$-dimensional Euclidean space. Additionally, we constrain this embedding to live on the $d$-dimensional hypersphere, i.e. $\|f(x)\|_2 = 1$. This loss is motivated in [19] in the context of nearest-neighbor classification. Here we want to ensure that an image $x_i^a$ (anchor) of a specific person is closer to all other images $x_i^p$ (positive) of the same person than it is to any image $x_i^n$ (negative) of any other person. This is visualized in Figure 3. Thus we want

$$\|x_i^a - x_i^p\|_2^2 + \alpha < \|x_i^a - x_i^n\|_2^2, \quad \forall\, (x_i^a, x_i^p, x_i^n) \in \mathcal{T}, \tag{1}$$

where $\alpha$ is a margin that is enforced between positive and negative pairs. $\mathcal{T}$ is the set of all possible triplets in the training set and has cardinality $N$. The loss that is being minimized is then

$$L = \sum_{i}^{N} \Big[ \|f(x_i^a) - f(x_i^p)\|_2^2 - \|f(x_i^a) - f(x_i^n)\|_2^2 + \alpha \Big]_+. \tag{2}$$

Generating all possible triplets would result in many triplets that are easily satisfied (i.e. fulfill the constraint in Eq. (1)). These triplets would not contribute to the training and result in slower convergence, as they would still be passed through the network. It is crucial to select hard triplets, that are active and can therefore contribute to improving the model. The following section talks about the different approaches we use for the triplet selection.
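The excerpt stresses mining triplets that still violate the margin in Eq. (1). As a rough illustration only, here is a minimal NumPy sketch of offline semi-hard triplet selection on precomputed embeddings; the function name select_semi_hard_triplets and its interface are made up for this example and are not taken from the FaceNet codebase.

import numpy as np

def select_semi_hard_triplets(embeddings, labels, alpha):
    # embeddings: (n, d) L2-normalized features; labels: (n,) identity ids
    # Pairwise squared Euclidean distances, shape (n, n)
    dists = np.sum((embeddings[:, None, :] - embeddings[None, :, :]) ** 2, axis=2)
    triplets = []
    for a in range(len(labels)):
        for p in np.where(labels == labels[a])[0]:
            if p == a:
                continue
            # Semi-hard negatives: farther away than the positive,
            # but still inside the margin, so Eq. (1) is violated
            mask = ((labels != labels[a])
                    & (dists[a] > dists[a, p])
                    & (dists[a] < dists[a, p] + alpha))
            for n in np.where(mask)[0]:
                triplets.append((a, p, n))
    return triplets

Triplets that already satisfy Eq. (1) are skipped, which is exactly the "easily satisfied" case the paper says contributes nothing to training.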
2、Code implementation
import tensorflow as tf

def triplet_loss(anchor, positive, negative, alpha):
    """Calculate the triplet loss according to the FaceNet paper.
    anchor, positive, negative: features of each anchor face and of its positive and
    negative samples, all of shape (batch_size, feature_size), where feature_size is
    the dimensionality of the face features learned by the network."""
    with tf.variable_scope('triplet_loss'):
        # pos_dist: squared distance from each anchor to its positive sample
        pos_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, positive)), 1)
        # neg_dist: squared distance from each anchor to its negative sample
        neg_dist = tf.reduce_sum(tf.square(tf.subtract(anchor, negative)), 1)
        # pos_dist - neg_dist + alpha; only values above zero contribute to the loss
        basic_loss = tf.add(tf.subtract(pos_dist, neg_dist), alpha)
        loss = tf.reduce_mean(tf.maximum(basic_loss, 0.0), 0)
    return loss
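A short usage sketch, under the assumption that anchors, positives, and negatives come out of one forward pass as consecutive rows of a single embeddings tensor; the 128-dimensional shape and the margin value 0.2 are illustrative, not prescribed by the listing above.

# Hypothetical usage: one forward pass yields (3 * batch_size, 128) embeddings,
# stored as consecutive [anchor, positive, negative] rows
embeddings = tf.placeholder(tf.float32, [None, 128])
anchor, positive, negative = tf.unstack(tf.reshape(embeddings, [-1, 3, 128]), 3, 1)
loss = triplet_loss(anchor, positive, negative, alpha=0.2)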
T2、Center Loss
《A Discriminative Feature Learning Approach for Deep Face Recognition》
http://ydwen.github.io/papers/WenECCV16.pdf
1、Original paper excerpt
The Center Loss. So, how to develop an effective loss function to improve the discriminative power of the deeply learned features? Intuitively, minimizing the intra-class variations while keeping the features of different classes separable is the key. To this end, we propose the center loss function, as formulated in Eq. 2:

$$L_C = \frac{1}{2} \sum_{i=1}^{m} \|x_i - c_{y_i}\|_2^2 \tag{2}$$

Here $c_{y_i} \in \mathbb{R}^d$ denotes the $y_i$-th class center of the deep features. The formulation effectively characterizes the intra-class variations. Ideally, $c_{y_i}$ should be updated as the deep features change. In other words, we need to take the entire training set into account and average the features of every class in each iteration, which is inefficient, even impractical. Therefore, the center loss cannot be used directly. This is possibly the reason that such a center loss has never been used in CNNs until now. To address this problem, we make two necessary modifications. First, instead of updating the centers with respect to the entire training set, we perform the update based on the mini-batch. In each iteration, the centers are computed by averaging the features of the corresponding classes (in this case, some of the centers may not update). Second, to avoid large perturbations caused by a few mislabelled samples, we use a scalar $\alpha$ to control the learning rate of the centers.
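To make the two modifications concrete, here is a small NumPy sketch of the mini-batch center update the excerpt describes: one averaged step per class present in the batch, scaled by the center learning rate. The function name update_centers is invented for this example; the damping denominator follows the paper's update rule.

import numpy as np

def update_centers(centers, features, labels, alpha):
    # centers: (num_classes, d); features: (m, d) mini-batch features;
    # labels: (m,) class ids; alpha: scalar learning rate of the centers
    for j in np.unique(labels):
        idx = np.where(labels == j)[0]
        # Average gap between the center and this class's features in the batch;
        # the +1 in the denominator damps updates for classes with few samples
        delta = np.sum(centers[j] - features[idx], axis=0) / (1 + len(idx))
        centers[j] -= alpha * delta  # classes absent from the batch stay unchanged
    return centers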
2、Code implementation
import tensorflow as tf

def center_loss(features, label, alfa, nrof_classes):
    """Center loss from "A Discriminative Feature Learning Approach for Deep Face
    Recognition".
    features: sample features of shape (batch_size, feature_size)
    label: class label of each sample, shape (batch_size,)
    alfa: hyperparameter controlling how far the centers move per update
    nrof_classes: total number of classes (identities)"""
    # nrof_features is feature_size, the dimensionality of the learned face features
    nrof_features = features.get_shape()[1]
    # centers is a non-trainable variable holding the center of each class
    centers = tf.get_variable('centers', [nrof_classes, nrof_features], dtype=tf.float32,
                              initializer=tf.constant_initializer(0), trainable=False)
    label = tf.reshape(label, [-1])
    # Gather the center of each sample's class; centers_batch has the same shape
    # as features, i.e. (batch_size, feature_size)
    centers_batch = tf.gather(centers, label)
    # diff is the gap between the class centers and the sample features; it is used
    # to move the centers, and the hyperparameter alfa controls the update magnitude
    diff = (1 - alfa) * (centers_batch - features)
    # Subtract diff from the corresponding rows of centers, i.e. update the centers
    centers = tf.scatter_sub(centers, label, diff)
    # The loss is the mean squared distance between features and their class centers
    loss = tf.reduce_mean(tf.square(features - centers_batch))
    return loss, centers  # return the loss and the updated centers
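In the paper the center loss is not used alone but jointly with the softmax loss, $L = L_S + \lambda L_C$. Below is a hedged sketch of wiring the function above into such a joint objective; the dense classification head and the values alfa=0.95 and lambda=0.003 are illustrative choices, not prescriptions. The control dependency is needed because the tf.scatter_sub update is a side effect that is otherwise not on the loss's compute path.

# Hypothetical joint objective: softmax loss plus lambda * center loss,
# assuming features, label, and nrof_classes from the listing above
logits = tf.layers.dense(features, nrof_classes)  # assumed classification head
softmax_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=label, logits=logits))
c_loss, centers = center_loss(features, label, alfa=0.95, nrof_classes=nrof_classes)
# Force the center update to run whenever the total loss is evaluated
with tf.control_dependencies([centers]):
    total_loss = softmax_loss + 0.003 * c_loss  # lambda = 0.003, illustrative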