Transductive Unbiased Embedding for Zero-Shot Learning: Reading Notes
Summary
PRO
- Unbiased term: adds a loss term targeting the unseen classes, partially suppressing zero-shot learning's inherent bias toward labeled (source-class) data
- Clever use of data: the labels of the target dataset (image-to-text correspondences) are never used, but the semantic embeddings of the target class labels are
- Experiments on fine-tuning the CNN (whether to fine-tune is decided by dataset size)
CON
- Quality of the semantic embedding:
  1. According to the paper, attributes work best, with word2vec second
Zero shot learning
- relies on the semantic space to associate source and target classes
- categorized by whether the unlabeled data of target classes are available for training:
  - inductive ZSL
  - transductive ZSL
- experimental settings:
  - conventional setting: test images come solely from the target classes
  - generalized setting: test images come from both the source and the target classes
- bias problem:
  - arises when bridging the visual and the semantic embeddings
  - visual instances are usually projected to several fixed anchor points specified by the source classes in the semantic embedding space
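A tiny numerical illustration of the bias problem (all numbers and class sizes below are hypothetical, not from the paper): if the projection, trained only on source data, maps a target-class image near a source anchor, the inner-product argmax inevitably predicts a source class.

```python
import numpy as np

# Hypothetical toy setup: identity-like anchors in a 5-d semantic space.
# Classes 0-2 are source classes, 3-4 are target (unseen) classes.
anchors = np.eye(5)
source_ids, target_ids = [0, 1, 2], [3, 4]

# Bias problem: a projection trained only on source data tends to map every
# image, even target-class ones, near the fixed source anchors.  Simulate a
# target-class image whose projection has drifted toward source anchor 1.
projected = 0.9 * anchors[1] + 0.3 * anchors[3]

scores = anchors @ projected      # inner-product scores for all S + T classes
pred = int(np.argmax(scores))
print(pred)                       # prints 1: misclassified as a source class
```

The drift coefficients (0.9 vs 0.3) are made up; the point is only that once projections collapse onto source anchors, target-class images cannot win the argmax.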
3. QFSL Model: Quasi-Fully Supervised Learning
3.0 symbols
3.1 Visual Embedding Subnet
- fc1: visual embedding
- fine-tune when the dataset is large; keep frozen when data is scarce
3.2 Visual-Semantic Bridging Subnet
- several fully connected layers
- optimized together with the visual embedding subnet.
3.3 Scoring Subnet
- inner product between the projected embedding and the normalized semantic embeddings as the scores
- implemented as a single fully connected layer
- the weights of the scoring subnet are frozen and will not be updated during the training phase (they represent the semantic embeddings?)
- the weights are initialized with the normalized semantic vectors of both the source and the target classes
- semantically meaningful attributes are adopted as the semantic space
- produces S + T scores for a given image
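The bridging and scoring subnets above can be sketched in a few lines of numpy (the sizes S, T, and the dimensions are placeholders; the paper uses several fully connected layers for the bridge, simplified here to one linear map):

```python
import numpy as np

rng = np.random.default_rng(0)
S, T, d_vis, d_sem = 3, 2, 8, 5   # hypothetical sizes: source/target classes, dims

# Semantic vectors (e.g. attribute vectors) of all S + T classes, L2-normalized.
semantic = rng.normal(size=(S + T, d_sem))
semantic /= np.linalg.norm(semantic, axis=1, keepdims=True)

# Visual-semantic bridging subnet: a single trainable linear layer here.
W_bridge = rng.normal(size=(d_sem, d_vis))

def score(visual_feat):
    """Project a visual feature into the semantic space, then take inner
    products with the normalized semantic vectors.  This is the scoring
    subnet: a single fully connected layer whose weight matrix is the
    semantic matrix, frozen during training."""
    projected = W_bridge @ visual_feat   # bridging: visual -> semantic space
    return semantic @ projected          # one score per source/target class

feat = rng.normal(size=d_vis)            # stand-in for a CNN visual embedding
scores = score(feat)
print(scores.shape)                      # (5,): S + T scores per image
```

In a real implementation only `W_bridge` (and optionally the CNN) would receive gradients, while `semantic` stays fixed, matching the frozen scoring weights described above.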
3.4 Optimization
- bias loss: L_b = -ln(Σ_{i∈Y^t} p_i), where p_i is the predicted probability of class i and Y^t is the set of target classes
- encourages the model to increase the sum of probabilities of being any target class