Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks (Reading Notes)
Takeaways
- How to combine multiple networks: collaborative when they share an objective, adversarial when their objectives differ
todo
- check the ground-truth class embedding
- check SAE
Conclusion
pro
- uses an adversarial framework to combine two networks, enabling semantic transfer
- an independent visual-to-semantic mapping (tackling the semantic-loss problem inherent in classification)
con
Overview
- problem: semantic loss
- some semantics would be discarded during training
- they are non-discriminative for training classes
- yet critical for recognizing test classes
how
- solves the problem of a classification network discarding information that is unimportant for its training classes
1. Introduction
1.1 zero shot learning
- transferring knowledge from seen classes to unseen classes
- evolution
- primitive attribute classifier
- semantic embedding based framework
3. Model
- supervised Adversarial Autoencoder
- F encoder, G decoder
- F can be considered as the bottleneck layer, regularized to match supervised E
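The F/G/E interplay above can be sketched as a toy illustration; this is not the paper's implementation. The adversarial regularizer that pushes F's bottleneck code toward the supervised embedding E is replaced here by a plain L2 penalty, and all three networks are stand-in linear maps with assumed dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in linear "networks" (the real F, G, E are deep nets)
def encoder_F(x, W):          # visual features -> semantic code
    return x @ W

def decoder_G(s, V):          # semantic code -> reconstructed visual
    return s @ V

def reconstruction_loss(x, x_hat):
    return float(np.mean((x - x_hat) ** 2))

def embedding_match_loss(s, e):
    # stands in for the adversarial regularizer that matches F's code
    # to the supervised embedding E
    return float(np.mean((s - e) ** 2))

x = rng.normal(size=(5, 8))   # 5 samples, 8-d visual features (toy sizes)
W = rng.normal(size=(8, 4))   # semantic space is 4-d here
V = rng.normal(size=(4, 8))
e = rng.normal(size=(5, 4))   # supervised class embeddings from E

s = encoder_F(x, W)           # F acts as the bottleneck layer
x_hat = decoder_G(s, V)

alpha = 1.0                   # assumed trade-off weight
total = reconstruction_loss(x, x_hat) + alpha * embedding_match_loss(s, e)
```

The point of the structure is that reconstruction keeps non-discriminative semantics alive while the embedding match keeps the code usable for classification.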
4. Implementation
4.1 Architecture
- E: ResNet-101
- F: AlexNet + 2 fully-connected layers
- leaky ReLU; a vector is reshaped into a 3D feature map
- ground truth class embedding
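As a concrete note on the activation and the vector-to-feature-map step, a minimal leaky ReLU and reshape can be sketched (slope and sizes are assumptions, not taken from the paper):

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    # pass positive values through; scale negatives by a small slope
    return np.where(x > 0, x, negative_slope * x)

v = np.array([-2.0, -0.5, 0.0, 1.5])
out = leaky_relu(v)

# reshape a flat vector into a 3D feature map (C, H, W), as when a
# fully-connected output feeds into convolutional/deconvolutional layers
feat = np.arange(12.0).reshape(3, 2, 2)
```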
4.2 Training details
- per-pixel mean subtraction
- the ResNet-101 in E is fixed; the AlexNet-like blocks in F are initialized from AlexNet, and G from a pretrained generator
- learning rate starts at 1e-4 and is multiplied by 0.1 when the error plateaus
- grid search to select the parameters α and β