Paper Notes of CVPR-0313

HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN

Abstract

Deep learning to hash needs similarity information that is expensive to collect, and the shortage of such information often results in a substantial loss of image retrieval quality. This paper presents HashGAN, a novel architecture for deep learning to hash. Its key idea is data augmentation: nearly real images are synthesized by a new Pair Conditional Wasserstein GAN (PC-WGAN) conditioned on the pairwise similarity information.

Introduction

This paper focuses on building data-dependent hash encoding schemes, which perform better than data-independent methods for efficient image retrieval. Deep learning to hash needs large-scale image data and sufficient supervised information, which are unavailable in many image retrieval applications. To address this, the authors propose HashGAN, which uses PC-WGAN to synthesize images so that compact binary hash codes can be learned from both real and large-scale synthesized images. PC-WGAN is the first GAN that enables image synthesis by incorporating pairwise similarity information. The model can be trained end-to-end by back-propagation in a minimax optimization mechanism.

Related work

The authors review two areas of related work, Hashing Methods and Generative Models, and close by arguing for the advantages of their method over both.

Method

The architecture of HashGAN is shown in Figure 1.

HashGAN includes two parts: a pair conditional Wasserstein GAN (generator G and discriminator D) and a hash encoder F.

The discriminator D, the generator G, and the hash encoder F each have their own optimization objective. For the exact formulations and more details about HashGAN, please refer to “HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN”.

Experiment

The authors evaluate the efficacy of the proposed HashGAN approach against eight state-of-the-art shallow and deep hashing methods on three benchmark datasets. The MAP (Mean Average Precision) is shown in Table 1.
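As a rough illustration of how MAP is commonly computed under Hamming ranking, here is a minimal Python sketch. The function names and the toy codes are hypothetical and do not come from the paper, and the paper's exact evaluation protocol (for example, any cutoff on the number of returned images) may differ.

```python
def hamming(a, b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def average_precision(query_code, db_codes, relevant):
    """AP for one query: rank the database by Hamming distance to the query,
    then average the precision measured at each relevant item."""
    # sorted() is stable, so ties in distance keep database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    hits = 0
    precisions = []
    for rank, idx in enumerate(order, start=1):
        if relevant[idx]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(query_codes, db_codes, relevance):
    """Mean of the per-query APs; relevance[q][i] is True when database
    item i is a correct result for query q."""
    aps = [average_precision(q, db_codes, rel)
           for q, rel in zip(query_codes, relevance)]
    return sum(aps) / len(aps)


# Toy data (hypothetical): three database codes and two queries.
db = [[1, 1, 1, 1], [1, 1, 1, -1], [-1, -1, -1, -1]]
queries = [[1, 1, 1, 1], [-1, -1, -1, -1]]
relevance = [[True, False, True], [False, False, True]]
map_score = mean_average_precision(queries, db, relevance)
```

For the first query the relevant items land at ranks 1 and 3, so its AP is (1/1 + 2/3) / 2; the second query retrieves its only relevant item first, so its AP is 1.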

The proposed HashGAN improves substantially from two perspectives: (1) HashGAN introduces a novel Pair Conditional Wasserstein GAN (PC-WGAN) to synthesize nearly real images as training data, which alleviates the problem of insufficient training data. (2) The model uses a new loss function that approximates the Hamming distance more accurately, so that nearly lossless hash codes can be learned.
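To make the link between a cosine-based loss and the Hamming distance concrete: for two codes in {-1, +1}^K, the Hamming distance equals K/2 * (1 - cos), while the inner-product form (K - <a, b>) / 2 only holds when the codes are exactly binary. The Python sketch below checks this on toy vectors. It is a generic illustration with hypothetical function names, not the loss function defined in the paper.

```python
import math


def hamming(a, b):
    """Number of positions where the signs of two codes differ."""
    return sum((x >= 0) != (y >= 0) for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hamming_from_cosine(a, b):
    """Hamming estimate K/2 * (1 - cos); unaffected by vector scale."""
    return len(a) / 2 * (1 - cosine(a, b))


def hamming_from_inner_product(a, b):
    """Hamming estimate (K - <a, b>) / 2; exact only for {-1, +1} codes."""
    dot = sum(x * y for x, y in zip(a, b))
    return (len(a) - dot) / 2


a = [1, 1, 1, 1]
b = [1, 1, 1, -1]
a_scaled = [3 * x for x in a]  # a continuous code with the same signs as a

exact = hamming(a, b)
cos_binary = hamming_from_cosine(a, b)
cos_scaled = hamming_from_cosine(a_scaled, b)
ip_scaled = hamming_from_inner_product(a_scaled, b)
```

On this toy pair both estimates agree with the true distance when the codes are binary, but once one code is rescaled the inner-product estimate drifts (it even goes negative here) while the cosine estimate stays the same. This is one plausible reading of why a cosine-based loss tracks the Hamming distance more closely for continuous network outputs.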

The authors then report other metrics that are important for this specific task to further verify the efficacy of their method.

At the end of Section 4, an ablation study and a visualization study are presented.

In the ablation study, HashGAN-B serves as the upper bound of performance; HashGAN-Q is the variant without the proposed quantization loss; HashGAN-C is the variant that replaces the proposed cosine cross-entropy loss with the widely used inner-product cross-entropy loss; HashGAN-G is the variant without the proposed PC-WGAN.
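To give a feel for what a quantization loss controls, the Python sketch below binarizes a continuous code by sign and measures how far the code is from being binary. The penalty used here (mean gap between each entry's magnitude and 1) is a generic stand-in chosen for illustration; it is an assumption and not the quantization loss proposed in the paper.

```python
def binarize(h):
    """Map a continuous code to {-1, +1} by sign (zero maps to +1)."""
    return [1 if x >= 0 else -1 for x in h]


def quantization_error(h):
    """Mean absolute gap between each entry's magnitude and 1.
    Zero means binarization loses nothing."""
    return sum(abs(abs(x) - 1.0) for x in h) / len(h)


near_binary = [0.75, -1.0, 1.0, -0.75]   # hypothetical encoder output
far_from_binary = [0.25, -0.5, 0.0, -0.25]

codes = binarize(near_binary)
small_gap = quantization_error(near_binary)
large_gap = quantization_error(far_from_binary)
```

A code with a small gap changes little when it is binarized, so distances computed on the binary codes stay close to those seen during training. Dropping such a penalty, as in the HashGAN-Q variant, removes that control.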

For the visualization study, some results are shown in Figure 5.

Conclusion

This paper proposes HashGAN, a novel architecture that synthesizes nearly real images conditioned on the pairwise similarity information. This alleviates the problem of insufficient similarity information and improves the quality of the compact binary hash codes.