Category: Global-View

Fine-grained recognition

  • requires recognizing highly localized attributes of objects while remaining invariant to their pose and location in the image
  • Part-based models
    • construct representations by localizing parts and extracting features conditioned on their detected locations
  • Holistic models
    • construct a representation of the entire image directly. These include classical image representations.

1. Bilinear CNNs for Fine-grained Visual Recognition

Abstract

  • These networks represent an image as a pooled outer product of features derived from two CNNs and capture localized feature interactions in a translationally invariant manner.
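
The pooled outer product can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the two feature maps are random stand-ins for the outputs of the two CNNs, the function name `bilinear_pool` is hypothetical, and sum pooling over locations is assumed. A circular shift via `np.roll` stands in for image translation to show why the pooled descriptor does not depend on where a feature occurs.

```python
import numpy as np

def bilinear_pool(feat_a, feat_b):
    """Sum-pool the outer product of two feature maps over all locations.

    feat_a: array of shape (H, W, M), stand-in for the first CNN's output
    feat_b: array of shape (H, W, N), stand-in for the second CNN's output
    Returns a descriptor of length M * N.
    """
    h, w, m = feat_a.shape
    n = feat_b.shape[-1]
    a = feat_a.reshape(h * w, m)   # one row per spatial location
    b = feat_b.reshape(h * w, n)
    pooled = a.T @ b               # (M, N): sum over locations of a_l b_l^T
    return pooled.reshape(-1)

rng = np.random.default_rng(0)
fa = rng.standard_normal((4, 4, 3))   # made-up feature maps
fb = rng.standard_normal((4, 4, 5))

desc = bilinear_pool(fa, fb)

# Shifting both maps by the same offset only reorders the locations,
# so the sum-pooled descriptor is unchanged up to rounding.
shifted = bilinear_pool(np.roll(fa, 1, axis=1), np.roll(fb, 1, axis=1))
```

Real CNN features shift only approximately with the input image, so the exact equality here is a property of the toy setup rather than of a trained network.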

Key insight

  • several widely-used texture representations can be written as a pooled outer product of two suitably designed features
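
As one illustration of this insight, a bag-of-visual-words histogram can be written as a pooled outer product in which one feature is the one-hot codeword assignment of a local descriptor and the other is the constant 1. The sketch below checks this on made-up data; the descriptors, the codebook, and all variable names are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
descriptors = rng.standard_normal((50, 8))   # 50 local descriptors (made-up)
codebook = rng.standard_normal((6, 8))       # 6 visual words (made-up)

# Hard-assign every descriptor to its nearest visual word.
dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
nearest = dists.argmin(axis=1)

# Classical bag-of-visual-words histogram.
hist = np.bincount(nearest, minlength=len(codebook))

# The same histogram as a pooled outer product:
# feature A = one-hot assignment, feature B = the constant 1.
one_hot = np.eye(len(codebook))[nearest]     # shape (50, 6)
ones = np.ones((len(descriptors), 1))        # shape (50, 1)
pooled = (one_hot.T @ ones).ravel()          # sum over descriptors
```

Replacing the constant feature with something richer, computed from the same local descriptor, is what turns this simple count into a representation that captures feature interactions.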

Architecture
