Introduction

EdgeGPP is an improved framework for image classification, built upon the traditional Bag-of-Features (BoF) framework [Lazebnik, CVPR06]. It adds three independent modules to the BoF model:
  • Extracting heterogeneous SIFT [Lowe, IJCV04] descriptors from both original and edge images to capture both texture and shape features.
  • Exploiting Geometric Phrase Pooling (GPP), a mid-level local pooling algorithm to enhance the visual phrase representation.
  • Blurring the edgemap (the intensity map produced by boundary detection [Canny, PAMI86]) to calculate spatial weights for the visual phrases.
Experiments on various image collections (Caltech101, Caltech256, Scene-15, UIUC-Sports, Indoor-67, etc.) verify that each module contributes separately to a discriminative image representation, and that, combined, they achieve excellent classification results with low computational overhead in both time and memory. Moreover, all three modules are highly transplantable and can therefore be widely used in other classification tasks.
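As a rough end-to-end sketch, the fragment below shows how the three modules could fit together: descriptors from the original image and the edge image are coded separately (heterogeneous features), each code is scaled by its spatial weight from the blurred edgemap, and the two pooled vectors are concatenated. All function names and array shapes here are our own illustrative assumptions, not the released code.

```python
import numpy as np

def edgegpp_image_vector(orig_codes, edge_codes, orig_w, edge_w):
    """Hedged sketch of combining the three modules.
    orig_codes, edge_codes: (N, K) coding vectors for descriptors from the
    original image and the edge image, respectively (heterogeneous features).
    orig_w, edge_w: (N,) spatial weights read off the blurred edgemap at each
    descriptor's location (the EdgeWeighting module)."""
    # weight each code, then global max pooling over descriptors
    pooled_orig = (orig_codes * orig_w[:, None]).max(axis=0)
    pooled_edge = (edge_codes * edge_w[:, None]).max(axis=0)
    # concatenate the two pooled vectors into one image representation
    return np.concatenate([pooled_orig, pooled_edge])
```

In the full framework the codes would additionally pass through GPP before pooling, and the max pooling would be repeated per SPM cell; both are omitted here for brevity.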

Frequently Asked Questions

  • What does the name "EdgeGPP" stand for?
  • EdgeGPP is the integration of all three modules introduced above. In other words, "EdgeGPP" = "EdgeSIFT" + "GPP" + "EdgeWeighting".

  • How do I judge if I should use EdgeGPP for my image classification tasks?
  • EdgeGPP adds three independent modules to the traditional BoF model. Although it works well in most situations, we still strongly recommend considering each module separately. Please see the following questions for a quick check, and refer to our paper for detailed explanations.

  • Does EdgeSIFT work well in all the classification tasks?
  • No. Since the edgemap is a grayscale intensity map, we cannot extract color SIFT features from it. Therefore, if the classification task relies heavily on color information (e.g., the Flower-17, Flower-102, or Birds-200 datasets), it is inadvisable to discard color features in favor of grayscale SIFT and EdgeSIFT.
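To make the limitation concrete: the edgemap has a single channel, so any descriptor extracted from it is necessarily grayscale. The sketch below uses a Sobel gradient-magnitude map as a simple stand-in for the boundary detector (the released package ships an improved detector; this stand-in is our assumption for illustration only).

```python
import numpy as np

def intensity_edgemap(gray):
    """Stand-in for the boundary detector: a Sobel gradient-magnitude map.
    Like the Canny output, it is a single-channel intensity map, which is
    why only grayscale (not color) SIFT can be extracted from it.
    `gray` is a 2-D array (H, W); the result is normalized to [0, 1]."""
    sx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    sy = sx.T
    H, W = gray.shape
    gx = np.zeros((H, W))
    gy = np.zeros((H, W))
    # correlate each interior pixel's 3x3 patch with the Sobel kernels
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = gray[i - 1:i + 2, j - 1:j + 2]
            gx[i, j] = (patch * sx).sum()
            gy[i, j] = (patch * sy).sum()
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-12)
```

A standard grayscale SIFT extractor can then be run on this map exactly as on an ordinary image; that grayscale extraction step is what "EdgeSIFT" refers to.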

  • What stage does GPP work on?
  • GPP works at the pooling stage of image classification. It is a local pooling algorithm that is independent of both its predecessor step (coding) and its successor step (global pooling or SPM).
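In spirit, GPP builds a geometric visual phrase around each descriptor (the central word) from its spatial neighbors (the side words), and enhances the central code by max-pooling the side-word codes. The sketch below is a simplified, hedged rendition of that idea; the neighborhood rule and the additive combination are assumptions on our part, and the paper should be consulted for the exact formulation.

```python
import numpy as np

def geometric_phrase_pooling(codes, positions, radius=2.0):
    """Simplified GPP sketch (not the released implementation).
    codes: (N, K) coding vectors, one per descriptor.
    positions: (N, 2) descriptor locations in the image plane.
    For each central word, max-pool the codes of its side words (spatial
    neighbors within `radius`) and add the result to the central code."""
    pooled = codes.copy()
    for i in range(len(codes)):
        # side words: other descriptors within `radius` of the central word
        dist = np.linalg.norm(positions - positions[i], axis=1)
        mask = (dist > 0) & (dist <= radius)
        if mask.any():
            # element-wise max over the side words' codes
            pooled[i] = codes[i] + codes[mask].max(axis=0)
    return pooled
```

Because it only rewrites the per-descriptor codes, this step slots between coding and global pooling without changing either, which is what makes GPP independent of both.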

  • Does GPP work well in all the classification tasks?
  • GPP does help improving the classification accuracy in ALL the tested datasets, including Caltech101, Caltech256, Oxford-Flower-17, Oxford-Flower-102, Caltech-UCSD-Bird-200 (2010, 2011), Stanford-Dog-120, PascalVOC (2007, 2010), Scene-15, UIUC-Sports, MIT-Indoor-67, etc.

  • Does EdgeWeighting work well in all the classification tasks?
  • Essentially, EdgeWeighting provides a simple way of computing a saliency map by blurring the edgemap. It works well on all the tested classification tasks, though the improvement in classification accuracy is not significant in some cases.
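The blurring itself is straightforward: a Gaussian smoothing of the edgemap spreads the boundary responses outward, so descriptors near object contours receive larger spatial weights. The sketch below shows one plausible realization; the kernel size and the max-normalization are our assumptions, not necessarily the released code's choices.

```python
import numpy as np

def edge_weights(edgemap, sigma=3.0):
    """Hedged EdgeWeighting sketch: blur a (binary or intensity) edgemap
    with a separable Gaussian so that regions near strong boundaries get
    higher spatial weights. Returns a map normalized to [0, 1]."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    # separable convolution: blur along rows, then along columns
    blurred = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, edgemap)
    blurred = np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, blurred)
    return blurred / (blurred.max() + 1e-12)
```

Each visual phrase can then be weighted by the value of this map at its location, emphasizing phrases that lie near object boundaries.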

  • If I want to use the codes for my experiments, what paper shall I cite?
  • Please cite our TIP paper. Of course, you are also welcome to cite the ACM-MM paper :)

Downloads

  • For detailed guidance on running the code, please refer to the README file in each code package.
  • NEW! Version 2.0 will be released soon!
  • HOT! [RAR package][ZIP package] Version 1.0, released in December 2012. The basic version, including the code of the complete EdgeGPP framework and an improved boundary detector (for edgemaps).
  • HOT! [RAR package][ZIP package] "Caltech4", a small dataset containing 4 categories (airplanes, faces, leopards, motorbikes) from Caltech101. We have extracted all the edgemaps for a quick check of the downloaded classification algorithms. You should obtain higher than 95% accuracy (30 training images per category) with any reasonable settings.
  • If you find any mistakes or bugs in the codes or datasets, please contact me.

Related Publications

  • Lingxi Xie, Qi Tian and Bo Zhang, "Spatial Pooling of Heterogeneous Features for Image Applications", ACM International Conference on Multimedia (ACM-MM), Full Paper (Acceptance Rate = 20.2%), Oral Presentation, pages 539--548, Nara, Japan, 2012. [PDF] [Poster] [Slides] [BibTeX]
  • Lingxi Xie, Qi Tian and Bo Zhang, "Spatial Pooling of Heterogeneous Features for Image Classification", accepted to IEEE Transactions on Image Processing (TIP), 2014. [PDF] [BibTeX]

References

  • [Wang, CVPR10] Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, and Yihong Gong, "Locality-Constrained Linear Coding for Image Classification", in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
  • [Lazebnik, CVPR06] Svetlana Lazebnik, Cordelia Schmid and Jean Ponce, "Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories", in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006.
  • [Lowe, IJCV04] David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints", in International Journal of Computer Vision (IJCV), 2004.
  • [Canny, PAMI86] John Canny, "A Computational Approach to Edge Detection", in IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 1986.