Part and Attribute Discovery from Relative Annotations

Maji, Subhransu; Shakhnarovich, Gregory

doi:10.1007/s11263-014-0716-6

Published: 26 April 2014

Volume 108, pages 82–96 (2014)

International Journal of Computer Vision

Abstract

Part and attribute based representations are widely used to support high-level search and retrieval applications. However, learning computer vision models for automatically extracting these from images requires significant effort in the form of part and attribute labels and annotations. We propose an annotation framework based on comparisons between pairs of instances within a set, which aims to reduce the overhead in manually specifying the set of part and attribute labels. Our comparisons are based on intuitive properties such as correspondences and differences, which are applicable to a wide range of categories. Moreover, they require few category specific instructions and lead to simple annotation interfaces compared to traditional approaches. On a number of visual categories we show that our framework can use noisy annotations collected via “crowdsourcing” to discover semantic parts useful for detection and parsing, as well as attributes suitable for fine-grained recognition.
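As a rough illustration of how pairwise correspondence annotations might be pooled into candidate parts, the sketch below groups annotated points that are linked, directly or transitively, by a correspondence. This is a minimal sketch under stated assumptions and not the method of the article: the record format, the identifiers, and the function name `group_correspondences` are hypothetical, and a real pipeline working with noisy crowdsourced labels would need something more robust than transitive grouping.

```python
from collections import defaultdict

# Hypothetical record format (not from the article): each annotation states
# that a point marked in one image corresponds to a point marked in another.
annotations = [
    (("img1", "p0"), ("img2", "p3")),
    (("img2", "p3"), ("img3", "p1")),
    (("img1", "p5"), ("img3", "p2")),
]


def group_correspondences(pairs):
    """Group annotated points into clusters with union-find.

    Points linked directly or transitively by a pairwise correspondence
    end up in the same cluster; each cluster is a candidate "part".
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    # Sort for a deterministic result.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


candidate_parts = group_correspondences(annotations)
```

Here the first two annotations chain three points across three images into one cluster, while the third annotation forms a separate two-point cluster.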

Notes

http://en.wikipedia.org/wiki/Florida_Scrub_Jay.

Acknowledgments

Part of the work was done by SM during a workshop (http://www.clsp.jhu.edu/workshops/archive/ws-12/groups/tduosn/) at the CLSP, Johns Hopkins University.

Author information

Authors and Affiliations

Toyota Technological Institute at Chicago, 6045 S. Kenwood Ave, Chicago, IL 60637, USA
Subhransu Maji & Gregory Shakhnarovich

Corresponding author

Correspondence to Subhransu Maji.

Additional information

Communicated by Serge Belongie and Kristen Grauman.

About this article

Cite this article

Maji, S., Shakhnarovich, G. Part and Attribute Discovery from Relative Annotations. Int J Comput Vis 108, 82–96 (2014). https://doi.org/10.1007/s11263-014-0716-6

Received: 25 February 2013
Accepted: 14 March 2014
Published: 26 April 2014
Issue Date: May 2014
DOI: https://doi.org/10.1007/s11263-014-0716-6
