Application of Analytics in Machine Vision using Big Data

Noha Elfiky

Abstract


The Bag-of-Words (BoW) approach has been applied successfully to category-level image classification. Spatial Pyramids (SPs) are used to incorporate spatial image information into the BoW model. However, spatial pyramids are rigid: they rely on pre-defined grid configurations. As a consequence, they often fail to coincide with the underlying spatial structure of images from different categories, which may reduce classification accuracy.
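To make the rigid grid concrete, the following is a minimal sketch of a standard spatial pyramid over quantized local features. It assumes each feature is already given as an (x, y, visual-word index) triple; the function name, the toy features, and the omission of per-level weights are illustrative choices, not details taken from the paper.

```python
def spatial_pyramid_histogram(features, width, height, vocab_size, levels=2):
    """Concatenate visual-word histograms over fixed 1x1, 2x2, 4x4, ... grids.

    features: list of (x, y, word_id) triples, an assumed input format.
    Per-level weighting is left out to keep the sketch short.
    """
    descriptor = []
    for level in range(levels + 1):
        cells = 2 ** level  # number of cells along each axis at this level
        hists = [[0] * vocab_size for _ in range(cells * cells)]
        for x, y, word in features:
            # The cell is chosen by position alone, independent of image content.
            col = min(int(x * cells / width), cells - 1)
            row = min(int(y * cells / height), cells - 1)
            hists[row * cells + col][word] += 1
        for hist in hists:
            descriptor.extend(hist)
    return descriptor


# Toy example: four features in a 100x100 image, vocabulary of 3 visual words.
features = [(10, 10, 0), (90, 10, 1), (10, 90, 1), (90, 90, 2)]
desc = spatial_pyramid_histogram(features, 100, 100, vocab_size=3, levels=1)
```

Because the cell boundaries depend only on image coordinates, the same grid is applied to every image regardless of its scene layout, which is the rigidity described above.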

This paper aims to use 3D scene geometry to steer the layout of spatial pyramids for category-level image classification (object recognition). The proposed approach builds an image representation by inferring the constituent geometrical parts of a scene. As a result, the representation retains descriptive spatial information and yields a structural description of the image. Large-scale experiments on the Pascal VOC2007 and Caltech101 datasets indicate that the proposed Generic SPs outperform standard SPs.
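The idea of pooling over inferred scene parts rather than fixed grid cells can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify how the geometrical parts are inferred, so the region assignment here is a hypothetical stand-in (a hand-written `toy_regions` function), and `region_pooled_histogram` is an invented name rather than the author's method.

```python
def region_pooled_histogram(features, region_of, num_regions, vocab_size):
    """Build one visual-word histogram per scene region and concatenate them.

    features:  list of (x, y, word_id) triples, an assumed input format.
    region_of: callable mapping (x, y) to a region index; a hypothetical
               placeholder for whatever geometry inference is used.
    """
    hists = [[0] * vocab_size for _ in range(num_regions)]
    for x, y, word in features:
        hists[region_of(x, y)][word] += 1
    return [count for hist in hists for count in hist]


def toy_regions(x, y):
    # Hypothetical two-part layout: region 0 above y = 40, region 1 below.
    return 0 if y < 40 else 1


features = [(10, 10, 0), (90, 10, 1), (10, 90, 1), (90, 90, 2)]
desc = region_pooled_histogram(features, toy_regions, num_regions=2, vocab_size=3)
```

In this sketch the pooling regions can differ from image to image, since they come from the region assignment rather than from a pre-defined grid.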


Keywords


Big Data Analytics, Machine Vision, Image Classification and Object Recognition Tasks, Bag of Words, Spatial Pyramids


DOI: https://doi.org/10.24203/ajas.v7i4.5910


Creative Commons License
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.