Computer Vision Metrics: Survey, Taxonomy, and Analysis was published by Apress in May 2014. The book has 508 pages, is written in English, and carries ISBN-13 978-1430259299.
Computer Vision Metrics provides an extensive survey and analysis of over 100 current and historical feature description and machine vision methods, with a detailed taxonomy for local, regional and global features. This book provides the background needed to develop intuition about why interest point detectors and feature descriptors actually work and how they are designed, with observations about tuning the methods to meet robustness and invariance targets for specific applications. The survey is broader than it is deep, with over 540 references provided to dig deeper. The taxonomy includes search methods, spectra components, descriptor representation, shape, distance functions, accuracy, efficiency, robustness and invariance attributes, and more. Rather than providing ‘how-to’ source code examples and shortcuts, this book provides a counterpoint discussion to the many fine OpenCV community source code resources available for hands-on practitioners.
What you’ll learn
- Interest point & descriptor concepts (interest points, corners, ridges, blobs, contours, edges, maxima), interest point tuning and culling, interest point methods (Laplacian, LoG, Moravec, Harris, Harris-Stephens, Shi-Tomasi, Hessian, difference of Gaussians, salient regions, MSER, SUSAN, FAST, FASTER, AGAST, local curvature, morphological regions, and more), descriptor concepts (shape, sampling pattern, spectra, gradients, binary patterns, basis features), feature descriptor families.
- Local binary descriptors (LBP, LTP, FREAK, ORB, BRISK, BRIEF, CENSUS, and more).
- Gradient descriptors (SIFT, SIFT-PCA, SIFT-SIFER, SIFT-GLOH, Root SIFT, CenSurE, STAR, HOG, PHOG, DAISY, O-DAISY, CARD, RFM, RIFF-CHOG, LGP, and more).
- Shape descriptors (image moments, area, perimeter, centroid, D-NETS, chain codes, Fourier descriptors, wavelets, and more), texture descriptors, structural and statistical (Haralick, SDM, extended SDM, edge metrics, Laws metrics, RILBP, and more).
- 3D descriptors for depth-based, volumetric, and activity recognition spatio-temporal data sets (3D HOG, HON4D, 3D SIFT, LBP-TOP, VLBP, and more).
- Basis space descriptors (Zernike moments, KL, SLANT, steerable filter basis sets, sparse coding, codebooks, descriptor vocabularies, and more), Haar methods (SURF, USURF, MUSURF, GSURF, Viola-Jones, and more), descriptor-based image reconstruction.
- Distance functions (Euclidean, SAD, SSD, correlation, Hellinger, Manhattan, Chebyshev, EMD, Wasserstein, Mahalanobis, Bray-Curtis, Canberra, L0, Hamming, Jaccard), coordinate spaces, robustness and invariance criteria.
- Image formation, including CCD and CMOS sensors for 2D and 3D imaging and sensor processing topics, with a survey identifying over fourteen 3D depth sensing methods that emphasizes stereo, MVS, and structured light.
- Image pre-processing methods, with examples targeting specific feature descriptor families (point, line and area methods, basis space methods), and colorimetry (CIE, HSV, RGB, CAM02, gamut mapping, and more).
- Ground truth data, with some best practices and examples, and a survey of real and synthetic datasets.
- Vision pipeline optimizations, mapping algorithms to compute resources (CPU, GPU, DSP, and more), hypothetical high-level vision pipeline examples (face recognition, object recognition, image classification, augmented reality), optimization alternatives with consideration for performance and power to make effective use of SIMD, VLIW, kernels, threads, parallel languages, memory, and more.
- Synthetic interest point alphabet analysis against 10 common OpenCV detectors to develop intuition about how different classes of detectors actually work (SIFT, SURF, BRISK, FAST, HARRIS, GFTT, MSER, ORB, STAR, SIMPLEBLOB). Source code provided online.
- Visual learning concepts: although these are not the focus of this book, a light introduction is provided to machine learning and statistical learning topics such as convolutional networks, neural networks, classification and training, clustering, and error minimization methods (SVMs, kernel machines, KNN, RANSAC, HMM, GMM, LM, and more). Ample references are provided to dig deeper.
Who this book is for
Engineers, scientists, and academic researchers in areas including media processing, computational photography, video analytics, scene understanding, machine vision, face recognition, gesture recognition, pattern recognition and general object analysis.