Wednesday, January 19, 2011

SURF Detector

FAST-Hessian Detector + SURF Descriptor

Key Point Detection
  • Build Integral Image
  • Uses Pyramid of Filter (not image) to approximate Laplace of Gaussian (LoG), supposedly run faster than SIFT, which uses DoG for approximation.
  • Filters will span several octaves and with fixed number of scales in each, similar to SIFT. This type of process is Scale-Space Analysis.
  • Integral image helps to keep the running speed constant as it is insensitive to increasing filter sizes.
  • Uses discrete Box Filter to calculate Hessian determinant. Box size in multiples of its current scale.
  • All filters run on the the original image instead of iterative like SIFT, allowing parallel execution.
  • Feature points are maxima of the determinants in the adjacent scale and points (3x3x3), similar to SIFT.
Orientation Assignment
  • The responses from 2 first-order Haar wavelet filters (1, -1), in dx and dy orientations, are collected on each feature point. The responses are put on a 2D plane as vectors [dx, dy].
  • Again, the Integral Image would help with this box filter.
  • An orientation-window of 60-degree-angle is slided around the origin at x-y plane.  Each angle will have a corresponding sum of magnitudes for every vectors inside that window.
  • The dominant orientation would be the angle of the largest sum.
Descriptor Extraction
  • The descriptor is 64-element vector, half the size of SIFT. Each represents the intensity property of a 4x4 square within the "interest region". The intensity property is 4-element tuple [ Sum(dx), Sum(abs(dx)),  Sum(dy),  Sum(abs(dy)) ]. The Sum is calculated from the Haar wavelet response of a 5x5 sample window.
  • 4x4x4 = 64
  • Implying that there are more than 1 4x4 square at each key-point location. I suppose the size of the region is related to the associated scale.
Matching
  • The sign (+ive/-ive) of the Trace of Laplacian (Hessian Matrix) is used in matching phase, speeding up the process.
Code
  • It is credited to Liu Liu with modifications by Ian Mahon. The authors state the current gotchas and gives a good overview of the code at the file (surf.cpp) header.
  • The code did try to exploit parallelism using OpenMP (cv::parallel_for).
  • Define some parameter structure and default values similar SIFT in the future (start with cvSURFParams?).
  • The 'extended' parameter changes the descriptor size from 64 to 128. It is default to '1' in the code, but it has default argument value of 'false' at the SURF::SURF(...) constructor.
  • Why not integrate OpenSURF by Chris Evans? (With hyperlink to CMU CV site with Image Databases and other CV resources).
Sample
  • With default parameters, SIFT takes 3-4 seconds and detect 872 points on Kobe Bryants picture. SURF takes less than 1 second to detect 413 points.
  • SURF is able to handle the Obama picture at its original resolution, unlike SIFT. Takes 11 seconds to detect 2690 points.
  • Decrease/Increase the number of detected points by increasing/lowering the threshold argument. Only the points with Hessian response higher than thresholds are considered.

Commercial Application (academic spin-off)

Readings
  • Speeded-Up Robust Features (SURF). Herbert Bay et al.
  • CPSC 643 Presentation of SURF, Herbert Bay et al.
  • Non-Maxima Suppression Demo
  • Wavelets in Multiresolution Analysis by Tom Germano
  • A Tutorial on Wavelets and their Applications by Martin J. Mohlenkamp at the University of Colorado, Boulder
  • WAVELETS FOR KIDS - A Tutorial Introduction by B. Vidakovic, P. Mueller at Duke University.
  • Wavelets: A Tutorial by Zhuming Lam (Apr-2002)
Repeatability
  • The term 'repeatability' (seen on SIFT and SURF papers) is a measure of the ability to detect the same set of key-point from various viewpoints. For example, the rotation of Van Gogh image (Fig 3 of the SURF paper by Bay et al.).
  • Origin of this term: See "Evaluation of Interest Point Detectors" paper by Shmid et al

No comments:

Post a Comment