OpenCV Adventure: SURF Detector

FAST-Hessian Detector + SURF Descriptor

Key Point Detection

Build Integral Image
Uses Pyramid of Filter (not image) to approximate Laplace of Gaussian (LoG), supposedly run faster than SIFT, which uses DoG for approximation.
Filters will span several octaves and with fixed number of scales in each, similar to SIFT. This type of process is Scale-Space Analysis.
Integral image helps to keep the running speed constant as it is insensitive to increasing filter sizes.
Uses discrete Box Filter to calculate Hessian determinant. Box size in multiples of its current scale.
All filters run on the the original image instead of iterative like SIFT, allowing parallel execution.
Feature points are maxima of the determinants in the adjacent scale and points (3x3x3), similar to SIFT.

Orientation Assignment

The responses from 2 first-order Haar wavelet filters (1, -1), in dx and dy orientations, are collected on each feature point. The responses are put on a 2D plane as vectors [dx, dy].
Again, the Integral Image would help with this box filter.
An orientation-window of 60-degree-angle is slided around the origin at x-y plane. Each angle will have a corresponding sum of magnitudes for every vectors inside that window.
The dominant orientation would be the angle of the largest sum.

Descriptor Extraction

The descriptor is 64-element vector, half the size of SIFT. Each represents the intensity property of a 4x4 square within the "interest region". The intensity property is 4-element tuple [ Sum(dx), Sum(abs(dx)), Sum(dy), Sum(abs(dy)) ]. The Sum is calculated from the Haar wavelet response of a 5x5 sample window.
4x4x4 = 64
Implying that there are more than 1 4x4 square at each key-point location. I suppose the size of the region is related to the associated scale.

Matching

The sign (+ive/-ive) of the Trace of Laplacian (Hessian Matrix) is used in matching phase, speeding up the process.

Code

It is credited to Liu Liu with modifications by Ian Mahon. The authors state the current gotchas and gives a good overview of the code at the file (surf.cpp) header.
The code did try to exploit parallelism using OpenMP (cv::parallel_for).
Define some parameter structure and default values similar SIFT in the future (start with cvSURFParams?).
The 'extended' parameter changes the descriptor size from 64 to 128. It is default to '1' in the code, but it has default argument value of 'false' at the SURF::SURF(...) constructor.
Why not integrate OpenSURF by Chris Evans? (With hyperlink to CMU CV site with Image Databases and other CV resources).

Sample

With default parameters, SIFT takes 3-4 seconds and detect 872 points on Kobe Bryants picture. SURF takes less than 1 second to detect 413 points.
SURF is able to handle the Obama picture at its original resolution, unlike SIFT. Takes 11 seconds to detect 2690 points.
Decrease/Increase the number of detected points by increasing/lowering the threshold argument. Only the points with Hessian response higher than thresholds are considered.

Commercial Application (academic spin-off)

Readings

Speeded-Up Robust Features (SURF). Herbert Bay et al.
CPSC 643 Presentation of SURF, Herbert Bay et al.
Non-Maxima Suppression Demo
Wavelets in Multiresolution Analysis by Tom Germano
A Tutorial on Wavelets and their Applications by Martin J. Mohlenkamp at the University of Colorado, Boulder
WAVELETS FOR KIDS - A Tutorial Introduction by B. Vidakovic, P. Mueller at Duke University.
Wavelets: A Tutorial by Zhuming Lam (Apr-2002)

Repeatability

The term 'repeatability' (seen on SIFT and SURF papers) is a measure of the ability to detect the same set of key-point from various viewpoints. For example, the rotation of Van Gogh image (Fig 3 of the SURF paper by Bay et al.).
Origin of this term: See "Evaluation of Interest Point Detectors" paper by Shmid et al

OpenCV Adventure