SURF (Speeded Up Robust Features) is a feature detector and descriptor. We talked about SIFT before, and SURF is a sort of derivative of SIFT. SURF is based on sums of 2D Haar wavelet responses and makes efficient use of integral images.

I won’t tell the whole story of SURF, because its idea is very similar to SIFT; I’ll only talk about the differences between the two methods.


In SIFT, we use the Difference of Gaussians (DoG) to build the image pyramid; in SURF, we simply use an integer approximation to the determinant-of-Hessian blob detector.

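Given a pixel, the Hessian of the image intensity f at this pixel is:

$$ H(f(x, y)) = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\ \dfrac{\partial^2 f}{\partial x \partial y} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} $$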

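To adapt to any scale, we filter the image with a Gaussian kernel, so given a point X = (x, y), the Hessian matrix H(X, σ) at X and scale σ is defined as:

$$ H(X, \sigma) = \begin{bmatrix} L_{xx}(X, \sigma) & L_{xy}(X, \sigma) \\ L_{xy}(X, \sigma) & L_{yy}(X, \sigma) \end{bmatrix} $$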

where Lxx(X, σ) is the convolution of the Gaussian second-order derivative with the image I at point X, and similarly for Lxy(X, σ) and Lyy(X, σ).

Computing this takes two steps: first a convolution with a Gaussian, then a second-order derivative. SURF approximates both steps with one single box filter.


These box filters approximate second-order Gaussian derivatives and can be evaluated at a very low computational cost using integral images, which is part of the reason why SURF is fast.
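To make the integral-image trick concrete, here is a minimal sketch in plain C++ (no OpenCV). The `IntegralImage` struct and its `boxSum` method are names I made up for illustration; the point is only that the sum over any rectangle costs four table lookups, no matter how large the box is.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Integral image padded with one extra row and column of zeros, so that
// at(r, c) holds the sum of all pixels above and to the left of (r, c).
struct IntegralImage {
    std::size_t rows, cols;       // size of the source image
    std::vector<long long> sum;   // (rows + 1) x (cols + 1) table, row-major

    explicit IntegralImage(const std::vector<std::vector<int> >& img)
        : rows(img.size()), cols(img.empty() ? 0 : img[0].size()),
          sum((rows + 1) * (cols + 1), 0) {
        for (std::size_t r = 0; r < rows; ++r) {
            assert(img[r].size() == cols);  // the image must be rectangular
            for (std::size_t c = 0; c < cols; ++c) {
                at(r + 1, c + 1) = img[r][c] + at(r, c + 1) + at(r + 1, c) - at(r, c);
            }
        }
    }

    long long& at(std::size_t r, std::size_t c) { return sum[r * (cols + 1) + c]; }
    long long at(std::size_t r, std::size_t c) const { return sum[r * (cols + 1) + c]; }

    // Sum of the pixels in rows [r0, r1) and columns [c0, c1):
    // four lookups, independent of the box size.
    long long boxSum(std::size_t r0, std::size_t c0,
                     std::size_t r1, std::size_t c1) const {
        assert(r0 <= r1 && r1 <= rows && c0 <= c1 && c1 <= cols);
        return at(r1, c1) - at(r0, c1) - at(r1, c0) + at(r0, c0);
    }
};
```

A box filter response is then just a weighted combination of a few `boxSum` calls, one per lobe of the filter.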

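Now we can represent the (approximated) determinant of the Hessian as:

$$ \det(H_{\text{approx}}) = D_{xx} D_{yy} - (w D_{xy})^2 $$

where Dxx, Dyy and Dxy are the box-filter approximations of Lxx, Lyy and Lxy, and we can use 0.9 for w, following Bay’s suggestion.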
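As a small worked example of this formula, the sketch below computes the blob response from three box-filter responses. The function names are hypothetical, and I assume the responses have already been normalised by the filter area; the threshold plays the same role as the Hessian threshold passed to the detector in the OpenCV example further down.

```cpp
#include <cassert>

// Approximated determinant of the Hessian from the three box-filter
// responses. w balances the box-filter approximation (0.9 by default).
double hessianDeterminant(double dxx, double dyy, double dxy, double w = 0.9) {
    assert(w > 0.0);
    return dxx * dyy - (w * dxy) * (w * dxy);
}

// A point is kept as a blob candidate when its response exceeds a threshold.
bool isBlobCandidate(double dxx, double dyy, double dxy, double threshold) {
    return hessianDeterminant(dxx, dyy, dxy) > threshold;
}
```

For instance, with Dxx = 2, Dyy = 3 and Dxy = 1 the response is 2 * 3 - (0.9 * 1)^2 = 5.19.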


In SIFT, we use DoG to build the image pyramid; the pyramid has several octaves, and each octave has several image layers. The difference between the SIFT pyramid and the SURF pyramid is that SIFT changes the scale of the image, while SURF changes the scale of the Gaussian (box-filter) masks and leaves the image size unaltered. This saves a lot of time because the image is never downsampled.


 Instead of iteratively reducing the image size (left), the use of integral images allows the up-scaling of the filter at constant cost (right).
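To give a feeling for how the filter grows instead of the image shrinking, here is a sketch that lists the box-filter sizes per octave. It assumes the layout reported in Bay et al.'s paper (a 9x9 filter for the smallest scale of 1.2, a size step of 6 in the first octave, and a doubled step in each further octave); the function names are my own.

```cpp
#include <cassert>
#include <vector>

// Box-filter side lengths for one octave (octave 0 is the finest).
// Assumed layout: octave 0 -> 9, 15, 21, 27; octave 1 -> 15, 27, 39, 51; ...
std::vector<int> filterSizes(int octave, int layers = 4) {
    assert(octave >= 0 && octave < 8 && layers > 0);
    const int first = 3 * ((2 << octave) + 1);  // 9, 15, 27, 51, ...
    const int step = 6 << octave;               // 6, 12, 24, 48, ...
    std::vector<int> sizes;
    for (int i = 0; i < layers; ++i) {
        sizes.push_back(first + i * step);
    }
    return sizes;
}

// Scale associated with a filter size: the 9x9 filter corresponds to 1.2.
double scaleForFilter(int filterSize) {
    assert(filterSize >= 9);
    return 1.2 * filterSize / 9.0;
}
```

Whatever the size, each filter is evaluated on the same integral image with a fixed number of lookups, so a 27x27 filter costs the same as a 9x9 one.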


In SIFT, we build an orientation histogram, find the largest bin and also any bins above 80% of the largest, and use these orientations as the main orientations of the feature descriptor. In SURF, we use the sums of the Haar wavelet responses around the point of interest.


We first calculate the Haar wavelet responses in the x and y directions within a circular neighborhood of radius 6s around the interest point, where s is the scale at which the interest point was detected. We then sum the horizontal and vertical wavelet responses inside a scanning sector that covers an angle of pi/3, rotate the sector, and recalculate, until we find the orientation with the largest summed response. This orientation is the main orientation of the feature descriptor.
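The sliding-sector search can be sketched as follows. This is a simplified version under my own assumptions: the `HaarResponse` struct and `dominantOrientation` function are hypothetical names, the sector is moved in a fixed number of steps around the circle, and the Gaussian weighting of the responses used in the original paper is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct HaarResponse {
    double dx;  // Haar wavelet response in the x direction
    double dy;  // Haar wavelet response in the y direction
};

// Slides a sector of angle pi/3 around the circle, sums the responses that
// fall inside it, and returns the direction (radians) of the longest sum.
double dominantOrientation(const std::vector<HaarResponse>& responses,
                           int steps = 72) {
    assert(steps > 0);
    const double pi = 3.14159265358979323846;
    const double window = pi / 3.0;
    double bestNorm2 = -1.0, bestAngle = 0.0;
    for (int i = 0; i < steps; ++i) {
        const double start = -pi + 2.0 * pi * i / steps;  // sector start angle
        double sumX = 0.0, sumY = 0.0;
        for (std::size_t k = 0; k < responses.size(); ++k) {
            const double angle = std::atan2(responses[k].dy, responses[k].dx);
            // Offset from the sector start, wrapped into [0, 2*pi).
            const double offset = std::fmod(angle - start + 2.0 * pi, 2.0 * pi);
            if (offset < window) {
                sumX += responses[k].dx;
                sumY += responses[k].dy;
            }
        }
        const double norm2 = sumX * sumX + sumY * sumY;
        if (norm2 > bestNorm2) {
            bestNorm2 = norm2;
            bestAngle = std::atan2(sumY, sumX);
        }
    }
    return bestAngle;
}
```

Because the result is the direction of the summed vector rather than the sector's start angle, the step count only controls how finely the sectors are tried.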

Now it’s time to extract the descriptor. First, we construct a square region centered on the feature point and oriented along the main orientation we found above; the size of this window is 20s, where s is the scale at which the interest point was detected. Second, we split this region regularly into 4*4 smaller square sub-regions, and for each sub-region we compute Haar wavelet responses at 5*5 regularly spaced sample points.


For each sub-region, we sum the responses in the x and y directions, and we also sum the absolute values of the responses, so each sub-region has a 4-D descriptor vector v = (Σdx, Σdy, Σ|dx|, Σ|dy|). Concatenating this for all 4*4 sub-regions, our final descriptor is a 64-D vector. (In SIFT, the descriptor is a 128-D vector, so this is part of the reason that SURF is faster than SIFT.)
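The bookkeeping of the 64-D vector is easy to get wrong, so here is a sketch of it. It assumes the Haar responses have already been sampled on a 20x20 grid inside the oriented window and stored row-major in two vectors; `buildDescriptor` is a hypothetical name. Bay et al. additionally normalise the vector to unit length, which is left out here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// dx, dy: Haar responses on a 20x20 sample grid (row-major), i.e. 4x4
// sub-regions of 5x5 samples each. Returns the 64-D descriptor.
std::vector<double> buildDescriptor(const std::vector<double>& dx,
                                    const std::vector<double>& dy) {
    const std::size_t grid = 20, cell = 5, cells = 4;
    assert(dx.size() == grid * grid && dy.size() == grid * grid);
    std::vector<double> desc;
    desc.reserve(cells * cells * 4);
    for (std::size_t cy = 0; cy < cells; ++cy) {
        for (std::size_t cx = 0; cx < cells; ++cx) {
            double sumDx = 0.0, sumDy = 0.0, sumAbsDx = 0.0, sumAbsDy = 0.0;
            for (std::size_t y = 0; y < cell; ++y) {
                for (std::size_t x = 0; x < cell; ++x) {
                    const std::size_t idx = (cy * cell + y) * grid + (cx * cell + x);
                    sumDx += dx[idx];
                    sumDy += dy[idx];
                    sumAbsDx += std::fabs(dx[idx]);
                    sumAbsDy += std::fabs(dy[idx]);
                }
            }
            // The 4-D vector v of this sub-region.
            desc.push_back(sumDx);
            desc.push_back(sumDy);
            desc.push_back(sumAbsDx);
            desc.push_back(sumAbsDy);
        }
    }
    return desc;
}
```

Keeping both the signed and the absolute sums lets the descriptor tell a uniform gradient apart from an oscillating pattern whose signed responses cancel out.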


It is easy to call the SURF functions in OpenCV. Here is a simple example:

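    // Header locations below assume OpenCV 2.4.x
    #include <opencv2/opencv.hpp>
    #include <opencv2/nonfree/features2d.hpp> // SURF
    #include <opencv2/legacy/legacy.hpp>      // BruteForceMatcher
    #include <vector>
    using namespace cv;
    using namespace std;

    int main()
    {
        // load both images and convert them to grayscale
        Mat img_1, img_2;
        Mat img_1c = imread("corner0001.JPG");
        Mat img_2c = imread("corner0002.JPG");
        cvtColor(img_1c, img_1, CV_BGR2GRAY);
        cvtColor(img_2c, img_2, CV_BGR2GRAY);

        // detect keypoints; 2.50e3 is the Hessian threshold
        vector<KeyPoint> keypoints_1;
        vector<KeyPoint> keypoints_2;
        SurfFeatureDetector surf(2.50e3);
        surf(img_1, Mat(), keypoints_1);
        surf(img_2, Mat(), keypoints_2);

        // compute descriptors
        SurfDescriptorExtractor extractor;
        Mat descriptors_1, descriptors_2;
        extractor.compute(img_1, keypoints_1, descriptors_1);
        extractor.compute(img_2, keypoints_2, descriptors_2);

        // use brute force method to match vectors
        BruteForceMatcher<L2<float> > matcher;
        vector<DMatch> matches;
        matcher.match(descriptors_1, descriptors_2, matches);

        // draw results
        Mat img_matches;
        drawMatches(img_1, keypoints_1, img_2, keypoints_2, matches, img_matches);
        imshow("Matches", img_matches);
        waitKey(0);
        return 0;
    }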

And the result is good:


Now we can use methods like RANSAC to eliminate the bad matches…
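To show the idea behind RANSAC on matches, here is a deliberately simplified sketch in plain C++. It assumes the two images differ only by a translation, which is not true in general; in practice one would fit a homography or a fundamental matrix instead (for example with OpenCV's `findHomography` and its RANSAC option). `PointPair` and `ransacTranslation` are hypothetical names, and the fixed seed is only there to make runs repeatable.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

struct PointPair {
    double x1, y1;  // keypoint position in the first image
    double x2, y2;  // matched keypoint position in the second image
};

// RANSAC with a pure-translation model (a simplifying assumption).
// Returns one flag per match: true for inliers of the best translation found.
std::vector<bool> ransacTranslation(const std::vector<PointPair>& matches,
                                    double tolerance, int iterations,
                                    unsigned seed = 42) {
    assert(!matches.empty() && tolerance >= 0.0 && iterations > 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, matches.size() - 1);

    std::vector<bool> best(matches.size(), false);
    std::size_t bestCount = 0;
    for (int it = 0; it < iterations; ++it) {
        // 1. Hypothesis: the translation implied by one random match.
        const PointPair& s = matches[pick(rng)];
        const double tx = s.x2 - s.x1, ty = s.y2 - s.y1;

        // 2. Consensus: collect the matches that agree with it.
        std::vector<bool> inlier(matches.size(), false);
        std::size_t count = 0;
        for (std::size_t i = 0; i < matches.size(); ++i) {
            const double ex = matches[i].x1 + tx - matches[i].x2;
            const double ey = matches[i].y1 + ty - matches[i].y2;
            if (std::sqrt(ex * ex + ey * ey) <= tolerance) {
                inlier[i] = true;
                ++count;
            }
        }
        // 3. Keep the hypothesis with the largest consensus set.
        if (count > bestCount) {
            bestCount = count;
            best = inlier;
        }
    }
    return best;
}
```

Matches flagged false are the bad ones to drop; the same hypothesise-and-vote loop carries over to richer models, only the minimal sample size and the error measure change.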

In my 3-D reconstruction project, I decided to use SURF instead of SIFT because of its higher speed.


