Image pyramids have a lot of advantages in digital image processing, and this article is an introduction to them.

**WHY PYRAMIDS?**

We all know the two popular domains for representing an image: the spatial domain and the frequency domain.

*Van Gogh self-portrait and the same image in the Fourier domain*

The spatial domain is exactly what we see in everyday life: trees, cars, people’s faces, and so on. It tells us where things are. Our eyes pick up the luminance value at every pixel, and of course our brain is familiar with this kind of image. However, an image in the spatial domain tells us little about what lies deeper in the image, such as how its content is split between coarse structure and fine detail.

The frequency domain is another representation of an image: it shows which frequencies the image contains. Low frequencies represent overall shape and smooth regions, while high frequencies represent sharper details such as edges and corners. Analyzing an image in the frequency domain tells us a lot about its detail, but in this domain we have no idea where the objects are located in the image.

That’s why we need image pyramids: an image pyramid represents the spatial appearance of an image at several different frequency bands.

**TYPES OF IMAGE PYRAMIDS**

There are several kinds of image pyramids, including the Gaussian pyramid, the Laplacian pyramid, wavelet/QMF pyramids, and the steerable pyramid. In this article I’ll introduce the Gaussian and Laplacian pyramids. Wavelet and steerable pyramids are long stories, and I’ll cover them in future articles.

1. Gaussian pyramid

All a Gaussian pyramid does is repeat two steps: filtering and subsampling.

For the one-dimensional case, the picture above shows how a Gaussian pyramid works. The value of each node in the bottom (zero) level is just the gray level of the corresponding image pixel; the value of each node in a higher level is a weighted average of node values in the next lower level.
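To make the weighted average concrete, here is a minimal 1D sketch in plain C++ (no OpenCV). It is only an illustration under a few assumptions of mine: the weights are the common 5-tap kernel [1 4 6 4 1] / 16, indices past the border are mirrored, and the names `reflectIndex`, `reduce1D` and `gaussianPyramid1D` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirror an out-of-range index back into [0, n) without repeating the edge
// sample (e.g. -1 -> 1, n -> n - 2). This border rule is an assumption.
inline std::size_t reflectIndex(long i, long n) {
    if (n == 1) return 0;
    while (i < 0 || i >= n) {
        if (i < 0) i = -i;
        if (i >= n) i = 2 * (n - 1) - i;
    }
    return static_cast<std::size_t>(i);
}

// One reduction step: every node of the upper level is a weighted average of
// five neighbouring nodes of the lower level, centred on every second sample.
std::vector<double> reduce1D(const std::vector<double>& lower) {
    assert(!lower.empty());
    static const double w[5] = {1.0 / 16, 4.0 / 16, 6.0 / 16, 4.0 / 16, 1.0 / 16};
    const long n = static_cast<long>(lower.size());
    std::vector<double> upper((lower.size() + 1) / 2);
    for (long k = 0; k < static_cast<long>(upper.size()); ++k) {
        double sum = 0.0;
        for (long t = -2; t <= 2; ++t)
            sum += w[t + 2] * lower[reflectIndex(2 * k + t, n)];
        upper[static_cast<std::size_t>(k)] = sum;
    }
    return upper;
}

// Build the pyramid: level 0 is the signal itself, each further level is a
// reduced copy of the level below. Stops early once a level has one sample.
std::vector<std::vector<double>> gaussianPyramid1D(const std::vector<double>& signal,
                                                   int levels) {
    std::vector<std::vector<double>> pyramid;
    pyramid.push_back(signal);
    for (int i = 1; i < levels && pyramid.back().size() > 1; ++i)
        pyramid.push_back(reduce1D(pyramid.back()));
    return pyramid;
}
```

Because the weights sum to 1, a constant signal stays constant from level to level, and a single bright sample gets spread over its neighbours in the level above.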

In the 2D case with a scale factor of 2, each level has a quarter of the pixels of the level below it, so the total number of pixels in the pyramid is:

1 + 1/4 + 1/16 + 1/64 + … = 4/3 times the size of the original image.
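If you want to check the 4/3 figure numerically, a small sketch like the one below will do. It assumes each level has half the width and height of the previous one, rounded up (which is the default output size of OpenCV's `pyrDown`), and that the pyramid is built all the way down to a 1x1 level. The function names are mine.

```cpp
#include <cassert>
#include <cstdint>

// Total number of pixels stored in a pyramid whose level 0 is width x height.
// Each further level halves both dimensions (rounding up) until 1x1 is reached.
std::uint64_t pyramidPixelCount(std::uint64_t width, std::uint64_t height) {
    assert(width > 0 && height > 0);
    std::uint64_t total = 0;
    while (true) {
        total += width * height;
        if (width == 1 && height == 1) break;
        width = (width + 1) / 2;
        height = (height + 1) / 2;
    }
    return total;
}

// Ratio of pyramid storage to the original image size.
double pyramidOverhead(std::uint64_t width, std::uint64_t height) {
    return static_cast<double>(pyramidPixelCount(width, height)) /
           static_cast<double>(width * height);
}
```

For a 1024x1024 image this gives 1,398,101 pixels in total, a ratio of about 1.3333, so the whole pyramid costs only about a third more memory than the image itself.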

In OpenCV, we can easily create the next level of a Gaussian pyramid by using:

pyrDown(src, dst);

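What OpenCV does to produce layer i+1 of the Gaussian pyramid from layer Gi is something like this:

- Convolve Gi with a Gaussian kernel (in OpenCV, a 5×5 kernel equal to 1/256 times the outer product of [1 4 6 4 1] with itself).

- Remove every even-numbered row and column.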
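The two steps above can be sketched without OpenCV as follows. This is not OpenCV's actual implementation, only a minimal illustration: `Image` is a hypothetical stand-in for `cv::Mat`, the blur is done as two 1D passes with [1 4 6 4 1] / 16 (equivalent to the 5×5 kernel), borders are mirrored (my assumption), and the subsampling keeps rows and columns 0, 2, 4, … counting from zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal grayscale image stored row by row. Hypothetical helper type that
// stands in for cv::Mat so the sketch needs no external library.
struct Image {
    int rows = 0, cols = 0;
    std::vector<float> data;
    Image(int r, int c, float v = 0.f)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c, v) {}
    float& at(int r, int c) { return data[static_cast<std::size_t>(r) * cols + c]; }
    float at(int r, int c) const { return data[static_cast<std::size_t>(r) * cols + c]; }
};

// Mirror an index into [0, n) without repeating the border pixel (assumption).
inline int reflect101(int i, int n) {
    if (n == 1) return 0;
    while (i < 0 || i >= n) {
        if (i < 0) i = -i;
        if (i >= n) i = 2 * (n - 1) - i;
    }
    return i;
}

// Step 1: convolve with the separable 5x5 kernel built from [1 4 6 4 1] / 16.
Image blur5x5(const Image& src) {
    static const float w[5] = {1.f / 16, 4.f / 16, 6.f / 16, 4.f / 16, 1.f / 16};
    Image tmp(src.rows, src.cols), dst(src.rows, src.cols);
    for (int r = 0; r < src.rows; ++r)        // horizontal pass
        for (int c = 0; c < src.cols; ++c) {
            float s = 0.f;
            for (int t = -2; t <= 2; ++t)
                s += w[t + 2] * src.at(r, reflect101(c + t, src.cols));
            tmp.at(r, c) = s;
        }
    for (int r = 0; r < src.rows; ++r)        // vertical pass
        for (int c = 0; c < src.cols; ++c) {
            float s = 0.f;
            for (int t = -2; t <= 2; ++t)
                s += w[t + 2] * tmp.at(reflect101(r + t, src.rows), c);
            dst.at(r, c) = s;
        }
    return dst;
}

// Step 2: keep rows and columns 0, 2, 4, ... and drop the ones in between.
Image dropEveryOtherRowAndColumn(const Image& src) {
    Image dst((src.rows + 1) / 2, (src.cols + 1) / 2);
    for (int r = 0; r < dst.rows; ++r)
        for (int c = 0; c < dst.cols; ++c)
            dst.at(r, c) = src.at(2 * r, 2 * c);
    return dst;
}

// One pyramid step: blur, then subsample.
Image pyrDownSketch(const Image& src) {
    assert(src.rows > 0 && src.cols > 0);
    return dropEveryOtherRowAndColumn(blur5x5(src));
}
```

Blurring before dropping samples is the important part: without it, the high frequencies of the original image would alias into the smaller one.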

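The following program builds both a Gaussian and a Laplacian pyramid. Each Laplacian level is the difference between a Gaussian level and the upsampled version of the next, coarser level:

    #include "opencv2/imgproc/imgproc.hpp"
    #include "opencv2/highgui/highgui.hpp"
    #include <vector>

    using namespace cv;
    using namespace std;

    int main() {
        Mat image = imread("vangogh.jpg");
        if (image.empty()) return 1;  // the image could not be loaded

        int level = 5;
        vector<Mat> GaussianPyramid;
        vector<Mat> LaplacianPyramid;
        Mat temp1, temp2, temp3;
        Mat Lap;

        // Work in floating point so negative Laplacian values are not clipped to 0.
        image.convertTo(temp1, CV_32F);
        for (int i = 0; i < level; i++) {
            pyrDown(temp1, temp2);              // next (coarser) Gaussian level
            pyrUp(temp2, temp3, temp1.size());  // upsample it back to the current size
            Lap = temp1 - temp3;                // detail lost between the two levels
            GaussianPyramid.push_back(temp2);
            LaplacianPyramid.push_back(Lap);
            temp1 = temp2;
        }
        // show whatever you want.
        waitKey(0);
        return 0;
    }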

**WHAT CAN WE DO USING IMAGE PYRAMIDS?**

One classic application is blending two images, A and B, across a selected region R:

- Build Laplacian pyramids LA and LB from images A and B.
- Build a Gaussian pyramid GR from selected region R.
- Form a combined pyramid LS from LA and LB using nodes of GR as weights:

LS(i,j) = GR(i,j) * LA(i,j) + (1 - GR(i,j)) * LB(i,j)

- Collapse the LS pyramid to get the final blended image.
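To see the four steps working together, here is a minimal 1D sketch in plain C++. It is a toy under my own assumptions, not a reference implementation: signals stand in for images, the filter is [1 4 6 4 1] / 16 with mirrored borders, the coarsest Gaussian level is kept as the top of each Laplacian pyramid, and all names (`blend`, `collapse`, and so on) are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Signal = std::vector<double>;
using Pyramid = std::vector<Signal>;

// Mirror an index into [0, n) (border rule chosen for this sketch).
static long mirror(long i, long n) {
    if (n == 1) return 0;
    while (i < 0 || i >= n) {
        if (i < 0) i = -i;
        if (i >= n) i = 2 * (n - 1) - i;
    }
    return i;
}

// Blur with [1 4 6 4 1] / 16, scaled by `gain`.
static Signal blur(const Signal& s, double gain) {
    static const double w[5] = {1, 4, 6, 4, 1};
    const long n = static_cast<long>(s.size());
    Signal out(s.size(), 0.0);
    for (long i = 0; i < n; ++i)
        for (long t = -2; t <= 2; ++t)
            out[i] += gain * w[t + 2] / 16.0 * s[mirror(i + t, n)];
    return out;
}

// Reduce: blur, then keep every second sample.
static Signal reduce(const Signal& s) {
    Signal b = blur(s, 1.0), out((s.size() + 1) / 2);
    for (std::size_t k = 0; k < out.size(); ++k) out[k] = b[2 * k];
    return out;
}

// Expand to `size` samples: put zeros between samples, then blur with gain 2.
static Signal expand(const Signal& s, std::size_t size) {
    Signal up(size, 0.0);
    for (std::size_t k = 0; k < s.size() && 2 * k < size; ++k) up[2 * k] = s[k];
    return blur(up, 2.0);
}

Pyramid gaussianPyramid(const Signal& s, int levels) {
    Pyramid g{s};
    for (int i = 1; i < levels; ++i) g.push_back(reduce(g.back()));
    return g;
}

// Level i is G_i - expand(G_{i+1}); the coarsest Gaussian level is the top.
Pyramid laplacianPyramid(const Signal& s, int levels) {
    Pyramid g = gaussianPyramid(s, levels);
    Pyramid l(g.size());
    for (std::size_t i = 0; i + 1 < g.size(); ++i) {
        Signal e = expand(g[i + 1], g[i].size());
        l[i].resize(g[i].size());
        for (std::size_t j = 0; j < g[i].size(); ++j) l[i][j] = g[i][j] - e[j];
    }
    l.back() = g.back();
    return l;
}

// Collapse: start at the top, repeatedly expand and add the next finer level.
Signal collapse(const Pyramid& l) {
    Signal cur = l.back();
    for (std::size_t i = l.size() - 1; i-- > 0;) {
        Signal e = expand(cur, l[i].size());
        for (std::size_t j = 0; j < e.size(); ++j) e[j] += l[i][j];
        cur = e;
    }
    return cur;
}

// Blend a and b; `region` holds weights in [0, 1], where 1 selects a.
Signal blend(const Signal& a, const Signal& b, const Signal& region, int levels) {
    assert(!a.empty() && a.size() == b.size() && a.size() == region.size() && levels >= 1);
    Pyramid la = laplacianPyramid(a, levels), lb = laplacianPyramid(b, levels);
    Pyramid gr = gaussianPyramid(region, levels);
    Pyramid ls(la.size());
    for (std::size_t i = 0; i < la.size(); ++i) {
        ls[i].resize(la[i].size());
        for (std::size_t j = 0; j < la[i].size(); ++j)
            ls[i][j] = gr[i][j] * la[i][j] + (1.0 - gr[i][j]) * lb[i][j];
    }
    return collapse(ls);
}
```

A quick sanity check: with a region of all ones the result is A again, and with all zeros it is B. With a region that is 1 on one half and 0 on the other, the blurred weights in the coarser levels of GR make the low frequencies mix over a wide band while fine details switch over quickly, which is what hides the seam.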
