We are often interested in recovering the motion of objects from videos, and optical flow is one of the best-known methods for doing this. Optical flow has many applications, such as object tracking, camera stabilization, image mosaics, and so on.
All optical flow methods are based on the following assumptions:
- Color constancy (brightness constancy for single-channel images);
- Small motion.
With these assumptions, if we have two images (say, two adjacent frames of a video), what we need to do is simply find pixel correspondences between them. Because of color constancy, we don’t need to worry about a pixel’s RGB values changing between the two images, and because of small motion, we can search for the corresponding point of a pixel within a small neighborhood.
Continue reading “Coarse-to-fine Optical Flow (Lucas & Kanade)” »
This post is about a new clustering algorithm published by Alex Rodriguez and Alessandro Laio in the latest issue of Science. The method is short and efficient; I implemented it in only about 100 lines of C++ code.
There are two key quantities in this method: each point’s local density, and its minimum distance to any point of higher density.
Rho_i is the local density of point i, defined with a cutoff kernel as rho_i = sum_j chi(d_ij − d_c), where chi(x) = 1 if x < 0 and 0 otherwise; in other words, it is the number of points closer to i than the cutoff distance d_c.
Continue reading “Clustering by fast search and find of density peaks” »
I chose “Dropped-out Auto-encoder” as my final project topic in last semester’s deep learning course. The idea was simply to drop out units in a regular sparse auto-encoder, and furthermore, in a stacked sparse auto-encoder, in both the visible and hidden layers. It does not work well on auto-encoders, except that it can be used in the fine-tuning process of a stacked sparse auto-encoder. I think the right thing to do is to use a denoising auto-encoder instead.
The denoising auto-encoder was proposed by Pascal Vincent et al. The basic idea is to force the hidden layer to discover more robust features, and to prevent it from simply learning the identity, by training the auto-encoder to reconstruct the input from a corrupted version of it.
Continue reading “Denoising Autoencoder” »
The CIFAR-10 dataset can be found HERE.
It is a very popular multi-channel image dataset for classifier training; as a simplified version of CIFAR-100, it is easier for newcomers to use.
Here’s C++ code for reading this dataset from its .bin files into OpenCV matrices. Continue reading “C++ Code For Reading CIFAR-10 Dataset” »
Since the last CNN post, I have been working on a new version of my CNN that supports multiple convolution and pooling layers, and I’d like to share some of the experience here.
VECTOR VS HASH TABLE
As you can see in the last post, I used a vector of Mat in the convolution steps. That works well when we only have one convolution layer, which means that for each input image, we get 1 * KernelAmount images after the conv and pooling layers (the pooling operation doesn’t change the number of images). To easily retrieve these “conved images”, I generated one vector of Mat for each input image. Continue reading “Convolutional Neural Networks II” »
WHAT IS CNN
A Convolutional Neural Network (CNN) consists of one or more convolutional layers and pooling layers, followed by one or more fully connected layers as in a standard neural network. The architecture of a CNN is designed to take advantage of the 2D structure of an input image (or other 2D input such as a speech signal). This is achieved with local connections and tied weights, followed by some form of pooling, which yields translation-invariant features. Another benefit of CNNs is that they are easier to train and have many fewer parameters than fully connected networks with the same number of hidden units.
It was invented by Prof. Yann LeCun (NYU), and the impact of CNNs on the world has been profound; every big IT company is trying to do something with them. Continue reading “Convolutional Neural Networks” »
This is an early version of my CNN. At that time, I incorrectly thought I could just use randomly chosen Gabor filters for the convolution, so I wrote this. Actually, the test results are not bad on simple datasets such as MNIST. I think of it as a fake CNN but a decent deep network: it convolves the images with randomly chosen Gabor filters, pools them, and then trains a regular deep network on the result. The convolution and pooling parts can be seen as a kind of pre-processing. Continue reading “A Fake Convolutional Neural Network” »
During this spring break, I worked on building a simple deep network, which has two parts: a sparse autoencoder and softmax regression. The method is exactly the same as the “Building Deep Networks for Classification” part of the UFLDL tutorial. To understand it better, I re-implemented it using C++ and OpenCV.
- Read the dataset (both training and testing data) into cv::Mat.
- Pre-process the data (size normalization, random ordering, zero mean, etc.); this speeds up training.
- Implement a function that calculates the sparse autoencoder cost and gradients.
- Implement a function that calculates the softmax regression cost and gradients.
- Implement a function that calculates the whole network’s cost and gradients.
- Use gradient checking to verify that the functions above work correctly.
- Train the sparse autoencoders layer by layer. (For example, say we want 3 sparse autoencoder layers. First we train the 1st layer using the training data as both input and target; we then compute the hidden-layer activations with the trained weights and biases. The activations of the 1st autoencoder serve as both input and target for the 2nd autoencoder, and likewise the activations of the 2nd layer serve as both input and target for the 3rd. That is why I say this part is trained layer by layer.)
- Train the softmax regression. Here the input is the last autoencoder layer’s activations, and the target is the Y part of the training dataset.
- Fine-tune the whole network using back-propagation.
- Now that we have the trained network, we can test it.
Continue reading “A Simple Deep Network” »
I’m working through Prof. Andrew Ng’s Unsupervised Feature Learning and Deep Learning tutorial. This is the 8th exercise, a simple ConvNet with a pooling step. I won’t go through the details of the material; more about this exercise can be found HERE.
I’ll try to implement it using C++ and OpenCV if I have time next week. Continue reading “[UFLDL Exercise] Convolution and Pooling” »