Restricted Boltzmann Machine

WHAT IS AN RBM

A Restricted Boltzmann Machine (RBM) is a special case of the Boltzmann Machine in which all visible-visible and hidden-hidden connections are removed. As a result, each hidden unit connects to every visible unit, each visible unit connects to every hidden unit, and there are no connections within a layer. The following figure shows the RBM model.

[Figure: the RBM model]

An RBM is an unsupervised learning method. It is often used to pre-train large deep networks, and it is also the building block of the DBN (Deep Belief Network).

There are tons of tutorials and blogs online that introduce the underlying math of RBMs (energy-based models and so on), so I’m not going to explain it again. Instead, I’m going to talk about some implementation details.

CONTRASTIVE DIVERGENCE

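To train an RBM, we’d like to minimize the average negative log-likelihood over the training examples $x^{(1)}, \dots, x^{(N)}$:

$$\frac{1}{N} \sum_{n=1}^{N} -\log p\left(x^{(n)}\right)$$

When doing that with stochastic gradient descent, the gradient for a single training example $x^{(n)}$ with respect to the parameters $\theta$ splits into a positive phase and a negative phase, where $E(x, h)$ is the energy function:

$$\frac{\partial \left(-\log p(x^{(n)})\right)}{\partial \theta} = \underbrace{\mathbb{E}_{h}\left[\frac{\partial E(x^{(n)}, h)}{\partial \theta} \,\middle|\, x^{(n)}\right]}_{\text{positive phase}} - \underbrace{\mathbb{E}_{x,h}\left[\frac{\partial E(x, h)}{\partial \theta}\right]}_{\text{negative phase}}$$

The negative phase is hard to compute, so we use a method called Contrastive Divergence (CD) to approximate it.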

The idea of the CD method is to replace the expectation with a point estimate at v(t), which we obtain by Gibbs sampling; v(t) is the sample at which the chain converges.

Here’s how this Gibbs sampling works (say W is the weight matrix between the hidden layer and the visible layer, B is the bias vector of the hidden layer, and C is the bias vector of the visible layer):

  • Convert the training data x into binary form by building a matrix v(0): each element of v(0) is randomly set to 1 (versus 0) with probability given by the corresponding element of x (for this to work, x must first be normalized to the range [0, 1]).
  • Positive phase: calculate a positive-probability matrix, Sigmoid(W’ * v(0) + B), then normalize this matrix.
  • Build the hidden layer matrix h(0): each element of h(0) is randomly set to 1 (versus 0) with the probability given by the positive-probability matrix from the previous step.
  • Negative phase: calculate a negative-probability matrix, Sigmoid(W * h(0) + C), then normalize this matrix.
  • Build the new visible layer matrix v(1): each element of v(1) is randomly set to 1 (versus 0) with the probability given by the negative-probability matrix we just got.
  • Repeat the above steps, alternating the positive phase and the negative phase, until the chain converges.
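
The steps above can be sketched in plain C++ as follows. This is a minimal illustration, not the code from the repository: it uses std::vector instead of Armadillo, handles one sample at a time, and assumes W is stored as W[i][j] (visible unit i, hidden unit j), so that the hidden probabilities are Sigmoid(W’ * v + B). The names Vec, Mat, sample_bernoulli, hidden_prob, visible_prob and gibbs_step are made up for this sketch, and the extra normalization step is left out on the assumption that sigmoid outputs already lie in (0, 1).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // assumed layout: W[i][j], visible unit i, hidden unit j

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Draw a binary vector: element i is 1 with probability p[i], otherwise 0.
// Applied to normalized training data x, this gives v(0).
Vec sample_bernoulli(const Vec& p, std::mt19937& rng) {
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    Vec s(p.size());
    for (std::size_t i = 0; i < p.size(); ++i) {
        s[i] = unif(rng) < p[i] ? 1.0 : 0.0;
    }
    return s;
}

// Positive phase: element j is Sigmoid(W' * v + B)_j.
Vec hidden_prob(const Mat& W, const Vec& B, const Vec& v) {
    assert(W.size() == v.size());
    Vec p(B.size());
    for (std::size_t j = 0; j < B.size(); ++j) {
        double z = B[j];
        for (std::size_t i = 0; i < v.size(); ++i) z += W[i][j] * v[i];
        p[j] = sigmoid(z);
    }
    return p;
}

// Negative phase: element i is Sigmoid(W * h + C)_i.
Vec visible_prob(const Mat& W, const Vec& C, const Vec& h) {
    assert(W.size() == C.size());
    Vec p(C.size());
    for (std::size_t i = 0; i < C.size(); ++i) {
        double z = C[i];
        for (std::size_t j = 0; j < h.size(); ++j) z += W[i][j] * h[j];
        p[i] = sigmoid(z);
    }
    return p;
}

// One full Gibbs step: v(t) -> h(t) -> v(t+1).
Vec gibbs_step(const Mat& W, const Vec& B, const Vec& C, const Vec& v,
               std::mt19937& rng) {
    const Vec h = sample_bernoulli(hidden_prob(W, B, v), rng);
    return sample_bernoulli(visible_prob(W, C, h), rng);
}
```

Calling gibbs_step repeatedly on its own output walks the chain v(0), v(1), v(2), and so on. Passing the random generator in explicitly keeps runs reproducible when it is seeded.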

Say the chain converges at v(t); then sum((v(0) - v(t)) ^ 2) is the reconstruction error.
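
As a small illustration, the error can be computed like this for a single sample. The function name reconstruction_error is made up for this sketch, and std::vector stands in for an Armadillo matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of squared differences between v(0) and v(t): sum((v(0) - v(t)) ^ 2).
double reconstruction_error(const std::vector<double>& v0,
                            const std::vector<double>& vt) {
    assert(v0.size() == vt.size());
    double err = 0.0;
    for (std::size_t i = 0; i < v0.size(); ++i) {
        const double d = v0[i] - vt[i];
        err += d * d;
    }
    return err;
}
```

For binary vectors this is simply the number of units that differ between v(0) and v(t).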

If we manually fix the number of Gibbs sampling iterations at k, the method is called CD-k. In general, the bigger k is, the less biased the estimate of the gradient will be. In practice, however, the CD method works pretty well even with k = 1, especially when we train the RBM for pre-training.
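
In code, CD-k amounts to running a fixed number of Gibbs steps from v(0). The sketch below assumes some callable gibbs_step that maps v(t) to v(t+1); both that parameter and the name cd_k_sample are hypothetical and not taken from the repository.

```cpp
#include <cassert>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Run k Gibbs steps starting from v(0) and return v(k).
// gibbs_step is a hypothetical callable mapping v(t) to v(t+1).
Vec cd_k_sample(Vec v, int k,
                const std::function<Vec(const Vec&)>& gibbs_step) {
    assert(k >= 1);
    for (int step = 0; step < k; ++step) {
        v = gibbs_step(v);
    }
    return v;
}
```

With k = 1 this gives the CD-1 sample, and with k = 5 the CD-5 sample.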

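After we get this v(t), written x-tilde ($\tilde{x}$) in the following formulae, we can update the parameters W, B, and C. With learning rate $\alpha$ and $h(v) = \mathrm{Sigmoid}(W^{\top} v + B)$:

$$W \leftarrow W + \alpha \left( x \, h(x)^{\top} - \tilde{x} \, h(\tilde{x})^{\top} \right)$$
$$B \leftarrow B + \alpha \left( h(x) - h(\tilde{x}) \right)$$
$$C \leftarrow C + \alpha \left( x - \tilde{x} \right)$$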
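
A minimal sketch of this update for one training vector, assuming the standard CD rule: W grows by alpha * (x h(x)’ - x-tilde h(x-tilde)’), B by alpha * (h(x) - h(x-tilde)), and C by alpha * (x - x-tilde), where h(v) = Sigmoid(W’ * v + B). The learning rate alpha, the W[i][j] layout (visible unit i, hidden unit j) and the names hidden_prob and cd_update are assumptions of this sketch, and std::vector is used instead of Armadillo.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // assumed layout: W[i][j], visible unit i, hidden unit j

// h(v) = Sigmoid(W' * v + B)
Vec hidden_prob(const Mat& W, const Vec& B, const Vec& v) {
    assert(W.size() == v.size());
    Vec p(B.size());
    for (std::size_t j = 0; j < B.size(); ++j) {
        double z = B[j];
        for (std::size_t i = 0; i < v.size(); ++i) z += W[i][j] * v[i];
        p[j] = 1.0 / (1.0 + std::exp(-z));
    }
    return p;
}

// One CD update from a training vector x and its negative sample x_tilde
// (the v(t) obtained by Gibbs sampling). alpha is an assumed learning rate.
void cd_update(Mat& W, Vec& B, Vec& C, const Vec& x, const Vec& x_tilde,
               double alpha) {
    assert(x.size() == x_tilde.size());
    assert(x.size() == C.size() && W.size() == C.size());
    // Both hidden activations are computed before any parameter changes.
    const Vec hx = hidden_prob(W, B, x);
    const Vec ht = hidden_prob(W, B, x_tilde);
    for (std::size_t i = 0; i < C.size(); ++i) {
        for (std::size_t j = 0; j < B.size(); ++j) {
            W[i][j] += alpha * (x[i] * hx[j] - x_tilde[i] * ht[j]);
        }
        C[i] += alpha * (x[i] - x_tilde[i]);
    }
    for (std::size_t j = 0; j < B.size(); ++j) {
        B[j] += alpha * (hx[j] - ht[j]);
    }
}
```

When x_tilde equals x, every difference is zero and the parameters stay unchanged, which matches the intuition that a perfect reconstruction leaves nothing to correct.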

The whole CD method is nothing more than alternating Gibbs sampling and parameter updates until a stopping criterion is met.

CODE

I implemented a single-layer RBM using C++ and Armadillo.

https://github.com/xingdi-eric-yuan/single-layer-rbm 

TEST RESULT

(All results are shown as GIF images, so please be patient…)

I used the MNIST dataset for testing.

First I tried to train 50 different features for only digit 7.

[Image: 7_100]

Then I checked whether I could see a difference between weights trained with CD-1 and weights trained with CD-5 (100 hidden units). Here are the results:

CD-1 (left), CD-5 (right):

[Images: cd1_100, cd5_100]

I also trained a net with 500 hidden units; here’s the result (this image is really big, about 20 MB).

[Image: test1]

Enjoy it.

References

Hugo Larochelle’s Neural networks class, Université de Sherbrooke, on YouTube

6 Comments

  1. Daniel
    Posted February 6, 2015 at 10:16 am | Permalink

    Are there OpenCV alternatives to the armadillo code?

    acu? cv::accumulate?

    Rand? cv::Mat?

    • Eric
      Posted February 6, 2015 at 8:39 pm | Permalink

      Hi Daniel,

      arma::accu accumulates (sums) all elements of a matrix/vector, so you can use cv::sum to do the same thing. However, for a 1-channel image you should use cv::sum like:
      _sum = cv::sum(M)[0], since cv::sum returns a separate sum for each of up to 4 channels.
      For randu/randn, the OpenCV API works pretty much like Armadillo’s.
      For more details, you may want to check the OpenCV docs: http://docs.opencv.org/modules/core/doc/operations_on_arrays.html

  2. Posted March 21, 2015 at 3:38 pm | Permalink

    segmentation fault

    mat
    concatenateMat(vector<mat> &vec){
    int height = vec[0].n_rows; // segmentation fault here.

    i compiled with:

    g++ pro.cpp -o pro -O2 -larmadillo -std=c++11

    • Eric
      Posted March 23, 2015 at 5:05 am | Permalink

      Hi Keghn,

      I just tried compiling and running the code from my GitHub (using cmake); there’s no such problem and it runs perfectly. I think the problem may be due to your Armadillo version, or maybe C++11? Could you try compiling it with an earlier version of C++? Thanks!

    • Miler
      Posted May 14, 2015 at 7:13 pm | Permalink

      I have the same problem.

      Could you fix it?

  3. Posted March 24, 2015 at 5:10 pm | Permalink

    $ cmake -version
    cmake version 2.8.12.2

    $ cd single-layer-rbm-master
    $ cmake .
    $ make

    -- The C compiler identification is GNU 4.8.2
    -- The CXX compiler identification is GNU 4.8.2
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master

    $ make
    [100%] Building CXX object CMakeFiles/RBM.dir/RBM.cc.o
    In file included from /usr/include/c++/4.8/random:35:0,
    from /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc:14:
    /usr/include/c++/4.8/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support is currently experimental, and must be enabled with the -std=c++11 or -std=gnu++11 compiler options.
    #error This file requires compiler and library support for the \
    ^
    /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc: In function ‘void read_Mnist(std::string, std::vector<arma::Mat >&)’:
    /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc:50:41: error: no matching function for call to ‘std::basic_ifstream::basic_ifstream(std::string&, const openmode&)’
    ifstream file (filename, ios::binary);
    ^
    /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc:50:41: note: candidates are:
    In file included from /usr/include/armadillo:21:0,
    from /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc:10:
    /usr/include/c++/4.8/fstream:467:7: note: std::basic_ifstream::basic_ifstream(const char*, std::ios_base::openmode) [with _CharT = char; _Traits = std::char_traits; std::ios_base::openmode = std::_Ios_Openmode]
    basic_ifstream(const char* __s, ios_base::openmode __mode = ios_base::in)
    ^
    /usr/include/c++/4.8/fstream:467:7: note: no known conversion for argument 1 from ‘std::string {aka std::basic_string}’ to ‘const char*’
    /usr/include/c++/4.8/fstream:453:7: note: std::basic_ifstream::basic_ifstream() [with _CharT = char; _Traits = std::char_traits]
    basic_ifstream() : __istream_type(), _M_filebuf()
    ^
    /usr/include/c++/4.8/fstream:453:7: note: candidate expects 0 arguments, 2 provided
    /usr/include/c++/4.8/fstream:427:11: note: std::basic_ifstream::basic_ifstream(const std::basic_ifstream&)
    class basic_ifstream : public basic_istream
    ^
    /usr/include/c++/4.8/fstream:427:11: note: candidate expects 1 argument, 2 provided
    /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc: In function ‘void save2txt(arma::mat&, std::string, int)’:
    /home/keghn/armadillo-4.650.4/0p/single-layer-rbm-master/RBM.cc:109:16: error: ‘to_string’ is not a member of ‘std’
    string s = std::to_string(step);
    ^
    make[2]: *** [CMakeFiles/RBM.dir/RBM.cc.o] Error 1
    make[1]: *** [CMakeFiles/RBM.dir/all] Error 2
    make: *** [all] Error 2

    I do not know if I am using CMake right; everyone I know avoids it like the plague.
