• Nick

    Hi
    Thanks for this post.
    Sorry, but I used your code as-is and it compiles fine with OpenCV. But when I run it, I get the dreaded ‘Segmentation fault’.

    • Eric

      Hi Nick,
      When does the segfault happen? I just tried running it and no fault occurred. Can you pinpoint which function it happens in?

      • Dov Bai

        The code can segfault on lines like this:

        int randomNum = ((long)rand() + (long)rand()) % (data.cols - batch);

        because the sum of the two casts can overflow a long (when long is 32 bits) and produce a negative randomNum, which is not legal in Rect.

        Changing it to rand() % (data.cols - batch) works OK.
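
        A sketch of a C++11 alternative (not from the post’s code) that avoids the overflow entirely:

        #include <random>

        // Pick a random starting column in [0, cols - batch - 1] without the
        // signed-overflow risk of summing two rand() calls (C++11 sketch).
        int randomStart(int cols, int batch) {
            static std::mt19937 rng(std::random_device{}());
            std::uniform_int_distribution<int> dist(0, cols - batch - 1);
            return dist(rng);
        }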

        • Eric

          Thank you Dov 🙂

  • Adil

    Hi,

    I tried to run your code, but unfortunately I don’t know how to do that using regular C++ or Visual C++. Could you please help me with that… I would really appreciate it.

    • Eric

      Hi Adil,

      I’m not sure what you mean by “using regular C++ or Visual C++”. Are you having trouble compiling this code?

  • Yoann

    Hi,

    Thanks for this very useful code. I have some questions about its capabilities.
    First, I understand that the trainY table is composed of 3 different values (0, 1, 2). Is it possible to train the model using real values from -1 to 1?
    Second, I would like to know whether it is possible, using your code, to combine different autoencoders as in this picture: http://1drv.ms/1oj7vAi. I don’t know whether such a combination has any benefit over the more common approach where all the features are combined in the same autoencoder, but it would make sense in my work to do so.

    I plan to test your code soon, thanks again.

    • Eric

      Hi Yoann,

      First, the Y value in this code can be 0-9, and I don’t think it will work with real values as training labels.
      Second, the picture you showed seems to combine two different networks. That’s a good idea; however, I think a better approach is something like a denoising autoencoder, which is a simple and efficient way to combine different networks (not only two, but thousands of them).

      🙂
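
      A minimal sketch of the denoising idea, assuming samples are stored one per column in a CV_64FC1 matrix (names and the corruption level are illustrative, not from the post’s code):

      #include <opencv2/opencv.hpp>

      // Denoising-autoencoder trick: corrupt the input with masking noise
      // before encoding, but train the network to reconstruct the clean input.
      cv::Mat corrupt(const cv::Mat &x, double dropFraction) {
          cv::Mat noise(x.size(), x.type());
          cv::randu(noise, 0.0, 1.0);                     // uniform values in [0, 1)
          cv::Mat keep = noise > dropFraction;            // 255 where a pixel survives
          cv::Mat keepD;
          keep.convertTo(keepD, x.type(), 1.0 / 255.0);   // 0/1 mask in the input's type
          return x.mul(keepD);                            // zero out ~dropFraction of inputs
      }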

      • Yoann

        Thanks for your answer!

        I will definitely implement this denoising autoencoder when I find code as simple and efficient as yours that works with real values. 🙂

  • Yoann

    Hi Eric,

    I haven’t found code on the web that is as simple to use as yours.
    What I plan to do is use your code but replace the softmax regression with a logistic regression so it can output continuous values. In this post: http://eric-yuan.me/logistic-regression/ you say that you implemented logistic regression but that there are some mistakes in that code. Do you think the logistic regression code you wrote could work with the code on this page? If it could, I can help debug it if you are interested.

    • Eric

      Hey Yoann,

      About the buggy code I mentioned in that post: I have actually fixed it, and it is exactly the code in http://eric-yuan.me/bpnn/ . For the second question, I don’t think logistic regression will work, because it is really a classifier; it applies a non-linearity inside (softmax regression uses the same idea). I’m not sure, but my suggestion is to search online for neural network + LINEAR regression. Good luck!
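
      A rough sketch of what a linear (regression) output layer could look like on top of the stacked encoders, using OpenCV matrices; the names hidden, Wout, bout and target are illustrative, not from the post’s code:

      #include <opencv2/opencv.hpp>

      // Linear output layer: yhat = Wout * hidden + bout, with no softmax/sigmoid,
      // so the network can emit continuous values (e.g. in [-1, 1]).
      cv::Mat linearForward(const cv::Mat &hidden, const cv::Mat &Wout, const cv::Mat &bout) {
          return Wout * hidden + cv::repeat(bout, 1, hidden.cols);
      }

      // Mean squared error over a batch of column-vector samples.
      double mseCost(const cv::Mat &yhat, const cv::Mat &target) {
          cv::Mat diff = yhat - target;
          return 0.5 * cv::sum(diff.mul(diff))[0] / yhat.cols;
      }
      // The output-layer gradient is then (yhat - target) * hidden.t() / n.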

  • Tagore

    Hi,

    Thank you for sharing. I am puzzled by one question. I may be wrong, but I see that you train the two hidden layers in one for-loop. How do you train the second hidden layer using the activations of the first hidden layer?
    I am a beginner and I appreciate your code very much; looking forward to your reply!
    Thank you again!

    • Eric

      Hi Tagore,

      Sorry, I didn’t quite catch what you mean. Do you mean the fine-tuning part of training?

      • Tagore

        Thank you for your reply. I had misunderstood your sparse autoencoder training part, and now I get it!
        Thank you so much!

      • Tagore

        Hi Eric,
        Now one question puzzles me. Each autoencoder has parameters W2 and b2, but we never use them, so what are they for?
        I haven’t really grasped the network yet, and I would really appreciate your help!
        Thank you!

        • Eric

          Hey Tagore,

          In an autoencoder, what we mainly care about is the ‘encoder’ part; W2 and b2 are what we call the decoder. If you look at sparse coding, you will see how the decoder is used 🙂
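
          As a rough illustration (matrix names follow the usual sparse-autoencoder convention, not the post’s exact code), the decoder only appears when computing the reconstruction used during unsupervised pre-training:

          #include <opencv2/opencv.hpp>

          cv::Mat sigmoid(const cv::Mat &z) {
              cv::Mat e;
              cv::exp(-z, e);
              return 1.0 / (1.0 + e);
          }

          // Encoder: the part that is kept when the layers are stacked.
          cv::Mat encode(const cv::Mat &x, const cv::Mat &W1, const cv::Mat &b1) {
              return sigmoid(W1 * x + cv::repeat(b1, 1, x.cols));
          }

          // Decoder: W2/b2 are only needed to compute the reconstruction error
          // during pre-training and are discarded afterwards.
          cv::Mat decode(const cv::Mat &h, const cv::Mat &W2, const cv::Mat &b2) {
              return sigmoid(W2 * h + cv::repeat(b2, 1, h.cols));
          }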

  • cheong

    How can I read images and labels from my own database in your program? Thanks.

    • Eric

      Hey,

      For images, just use OpenCV’s imread() function; for other formats (.txt, etc.), use regular C++ file I/O.
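
      For example, a minimal sketch (file name and matrix names are placeholders) that reads one grayscale image and appends it as a column of a CV_64FC1 data matrix, assuming a one-sample-per-column layout:

      #include <opencv2/opencv.hpp>

      int main() {
          cv::Mat data;                                      // grows by one column per image
          cv::Mat img = cv::imread("sample.png", CV_LOAD_IMAGE_GRAYSCALE);
          if (img.empty()) return 1;

          cv::Mat col;
          img.convertTo(col, CV_64FC1, 1.0 / 255.0);         // scale pixels to [0, 1]
          col = col.reshape(1, img.rows * img.cols);         // flatten to a single column

          if (data.empty()) data = col;
          else cv::hconcat(data, col, data);                 // one sample per column
          return 0;
      }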

  • Daniel

    Hi Eric, nice post. I tried to run this on my Ubuntu machine; this is my CMakeLists.txt:
    cmake_minimum_required(VERSION 2.8)
    project( MnistClassify )
    find_package( OpenCV REQUIRED )
    add_executable( MnistClassify MnistClassify.cpp )
    target_link_libraries( MnistClassify ${OpenCV_LIBS} )

    but it gives me some errors:
    /home/daniel/intelCode/MnistClassify.cpp: In function ‘void read_Mnist(std::string, std::vector&)’:
    /home/daniel/intelCode/MnistClassify.cpp:94:41: error: no matching function for call to ‘std::basic_ifstream::basic_ifstream(std::string&, const openmode&)’
    ifstream file (filename, ios::binary);
    ^
    /home/daniel/intelCode/MnistClassify.cpp:94:41: note: candidates are:
    In file included from /home/daniel/intelCode/MnistClassify.cpp:20:0:
    /usr/include/c++/4.8/fstream:467:7: note: std::basic_ifstream::basic_ifstream(const char*, std::ios_base::openmode) [with _CharT = char; _Traits = std::char_traits; std::ios_base::openmode = std::_Ios_Openmode]
    basic_ifstream(const char* __s, ios_base::openmode __mode = ios_base::in)
    ^
    /usr/include/c++/4.8/fstream:467:7: note: no known conversion for argument 1 from ‘std::string {aka std::basic_string}’ to ‘const char*’
    /usr/include/c++/4.8/fstream:453:7: note: std::basic_ifstream::basic_ifstream() [with _CharT = char; _Traits = std::char_traits]
    basic_ifstream() : __istream_type(), _M_filebuf()
    ^
    /usr/include/c++/4.8/fstream:453:7: note: candidate expects 0 arguments, 2 provided
    /usr/include/c++/4.8/fstream:427:11: note: std::basic_ifstream::basic_ifstream(const std::basic_ifstream&)
    class basic_ifstream : public basic_istream
    ^
    /usr/include/c++/4.8/fstream:427:11: note: candidate expects 1 argument, 2 provided
    /home/daniel/intelCode/MnistClassify.cpp: In function ‘void read_Mnist_Label(std::string, cv::Mat&)’:
    /home/daniel/intelCode/MnistClassify.cpp:125:41: error: no matching function for call to ‘std::basic_ifstream::basic_ifstream(std::string&, const openmode&)’
    ifstream file (filename, ios::binary);
    ^
    /home/daniel/intelCode/MnistClassify.cpp:125:41: note: candidates are:
    In file included from /home/daniel/intelCode/MnistClassify.cpp:20:0:
    /usr/include/c++/4.8/fstream:467:7: note: std::basic_ifstream::basic_ifstream(const char*, std::ios_base::openmode) [with _CharT = char; _Traits = std::char_traits; std::ios_base::openmode = std::_Ios_Openmode]
    basic_ifstream(const char* __s, ios_base::openmode __mode = ios_base::in)
    ^
    /usr/include/c++/4.8/fstream:467:7: note: no known conversion for argument 1 from ‘std::string {aka std::basic_string}’ to ‘const char*’
    /usr/include/c++/4.8/fstream:453:7: note: std::basic_ifstream::basic_ifstream() [with _CharT = char; _Traits = std::char_traits]
    basic_ifstream() : __istream_type(), _M_filebuf()
    ^
    /usr/include/c++/4.8/fstream:453:7: note: candidate expects 0 arguments, 2 provided
    /usr/include/c++/4.8/fstream:427:11: note: std::basic_ifstream::basic_ifstream(const std::basic_ifstream&)
    class basic_ifstream : public basic_istream
    ^
    /usr/include/c++/4.8/fstream:427:11: note: candidate expects 1 argument, 2 provided
    make[2]: *** [CMakeFiles/MnistClassify.dir/MnistClassify.cpp.o] Error 1
    make[1]: *** [CMakeFiles/MnistClassify.dir/all] Error 2
    make: *** [all] Error 2

    • Eric

      It seems that something is wrong with the ifstream part. Maybe on Ubuntu, ifstream::open won’t accept a std::string as the filename, so try changing “ifstream file (filename, ios::binary);” to “ifstream file (filename.c_str(), ios::binary);”. Thanks.
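
      For what it’s worth, this is a pre-C++11 limitation: the std::string overload of the std::ifstream constructor was only added in C++11, so either of these should work (a sketch, not from the post):

      #include <fstream>
      #include <string>

      void openExample(const std::string &filename) {
          // Works everywhere: pass a const char* to the pre-C++11 constructor.
          std::ifstream file(filename.c_str(), std::ios::binary);

          // Alternatively, compile with -std=c++11 and the original line
          // std::ifstream file(filename, std::ios::binary);  is accepted as-is.
      }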

  • Kankan Dai

    Thanks for the code! But I’m confused about line 138: mat.ATD(0, i) = (double)temp; In my OpenCV there is no ATD() method in class Mat?

    • Kankan Dai

      Oh, it is a macro.
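
      For anyone else who hits this, it is presumably defined near the top of the post’s source as a shorthand for Mat::at<double>, along these lines (an assumption; check the source):

      #define ATD at<double>
      // so that  mat.ATD(0, i)  expands to  mat.at<double>(0, i)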

      • Eric

        🙂

  • John W Jones

    Hi, I’ve been trying to debug the code above for use (with attribution) in my master’s degree final project. I have it compiled with MSVC++ 2013 Pro and OpenCV 2.4.9.

    I get an MSVCP120D.dll debug assertion failure on a vector: “vector subscript out of range”.

    Have you any idea what may have caused this and, hopefully, how to correct it? Thanks in advance, jj.

  • John W Jones

    On line 577 above, vec.resize(number_of_images, cv::Mat(28, 28, CV_8UC1)); is commented out. That causes the error I wrote about above. Un-comment it and it runs fine…

  • Miler

    Hi Eric, your code is great, but can you explain these lines to me:
    sa.W1 = sa.W1 * (2 * epsilon) - epsilon;
    sa.W2 = sa.W2 * (2 * epsilon) - epsilon;
    sa.b1 = sa.b1 * (2 * epsilon) - epsilon;
    sa.b2 = sa.b2 * (2 * epsilon) - epsilon;

    • Miler

      Forget it, this is the usual parameter initialization in ML. Thank you.
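
      Right: assuming the matrices are first filled with cv::randu values in [0, 1), which is the usual pattern, the scaling remaps them to the small symmetric range [-epsilon, epsilon) used to break symmetry at start-up. A tiny sketch (sizes and epsilon are illustrative):

      #include <opencv2/opencv.hpp>

      int main() {
          double epsilon = 0.12;                 // illustrative value
          cv::Mat W1(25, 64, CV_64FC1);
          cv::randu(W1, 0.0, 1.0);               // uniform in [0, 1)
          W1 = W1 * (2 * epsilon) - epsilon;     // now uniform in [-epsilon, epsilon)
          return 0;
      }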

      • Miler

        I found a mistake:

        trainSparseAutoencoder(tmpsa, tempX, 600, 3e-3, 0.1, 3, 2e-2, 100, tamanoBatch);

        The number 600 is wrong:

        in the second autoencoder you don’t do any dimensionality reduction.

        Number of units (2 autoencoders + 1 softmax):
        784 -> 600 -> 600 -> 10

  • ngapweitham

    I found an equation that differs from the UFLDL tutorial (the weight-decay term).

    In the function “sparseAutoencoderCost”:

    // the second part is weight decay part
    double err2 = sum(sa.W1)[0] + sum(sa.W2)[0];
    err2 *= (lambda / 2.0);

    But according to the equation, it should be:

    cv::Mat W1sq, W2sq;
    cv::pow(sa.W1, 2.0, W1sq);   // square into temporaries so the weights
    cv::pow(sa.W2, 2.0, W2sq);   // themselves are not overwritten
    double err2 = sum(W1sq)[0] + sum(W2sq)[0];
    err2 *= (lambda / 2.0);

    The weights in that term should be squared. Is this a bug in the code or my misunderstanding? Thank you.

    double err2 = (cv::sum(es.w1_)[0] + cv::sum(es.w2_)[0]) * (params_.lambda_ / 2.0);
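
    For reference, the UFLDL weight-decay term is (lambda / 2) * Σ_ij (W_ij)^2, summed over both weight matrices, so the weights should indeed be squared before they are summed; the snippets above that sum the raw weights omit that square.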

  • M. Plav

    Hi,
    I am a newbie but found this very useful. Can you please tell me how to write it in Matlab? Moreover, I want to train stacked autoencoders on one type of images, say neon faces, and predict normal faces; how can I do that? I am thinking of using sparse denoising autoencoders with stochastic gradient descent, and maybe KNN as a classifier or MSE for ranking the matched outputs. Does that make sense? Thanks.

  • Việt Anh

    One small thing:

    You need to #include <time.h> to use CLOCKS_PER_SEC and clock().

    P.S.: Thank you very much!

    • Việt Anh

      #include <time.h>

  • Shelby

    Hi Eric,

    I built a 3-layer stacked RBM and trained it completely; now how can I test it?
    I used Matlab for this.

    Thank you in advance.

  • sakazi

    Hello Eric,
    could you provide this code using Matlab?

    thanks