
  • Min

    good job!

  • Alex

    nice work!!

  • Alex

    hi, if the input is color-image, how to design the filters and the whole net?

  • Phuc

    I have an error on
    line 1147: start = clock();
    line 1188: end = clock();
    line 1189: cout << "Totally used time: " << ((double)(end - start)) / CLOCKS_PER_SEC << " second" << endl;
    How can I fix this? Can you help me?

    • Eric

      Hey Phuc,
      You can delete the time counter, or try to include "time.h". 🙂

  • Yao

    Hi, I'm wondering how long it took to train on MNIST?

    • Eric

      Hey Yao,

      It depends on how many layers you choose. I tried 3 conv layers with pooling and 2 fully connected layers, and it took hours…

    • Eric

      Besides, it also depends on how many units per layer, you know…

  • teddy

    Hello,
    Do you have any plan to convert this code to Android?
    I think it would be very useful on mobile too. Could you please convert it to Android?

    • Eric

      Sorry, actually I don't know much about converting code to Android. However, first, I'm not sure whether an Android device is powerful enough to run deep neural networks; second, there is an OpenCV for Android SDK; you can check http://opencv.org/platforms/android.html if you want.

  • teddy

    Hello.
    Thanks for your reply.
    I've converted your code to Android anyway. It works fine, even though it runs slowly on the device.
    Now I want to save the trained weights. Is that the same as version 1, using saveWeight() in this code?
    And how do I load the trained data and classify test images using it?

    • Eric

      Hey Teddy,

      You already converted it to an Android version? Well done!
      Yes, you can use "saveWeight()" to save a cv::Mat matrix into a .txt file; however, I'm not sure whether you should do this on Android.
      Use the right Android-style saving method: save all the trained matrices (including the convolution layers and fully connected layers) and all the parameters, such as lambda, the number of units in each hidden layer, and so on. When testing, just load these things back in and do what the "resultProdict()" function does.
      Cheers!

      • teddy

        Thanks for the reply!!
        I'll try what you said!! Good luck!

  • teddy

    I have a question about your code.
    In resultProdict() there are for loops like this:

    for(int i = 0; i < 10; i++){
        double maxele = tmp.ATD(0, i);
        int which = 0;
        for(int j = 1; j < tmp.rows; j++){
            if(tmp.ATD(j, i) > maxele){
                maxele = tmp.ATD(j, i);
                which = j;
            }
        }
        result.ATD(0, i) = which;
    }

    Why does j start from 1, not 0?
    tmp is the result of the 10000 test images, right?
    So I think it needs to check the first element too; right now your code compares only 9 elements in the mat for each column.
    Please confirm this issue…

    Thanks.

    • Eric

      Hey Teddy,
      Here's what I'm doing: for every ith COLUMN, I want to find the ROW with the max value. So in each ith column (the outer loop), I set the initial max value to the 0th row's value (maxele) and the position of the initial max value to 0 (which); then I just need to check the rest of the rows of the ith column. If nothing is bigger than the initial max value, then the result is the initial position.
      In brief, since I set the 0th value as the initial value, it is unnecessary to compare the initial value with itself.
      Hope the above comment helps.

  • teddy

    Hello.
    Thanks for your reply.
    Well, I checked resultProdict() working on Android. It works well.
    Now I'd like to use multi-channel (3, RGB) input images, but your code is based on 1 channel.
    How can I change that based on your code? Any ideas?

    • Eric

      You can just combine the intensities from all the color channels for the pixels into one long vector, as if you were working with a grayscale image with 3x the number of pixels as the original image.

      • teddy

        Oh, I see. For example, if I use 32×32 RGB color images, I need to make a vector, so the mat size is 32×(32+32+32) for each single image, right? Currently, MNIST images are 28×28, so the mat size is 28×28 when loading images in read_Mnist().

        • Eric

          Yes, exactly. However, I don't think this is the best way, although it's the only way I know, because there are connections among the three channels; they are closely related. So we likely lose some information if we simply do it like that. Maybe a better way is to use 32 × 32 × 3 and give the whole network one more dimension? But you know, it's hard to figure out and hard to implement, especially using OpenCV. 🙂

  • teddy

    Well, I'm trying to change the dataset, so I load data like below:

    Read trainX successfully, including 784 features and 39426 samples.
    Read trainY successfully, including 39426 samples

    Then, when training, I got this message:

    Network Learning…………….
    *** glibc detected *** ./cnn: corrupted double-linked list: 0x00000000024c8240 ***
    Segmentation fault (core dumped)

    When I track the point of the error, it occurs randomly, sometimes in pooling() or convandpooling()…
    cnn in the log is the binary name. My training image width/height is 28×28, the same as MNIST; the difference is the number of training images. The category number is the same as MNIST too.
    Can you give me a tip on where to look in the code?

    • Eric

      Is your trainX binary, I mean in the same format as MNIST?

      • teddy

        Yes, it's my own training data… it's not the same as MNIST…

      • teddy

        I think it's not a problem with the number of training images. I made trainX and trainY with 60K images/labels, but got the same error. I'll check the training data once again.

      • teddy

        Sorry, I forgot this reply. I missed some operation before training; after adding it, it works fine.
        Thanks!

  • teddy

    When I want 3 convolution layers and 3 pooling layers, what code should be added?
    I push_back KernelSize, KernelAmount, and PoolingDim 3 times:

    KernelSize.push_back(3);
    KernelSize.push_back(5);
    KernelSize.push_back(7);
    KernelAmount.push_back(2);
    KernelAmount.push_back(4);
    KernelAmount.push_back(8);
    PoolingDim.push_back(2);
    PoolingDim.push_back(2);
    PoolingDim.push_back(2);

    and I changed the global variables like this:

    int NumHiddenLayers = 3;
    int NumConvLayers = 3;

    However, I got an error in conv2()…
    Do I have to add some code to use 3 convolution and 3 pooling layers?

    • teddy

      I found the reason for the error: it's the input dimension size of each layer.
      Sorry to bother you 🙂

  • teddy

    In this code, how many trainable parameters are there?
    Can I count the number of parameters like this?

    C1 = 5x5x4+4+4 = 108
    P1 = 0
    C2 = 7x7x8+8+8 = 408
    P2 = 0
    FC1 = 200
    FC2 = 200
    SOFTMAX = 10
    ——————————– total : 926

    Is it right?

    • Eric

      Hi Teddy,

      For the two fully connected layers, the trainable parameters are actually the weight matrices, so for each layer the number should be (last_layer_outputs × this_layer_neuron_amount).

      • teddy

        Oh, I see. You mean that FC1's parameters are 8×200 and FC2's are 200×200, right?

  • teddy

    Oh, that's wrong. I think this is right: p2-fc1 is 1×200, fc1-fc2 is 200×200, fc2-softmax is 10. Correct?

  • DengYu

    Hi, Eric
    I want to use my own data to train and test the code. Are there any requirements on the size of the images or the number of training samples? Thank you very much.

    • Eric

      Hey,
      There's no constraint on size or length, but there are some things you need to pay attention to:
      1. This code only supports single-channel images so far.
      2. You need to set the parameters of the Conv layers and pooling layers properly, such as kernel size and poolingDim.
      3. Inside readData(), I divide the whole dataset by 255.0 because I want the data range to be (0, 1), so if your training data is already in the range (0, 1), disable that part of my code.

      Have fun.

  • albert.mg

    I hope to use your code to train on CIFAR-10, but I've learned that this code only supports single-channel images. If I convert CIFAR-10 into single-channel images for training, I think the precision would be very low. Is there any way to train on color images?

    • Eric

      You can just combine the intensities from all the color channels for the pixels into one long vector, as if you were working with a grayscale image with 3x the number of pixels as the original image. Another idea: use a 3D matrix instead of a 2D one.

  • Arghavan

    Hi Eric,
    I really appreciate your work on CNNs. I used your code in Qt for the MNIST dataset; it works well, but all of a sudden it gives me an error: Out of Memory. I don't know how to fix it. Is there any solution for that?
    Thanks a lot.

    • Eric

      Hi Arghavan,

      How much memory does your computer have? If you use large networks, it does need a lot of memory; moreover, I used a hash table in this code, so it needs even more.

      • Arghavan

        I have 4 GB of RAM and use the default CNN you built. May I ask about the computer you run the code on?

  • Yu Deng

    Hi, Eric
    Can I build the code with Visual Studio 2010? Thank you very much for your help.

    • Eric

      Hi Yu,
      Yes I think so, as long as you have correctly installed OpenCV on it.

  • zhenghx

    I have read your code for several days and have now finished. I found this code is different from your first version "http://eric-yuan.me/cnn/", for example in the softmax regression and the learning rate. Your first version's regression follows UFLDL's "http://ufldl.stanford.edu/wiki/index.php/Softmax_Regression", but this one does not.
    This code works wonderfully because the convergence is very fast, so I want to read the papers or material which you referred to. Can you give me a list?

    • zhenghx

      v_hl_W[i] = v_hl_W[i] * Momentum + lrate * HiddenLayers[i].Wgrad;
      v_hl_b[i] = v_hl_b[i] * Momentum + lrate * HiddenLayers[i].bgrad;
      HiddenLayers[i].W -= v_hl_W[i];
      HiddenLayers[i].b -= v_hl_b[i];
      According to UFLDL, I changed the code to
      HiddenLayers[i].W -= lrate * HiddenLayers[i].Wgrad;
      HiddenLayers[i].b -= lrate * HiddenLayers[i].bgrad;
      and then the convergence speed became very slow.

  • Ashok

    Linux Make error: ConvNet.cpp:141:40: error: no matching function for call to ‘std::basic_ifstream::basic_ifstream(std::string&, const openmode&)’ ifstream file(filename, ios::binary);
    ….

    Just replaced
    ifstream file(filename, ios::binary);
    with
    ifstream file(filename.c_str(), ios::binary);

    and works fine!!!!

    • Eric

      Thanks Ashok 🙂

  • Lancelod Liu

    Thanks a lot; I was looking for the C++ source code converted from the Python version. There's a problem: when I run your code in VS 2012 (OpenCV 2.4.9), it gives me an assertion whose expression was "vector subscript out of range". And there's no output in the command window, which means that cout<<"Read trainX successfully, including "<<trainX[0].cols * trainX[0].rows<<" features and "<<trainX.size()<<" samples."<<endl; didn't run.
    I tried to use the full path to the MNIST files, but it failed the same way.
    It would be a great help if you could provide any hint.

    • Lancelod Liu

      I tried to add cout<<"open successfully\n" in the function readMnist() and it didn't show up. I think maybe there's something wrong with the read function.

      • Lancelod Liu

        I fixed this problem by using
        readData(trainX, trainY, "mnist\\train-images-idx3-ubyte.gz", "mnist\\train-labels-idx1-ubyte.gz", 60000);
        readData(testX, testY, "mnist\\t10k-images-idx3-ubyte.gz", "mnist\\t10k-labels-idx1-ubyte.gz", 10000);

        But another problem showed up as below.

        OpenCV Error: One of arguments' values is out of range (The total matrix size does not fit to "size_t" type) in cv::setSize, file C:\buildslave64\win64_amd..._4_PackSlave-win32-vc11-shared\opencv\modules\core\src\matrix.cpp, line 126...

        • Lancelod Liu

          I fixed it… Apparently I should unzip the .gz files before reading them… How could I simply read the '*.gz' files and expect the right answer…

  • Angus

    Hello, I benefited a great deal from reading this article!

    • Eric

      Thanks for the support.

  • 赵元兴

    Hello, may I ask whether you have tried local contrast normalization? For the forward pass, if this layer is applied on a single feature map, won't the map get smaller again? Does this need special handling? The parameters of this layer all seem to be manually specified; how should backpropagation be done for it? Are there any related articles you would recommend? Thanks.

    • Eric

      Hi! I did try it at the time, but it seemed to have problems and I didn't get around to investigating, so I just disabled it. I remember Hinton's ImageNet paper discusses it, and there were also some notes in the cuda-convnet tutorial.

      • 赵元兴

        Thanks. I saw that in the gradient checking part of your code you take the ratio of the two derivatives (tp / grad.ATD(i, j)), but at what threshold do you consider it to pass?

        • Eric

          As long as it's very close to 1.0, it's fine.

          • 赵元兴

            Thanks, I'll give it a try~

  • Muralidhar

    Hi,
    I am trying to run the code, but I am getting an error saying segmentation fault (core dumped) after training the dataset. Please let me know what I need to change in the code, or if I am doing anything wrong.

    • Muralidhar

      Hi,
      The problem is that it is not able to create the kernel folder. What could be the reason? Because of that it shows a segmentation fault. Please let me know how to debug this.

      • Muralidhar

        The problem was that I needed to create the kernel folder manually.
        Thank you so much for providing the code.

        • Eric

          Hi Muralidhar,

          Maybe because I'm currently using a Mac and most of the methods in my code are Mac-supported, especially the file/folder-related parts.

  • Wei Chee

    Hi, I tried to build your code in Microsoft Visual Studio 2013 but I keep getting these errors. Hope you can help me.
    Error 1 error C2065: ‘S_IRWXU’ : undeclared identifier
    Error 2 error C2065: ‘S_IRWXG’ : undeclared identifier
    Error 3 error C2065: ‘S_IROTH’ : undeclared identifier
    Error 4 error C2065: ‘S_IXOTH’ : undeclared identifier

    All these errors occurred at line 64 in save_weights.cc.

    • Olive Lee

      Hi, I'm running into the same problem as you.
      Have you fixed it yet?

  • Kushal

    Sir, I am doing image processing, face recognition!
    I have tried using the Eigenfaces/LBPH face recognition algorithms in OpenCV but got no results. I want to use a CNN for the image recognition part. Your code seems pretty large and vast. Can you please guide me on how to input my image data, how to get the feature vector and train the CNN, and how to use it for prediction?
    Sir, thanks in advance. I am using C++ with OpenCV 3.1.0.
    Hoping to hear from you. 🙂

  • Justin

    Hello Eric,

    Thanks for sharing this code. I'm having trouble running it somehow. I'm using OpenCV 2.4 in Visual Studio. When I built this code, it said that POOL_STOCHASTIC is undefined, and so was CLOCK. I include , so the error regarding clock is gone, but I still don't know how to make POOL_STOCHASTIC available. Please advise me.
    Thanks!

    • yinsua

      There are two POOL_MAX definitions; I think one of them should be replaced with POOL_STOCHASTIC.

  • Gözde

    Hello Eric,
    Thank you so much for this wonderful work.
    I want to compile this code but I have a problem: "vector subscript out of range" (trainX and trainY are always zero, so I get this error). How can I resolve this problem?
    Thank you so much.