Recurrent neural networks (RNNs) have become very popular recently, and they now play roles as important as those of convolutional neural networks. Because RNNs can use their internal memory to process arbitrary sequences of inputs, they work well not only on images but also on speech recognition and natural language processing tasks.
There are several types of RNNs. As a starting point, this post focuses only on Elman-type RNNs (similar to Jordan-type networks; both are the simplest kinds of RNNs), and I'll introduce and implement more advanced types in future parts of this RNN series.
By using the BPTT (Back-Propagation Through Time) method, we can unfold the network into a structure quite similar to a regular feed-forward network. Say we have a recurrent neural network whose time delay is 3 and which has n hidden layers; after applying BPTT, we get the following structure.
Each hidden layer shares a single weight matrix W (the weights between two adjacent time slots) and a single weight matrix U (the weights between consecutive hidden layers). With this unfolded version of the network, it's easy to train all of these parameters using the ordinary back-propagation method.
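To make the weight sharing concrete, consider a single hidden layer with input weights U and recurrent weights W (biases omitted in this sketch); unrolling three time slots just composes the same f with the same weights over and over:

$$s_3 = f\bigl(U x_3 + W\, f\bigl(U x_2 + W\, f(U x_1 + W s_0)\bigr)\bigr)$$

which is exactly a feed-forward computation in which every time slot reuses the same U and W.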
2. Forward pass
For the first hidden layer:
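Writing $x_t$ for the input at time slot $t$, $a_t^{(1)}$ for the pre-activation, and $s_t^{(1)}$ for the hidden state (notation assumed for this sketch), the standard Elman update is:

$$a_t^{(1)} = U^{(1)} x_t + W^{(1)} s_{t-1}^{(1)}, \qquad s_t^{(1)} = f\!\left(a_t^{(1)}\right)$$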
in which f represents a non-linear activation function, such as the sigmoid, tanh, or ReLU function. And for the other hidden layers:
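$$a_t^{(l)} = U^{(l)} s_t^{(l-1)} + W^{(l)} s_{t-1}^{(l)}, \qquad s_t^{(l)} = f\!\left(a_t^{(l)}\right), \quad l = 2, \dots, n$$

so each layer combines the state of the layer below it at the current time slot with its own state from the previous time slot.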
For the output layer:
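Denoting the hidden-to-output weights by $V$ (a name assumed here for this sketch):

$$o_t = g\!\left(V s_t^{(n)}\right)$$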
in which g represents the output non-linearity, such as softmax.
For the first time slot, we can assume the previous S values are all zeros.
3. Backward pass
For the backward pass:
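In a standard BPTT derivation (assuming here a softmax output trained with the cross-entropy loss), the error term $\delta$ for each pre-activation propagates both down through the layers and back through time:

$$\delta_t^{\mathrm{out}} = o_t - y_t$$

$$\delta_t^{(n)} = f'\!\left(a_t^{(n)}\right) \circ \left( V^\top \delta_t^{\mathrm{out}} + W^{(n)\top} \delta_{t+1}^{(n)} \right)$$

$$\delta_t^{(l)} = f'\!\left(a_t^{(l)}\right) \circ \left( U^{(l+1)\top} \delta_t^{(l+1)} + W^{(l)\top} \delta_{t+1}^{(l)} \right), \quad l = 1, \dots, n-1$$

with $\delta_{T+1}^{(l)} = 0$ at the last time slot. The weight gradients then sum over all time slots:

$$\frac{\partial E}{\partial V} = \sum_t \delta_t^{\mathrm{out}} \left(s_t^{(n)}\right)^\top, \qquad \frac{\partial E}{\partial U^{(l)}} = \sum_t \delta_t^{(l)} \left(s_t^{(l-1)}\right)^\top, \qquad \frac{\partial E}{\partial W^{(l)}} = \sum_t \delta_t^{(l)} \left(s_{t-1}^{(l)}\right)^\top$$

where $s_t^{(0)} = x_t$, $y_t$ is the target at time slot $t$, and $\circ$ denotes the element-wise product.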
in which a represents the output of each hidden layer before the non-linear function is applied in the forward pass (i.e., the pre-activation).
Code and Test
I implemented it using C++ and OpenCV.
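As a minimal sketch of what one forward step looks like with OpenCV's cv::Mat (an illustration under assumed names U, W, V, tanhMat, softmaxMat, not the actual implementation):

```cpp
// Minimal sketch: one forward pass through a single-hidden-layer Elman RNN.
#include <opencv2/core.hpp>
#include <iostream>

// f: element-wise tanh non-linearity.
static cv::Mat tanhMat(const cv::Mat &x) {
    cv::Mat e2x;
    cv::exp(2.0 * x, e2x);             // e^{2x}
    return (e2x - 1.0) / (e2x + 1.0);  // tanh(x) = (e^{2x} - 1) / (e^{2x} + 1)
}

// g: softmax output non-linearity.
static cv::Mat softmaxMat(const cv::Mat &x) {
    double maxVal;
    cv::minMaxLoc(x, nullptr, &maxVal);  // shift by max for numerical stability
    cv::Mat ex;
    cv::exp(x - maxVal, ex);
    return ex / cv::sum(ex)[0];
}

int main() {
    const int inDim = 4, hidDim = 3, outDim = 2, T = 5;
    cv::RNG rng(42);

    // U: input -> hidden, W: hidden -> hidden across time slots, V: hidden -> output.
    cv::Mat U(hidDim, inDim, CV_64F), W(hidDim, hidDim, CV_64F), V(outDim, hidDim, CV_64F);
    rng.fill(U, cv::RNG::UNIFORM, -0.1, 0.1);
    rng.fill(W, cv::RNG::UNIFORM, -0.1, 0.1);
    rng.fill(V, cv::RNG::UNIFORM, -0.1, 0.1);

    // Previous hidden state starts at zero, as described for the first time slot.
    cv::Mat s = cv::Mat::zeros(hidDim, 1, CV_64F);

    for (int t = 0; t < T; ++t) {
        cv::Mat x(inDim, 1, CV_64F);
        rng.fill(x, cv::RNG::UNIFORM, 0.0, 1.0);  // dummy input for time slot t

        cv::Mat a = U * x + W * s;      // a_t = U x_t + W s_{t-1}
        s = tanhMat(a);                 // s_t = f(a_t)
        cv::Mat o = softmaxMat(V * s);  // o_t = g(V s_t)

        cv::Mat oT = o.t();
        std::cout << "t=" << t << " output: " << oT << std::endl;
    }
    return 0;
}
```

The training code additionally stores each $a_t$ and $s_t$ during the forward pass so the backward pass above can reuse them.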
I used the same toy dataset to test my RNN as the one I used HERE.
Before feeding data into the network, I used the simplest encoding method: I just changed each word into a 1-of-N code (which has 1237 dimensions for the given toy dataset). Using the config in the GitHub repo, it reached a training accuracy of 0.986743 and a test accuracy of 0.951342 within 43 minutes on my 2015 MacBook Air. I'm sure it would work better with more advanced encoding methods.
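For illustration, the 1-of-N encoding step could look like the sketch below (assuming a word-to-index vocabulary built beforehand; oneHot is a hypothetical helper, not taken from the actual code):

```cpp
// Sketch: encode a word as a 1-of-N column vector.
#include <opencv2/core.hpp>
#include <map>
#include <string>

cv::Mat oneHot(const std::string &word, const std::map<std::string, int> &vocab) {
    // One column vector with vocab.size() rows (1237 for the toy dataset),
    // all zeros except a single 1 at the word's index.
    cv::Mat code = cv::Mat::zeros((int)vocab.size(), 1, CV_64F);
    auto it = vocab.find(word);
    if (it != vocab.end())
        code.at<double>(it->second, 0) = 1.0;
    return code;
}
```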
This will be a series of posts about RNNs; I'll try to introduce Bi-Directionality and Long Short-Term Memory in it 🙂