In the last post I described how deep neural networks, particularly Recurrent Neural Networks (RNNs), can be used on inputs of varying sizes, such as sequences. In this post I’m going to describe an example of an RNN and show the results it produces. Before I do that, I’m going to explain what is inside the RNN block, which I called a ‘neural network’ in the previous post.
Fig 1.0 RNN: A Recurrent Neural Network with an input sequence of four letters drawn on a 7×7 grid. The neural network starts with the first input on the left, produces two results (an output, #, and a clue, s), then moves on to the next input. The neural network uses the clues to guess the word in the sequence, and it reports its guesses in the output at the top.
The neural network block in an RNN is called an RNN unit. It is similar to the neural network from the first post, with some minor differences. Instead of taking one set of input values and producing one set of results, an RNN unit takes in two values (the input and the clue) and produces two results (an output and a new clue). Here is an illustrated example of the steps that the RNN in Figure 1.0 would take to produce an output value and a clue.
To calculate the new clue, the neurons in layer 1 multiply each value from the sequence input, as well as each value of the old clue, by a parameter and add the results together. Each neuron has a parameter for each of the 49 input values and also a parameter for each of the clue values.
Sequence input:   0 x parameter1
                + 0 x parameter2
                + … (one term for each of the 49 input values)
                + 0 x parameter49
                = input subtotal

Clue:   (-0.246) x parameter50
      + 0.342 x parameter51
      + (-1.010) x parameter52
      = clue subtotal

Total = input subtotal + clue subtotal + bias
Just like the neurons we have seen before, the RNN neurons apply an activation to the total. The activation commonly used in RNNs is called tanh, and it gives results between -1 and 1.
New clue = tanh ( Total )
Since each neuron in layer 1 produces one value for the new clue, there are as many values in the clue as there are neurons in layer 1.
Finally, the neurons in layer 2 produce an output, but unlike the previous layer, no activation is used.
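To make these three steps concrete, here is a minimal sketch of one RNN-unit step in Python with NumPy. The names (rnn_unit, W_input, W_clue and so on) are my own, not from any particular implementation, and the sizes follow the example above: 49 input values, 3 clue values (so 3 neurons in layer 1), and random starting parameters standing in for trained ones.

import numpy as np

n_inputs = 49   # one value per cell of the 7x7 grid
n_neurons = 3   # layer 1 neurons, so the clue has 3 values
n_outputs = 49  # layer 2 produces one score per possible character

# Hypothetical parameters; in a real RNN these are learned during training.
W_input = np.random.randn(n_neurons, n_inputs) * 0.01   # parameters 1-49 of each neuron
W_clue = np.random.randn(n_neurons, n_neurons) * 0.01   # parameters 50-52 of each neuron
b1 = np.zeros(n_neurons)                                # layer 1 bias
W_out = np.random.randn(n_outputs, n_neurons) * 0.01    # layer 2 parameters
b2 = np.zeros(n_outputs)                                # layer 2 bias

def rnn_unit(x, old_clue):
    # Step 1: each layer-1 neuron multiplies every input value and every
    # old-clue value by one of its parameters, sums them, and adds its bias.
    total = W_input @ x + W_clue @ old_clue + b1
    # Step 2: tanh squashes each total to a value between -1 and 1.
    new_clue = np.tanh(total)
    # Step 3: layer 2 produces the output; no activation is applied.
    output = W_out @ new_clue + b2
    return output, new_clue

# The worked example above: an all-zero input and the clue (-0.246, 0.342, -1.010).
x = np.zeros(n_inputs)
old_clue = np.array([-0.246, 0.342, -1.010])
output, new_clue = rnn_unit(x, old_clue)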
Using the three steps above, I trained an RNN with Andrej Karpathy’s code to guess the next character of a sentence given the first letter. I trained it on works written by Shakespeare, so the RNN should hopefully produce sentences that look like Shakespeare’s writing. The first letter of the sentence is fed into the RNN unit to get it to guess the next letter. Whatever the RNN unit guesses is then fed back into it, along with the clue from the previous step. This process can be repeated as often as desired, as sketched below.
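Here is a rough sketch of that feedback loop, reusing the hypothetical rnn_unit above. I encode each character as a one-hot vector (my own simplification; the post draws characters on a 7×7 grid instead), and I simply take the highest-scoring character as the guess at each step, whereas Karpathy’s code turns the scores into probabilities and samples from them.

def guess_text(first_char_index, length, vocab_size=49):
    # Start with an all-zero clue, since nothing has been seen yet.
    clue = np.zeros(n_neurons)
    guesses = [first_char_index]
    char_index = first_char_index
    for _ in range(length):
        # Encode the current character as a one-hot vector.
        x = np.zeros(vocab_size)
        x[char_index] = 1.0
        output, clue = rnn_unit(x, clue)
        # Take the character with the highest output score as the guess,
        # then feed it back in as the next input.
        char_index = int(np.argmax(output))
        guesses.append(char_index)
    return guesses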
Before I show the results, here is a snippet of the text used to train the RNN:
Enter ANTONY and EROS
ANTONY. Eros, thou yet behold’st me?
EROS. Ay, noble lord.
ANTONY. Sometime we see a cloud that’s dragonish;
A vapour sometime like a bear or lion,
A tower’d citadel, a pendent rock,
A forked mountain, or blue promontory
With trees upon’t that nod unto the world
And mock our eyes with air. Thou hast seen these signs;
They are black vesper’s pageants.
EROS. Ay, my lord.
ANTONY. That which is now a horse, even with a thought
The rack dislimns, and makes it indistinct,
As water is in water.
EROS. It does, my lord.
After the first few rounds of training, the RNN produced gibberish:
}>T1iOo yF}gf1;Fl A1sF8F”1I1?a&FU1>1. LFjF61) &Fd d1>]<F0Fpn}1C;tvT1DGXFyFbUq1″FP1HF)Fx1?KL1wzaFSFjF.1u1E,nF5n.1WF) J1|Fxkv19Fx7C7& SFE1n:wF” S )F(SD1|1AFWF>Fe1GFp T}m 71:F61HFl1K1lFf h1″FQZ`pGrv1! 6R
After 1,100 cycles of training, the gibberish slowly started to change into nonsense words. Paragraphs started to appear, and the length of the lines started to resemble the source text:
k ouw nh dro la a ios,
hy, h i e nqG r eeeu-omo gt
og yus o gys isJo e s oked
s e sa Fnaheur esrb nLatea te t b h s,a e l nlea nuer Wtee
enng o g ls<s a d ono tf r vh
After a couple of hours, some real words appear, and the formatting of sentences and paragraphs starts to resemble the original text:
Cleret will difdens you letteadle th’ll that splech candond am the bead. Iy brides forcengary meards!
by grebhounch: genition amendal this the, are so suck.
And evers live her wands your of that thai’d wisburicuspsting kny,
Whit thou printious to whille is a fanod
Weel rithe fitil’d ‘As yet Hat?
KING furrsiav to of to tro.
Sorstemstyen And will
As des his truplly bacicholes this gent your of trown-et? Havild?
It is striking how much the RNN learns simply by picking up patterns in the order of characters. I will share the RNN so you can try it for yourself. In the next post, I will use a more powerful RNN to get better results faster.