Over the past few weeks I’ve been spending some time looking at LSTM networks using CNTK. LSTM (long short-term memory) networks are useful when predicting sequences, such as the next word in a sentence when you know the first few words. Regular neural networks can’t easily deal with sequences because they have no memory: each input is processed independently. LSTMs have a form of memory so they can deal with sequences.
The CNTK v2 library provides sophisticated deep neural network building blocks, including LSTMs. Rather than try to code a CNTK LSTM demo for a word-sequence problem, I figured it’d be easier to work with plain numeric data.
My target problem was to create a predictive model of the trigonometric sine function. Obviously this isn’t useful, but it’s a good, simple problem — I wanted to focus on understanding LSTMs without getting distracted by details of a realistic problem.
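The core of the problem setup is turning the sine curve into (input sequence, next value) training pairs. Here’s a minimal sketch of that idea, assuming a rolling-window approach; the `make_sequences` helper, the window length of 4, and the 100-point sampling are all my own illustrative choices, not taken from the CNTK documentation example.

```python
import numpy as np

def make_sequences(series, window=4):
    """Split a 1-D series into overlapping input windows and their
    next-value targets -- the usual setup for sequence prediction."""
    x, y = [], []
    for i in range(len(series) - window):
        x.append(series[i : i + window])  # e.g. [s0, s1, s2, s3]
        y.append(series[i + window])      # the value to predict: s4
    return np.array(x), np.array(y)

t = np.linspace(0.0, 2.0 * np.pi, 100)  # 100 points over one period
series = np.sin(t)
x, y = make_sequences(series, window=4)
print(x.shape, y.shape)  # (96, 4) (96,)
```

An LSTM trained on pairs like these learns to map each short window of recent sine values to the value that follows it.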
Somewhat to my surprise, I discovered that the CNTK documentation had an example of predicting the sine function using an LSTM. Easy!
Well, not so easy. The documentation example was quite difficult to understand. So I set out to deconstruct the documentation example one chunk of code at a time, figure out what each chunk did, and then reconstruct the example from scratch, removing all the peripheral code that dealt with data generation, plotting, and so on.
It took a bit of time, but I eventually got a model up and running. In the image, you can see the LSTM model predicts the sine function fairly well, as you’d expect for an easy problem.
My knowledge of the CNTK library is slowly but surely building up. At some point I should probably put together a long document, or a short e-book, that walks through CNTK installation, logistic regression, neural networks, LSTM networks, and convolutional neural networks.