I never completely understand some technology concept unless I can code it up from scratch. I’ve been exploring LSTM (long short-term memory) cells for several weeks. LSTM cells are complex software components that can be used to create a recurrent neural network — a prediction system that has state (a memory) so you can predict the next item in a sequence.
I finally felt I knew enough to code a demo up from scratch — and solidify my confidence that I fully understand LSTM cells. There are about half a dozen well-known Web articles on LSTMs. They all explain LSTMs from a different perspective. Somewhat surprisingly to me, the Wikipedia article on LSTMs is very good (many Wikipedia articles on machine learning topics are not good at all) so I used Wikipedia as my primary reference.
Anyway, I used Python and coded an LSTM from scratch, following the Wikipedia article as closely as possible — same variable names, etc. The process was quick and easy, although I’m not sure how easy things would have been without the hours of background time I spent reading about LSTMs.
My demo coded the LSTM input-output process. I set fixed weights and biases to arbitrary constants. A full explanation of the IO process would take pages, so, I’ll write the full process up someday when I have a free full day.
Every software developer and researcher I know has a driving curiosity to understand things, and then use that knowledge to create things. Every company I’ve ever worked for has, quite correctly, told employees how important it is to “be continuous learners” or “have a growth mindset” or similar. My colleagues have always kind of smiled internally at these deep words of wisdom — continuous learning is literally wired into our DNA and we find it difficult to understand how anyone can’t be driven by the search for new knowledge and skills.