Author Archives: jamesdmccaffrey

Principal Component Analysis (PCA) From Scratch Using Python

If you have some data with many features, principal component analysis (PCA) is a classical statistics technique that can be used to transform your data to a set with fewer features. This is called dimensionality reduction. For example, suppose you …

Posted in Machine Learning

Logistic Regression Using PyTorch with L-BFGS in Visual Studio Magazine

I wrote an article titled “Logistic Regression Using PyTorch with L-BFGS” in the June 2021 edition of the online Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2021/06/23/logistic-regression-pytorch.aspx. Logistic regression is one of many machine learning techniques for binary classification, predicting one …

Posted in PyTorch

Using Built-In Library Functions vs. Using A Custom Function

I ran into some code that made me think about the nature of coding and computer science. I was reviewing some PyTorch documentation example code and saw a statement that looked something like: foo = torch.tensor(foo[:-1]).cumsum(dim=0).to(device) From the context of …

Posted in Miscellaneous
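
As a rough illustration of the trade-off named in the post title above, the sketch below computes a running sum over all but the last item of a list in two ways: once with a library function and once with a hand-written loop. It uses itertools.accumulate from the Python standard library as a stand-in for the PyTorch cumsum() call quoted in the excerpt. The sample list and both function names are hypothetical, because the excerpt does not say what foo holds.

```python
from itertools import accumulate

def offsets_builtin(values):
    # Library version: running sum of every item except the last.
    return list(accumulate(values[:-1]))

def offsets_custom(values):
    # Hand-written version of the same computation.
    result = []
    total = 0
    for v in values[:-1]:
        total += v
        result.append(total)
    return result

foo = [0, 3, 2, 4]  # hypothetical data, not from the post
print(offsets_builtin(foo))
print(offsets_custom(foo))
```

Both functions return the same list. The library version is shorter, while the custom version makes each step explicit.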

Tokenizing Text Using the Basic English Algorithm

In natural language processing (NLP) problems you must tokenize the source text. This means you must split each word/token, usually convert to lower case, replace some punctuation, and so on. In Python, the spaCy library and the NLTK (natural language …

Posted in Machine Learning
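
The tokenizing steps named in the excerpt above (split into tokens, convert to lower case, handle punctuation) can be sketched with the Python standard library alone. This is a minimal illustration and not the Basic English algorithm itself, whose exact rules the excerpt does not give. The function name simple_tokenize and the choice of punctuation marks are assumptions.

```python
import re

def simple_tokenize(text):
    # Convert to lower case.
    text = text.lower()
    # Put spaces around a few punctuation marks (an assumed set)
    # so that each mark becomes its own token.
    text = re.sub(r"([.,!?()])", r" \1 ", text)
    # Split on whitespace; runs of spaces collapse automatically.
    return text.split()

print(simple_tokenize("Hello, World! It works."))
# prints ['hello', ',', 'world', '!', 'it', 'works', '.']
```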

Sentiment Analysis using a PyTorch Neural Network with an EmbeddingBag Layer

In computer science, and life, it helps to be smart but it’s also important to have determination. I’m not the smartest guy in the Universe, but once a problem gets stuck in my head it will stay there until it …

Posted in PyTorch

Researchers Explore Techniques to Reduce the Size of Deep Neural Networks on Pure AI

I contributed to an article titled “Researchers Explore Techniques to Reduce the Size of Deep Neural Networks” in the June 2021 edition of the online Pure AI Web site. See https://pureai.com/articles/2021/06/02/reduce-networks.aspx. The motivation for reducing the size of deep neural …

Posted in Machine Learning

Top Ten Science Fiction Movies with Pterodactyls

There are many science fiction movies that feature Pteranodons, pterosaurs, and pterodactyls. Scientifically they’re different but basically they’re all flying reptiles. Here are ten of my favorites, listed in no particular order. 1. Rodan (1956) – Japanese miners dig a …

Posted in Top Ten

Serving Up PyTorch Training Data Using The DataLoader collate_fn Parameter

When creating a deep neural network, writing code to prepare the training data and serve it up in batches to the network is almost always difficult and time-consuming. A regular PyTorch DataLoader works great for tabular-style data where …

Posted in PyTorch

Integrated Gradient Interpretability Explained

One of many techniques to explain why a neural network produced a particular prediction is called integrated gradient. The idea is difficult to understand if you’re not familiar with it. So I’ll try to give an informal (as possible) explanation. …

Posted in Machine Learning

An Example of the Python SciPy line_search() Function

The term “line search” refers to a classical statistics technique to minimize a function. The Python SciPy code library has a line_search() function that is a helper function for a line search, but line_search() doesn’t actually do a line search. …

Posted in Machine Learning