Category Archives: Machine Learning

Principal Component Analysis (PCA) From Scratch Using Python

If you have some data with many features, principal component analysis (PCA) is a classical statistics technique that can be used to transform your data to a set with fewer features. This is called dimensionality reduction. For example, suppose you … Continue reading
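As a rough illustration of the dimensionality reduction described above, here is a minimal sketch of PCA using NumPy eigendecomposition. The function name `pca`, the demo matrix `X`, and the choice of two components are assumptions made for illustration; they are not taken from the post.

```python
import numpy as np

def pca(X, n_components):
    """Project X (rows = items, columns = features) onto its top principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each feature at zero
    cov = np.cov(Xc, rowvar=False)           # feature-by-feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort from largest to smallest variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]   # columns are the principal directions
    transformed = Xc @ components            # data expressed in the reduced space
    explained = eigvals[:n_components] / eigvals.sum()
    return transformed, components, explained

# Made-up demo data: 6 items with 3 features each (illustrative values only).
X = np.array([
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.4],
    [6.2, 2.9, 4.3],
    [5.9, 3.0, 4.2],
    [6.5, 3.0, 5.8],
    [7.1, 3.0, 5.9],
])

T, C, ev = pca(X, n_components=2)
print(T.shape)   # 6 items, now with 2 features each
print(ev)        # fraction of total variance captured by each component
```

The sketch uses `np.linalg.eigh` because a covariance matrix is symmetric; a from-scratch version could compute the eigenvectors in other ways.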

Tokenizing Text Using the Basic English Algorithm

In natural language processing (NLP) problems you must tokenize the source text. This means you must split the text into individual words/tokens, usually convert them to lower case, replace some punctuation, and so on. In Python, the spaCy library and the NLTK (natural language … Continue reading
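The steps named above (lower-casing, replacing punctuation, splitting) can be sketched with the standard `re` module. The replacement rules below are assumptions chosen for illustration and are only loosely in the spirit of a "basic English" tokenizer; they are not claimed to match any library's exact rule set.

```python
import re

# Hypothetical normalization rules, applied in order.
_RULES = [
    (re.compile(r"'"), " ' "),            # isolate apostrophes
    (re.compile(r'"'), ""),               # drop double quotes
    (re.compile(r"([.,!?()])"), r" \1 "), # pad common punctuation with spaces
    (re.compile(r"[;:]"), " "),           # treat ; and : as separators
    (re.compile(r"\s+"), " "),            # collapse runs of whitespace
]

def basic_tokenize(text):
    """Lower-case the text, apply the rules above, and split on whitespace."""
    text = text.lower()
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text.split()

print(basic_tokenize('Hello, World! It\'s "fine".'))
# ['hello', ',', 'world', '!', 'it', "'", 's', 'fine', '.']
```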

Researchers Explore Techniques to Reduce the Size of Deep Neural Networks on Pure AI

I contributed to an article titled “Researchers Explore Techniques to Reduce the Size of Deep Neural Networks” in the June 2021 edition of the online Pure AI Web site. See https://pureai.com/articles/2021/06/02/reduce-networks.aspx. The motivation for reducing the size of deep neural … Continue reading

Integrated Gradient Interpretability Explained

One of many techniques to explain why a neural network produced a particular prediction is called integrated gradient. The idea is difficult to understand if you’re not familiar with it, so I’ll try to give as informal an explanation as possible. … Continue reading

An Example of the Python SciPy line_search() Function

The term “line search” refers to a classical numerical optimization technique to minimize a function. The Python SciPy code library has a line_search() function that is a helper function for a line search, but line_search() doesn’t actually do a line search. … Continue reading
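For orientation, here is a minimal sketch of calling `scipy.optimize.line_search()`. The caller supplies the function, its gradient, a starting point, and a search direction, and the call returns a step size along that direction. The quadratic objective, the starting point, and the steepest-descent direction are assumptions chosen for illustration.

```python
import numpy as np
from scipy.optimize import line_search

def f(x):
    """Illustrative objective: a simple bowl with its minimum at the origin."""
    return x[0] ** 2 + x[1] ** 2

def grad_f(x):
    """Gradient of f."""
    return np.array([2.0 * x[0], 2.0 * x[1]])

x0 = np.array([1.8, 1.7])      # assumed starting point
direction = -grad_f(x0)        # steepest-descent direction, chosen by the caller

# The first element of the returned tuple is the step size alpha
# (None if the search does not converge).
result = line_search(f, grad_f, x0, direction)
alpha = result[0]

x_new = x0 + alpha * direction
print("alpha =", alpha)
print("f(x0) =", f(x0), " f(x_new) =", f(x_new))
```

Note that the search direction is an input, not an output: the function only decides how far to move along the direction it is given.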

Why I’m Not a Fan of the Julia Programming Language

The Julia programming language is a relatively new (still in development) general purpose language intended mostly for use with numerical and scientific programming. I tried out Julia v1.0 about two years ago and wasn’t impressed — at that time Julia … Continue reading

Reducing the Size of a Neural Network using Single-Shot Network Pruning at Initialization

Neural networks can be huge. A neural network with millions or billions of weights and biases (“trainable parameters”) can take weeks to train, which costs a lot of money and emits a lot of CO2 through energy consumption. … Continue reading

Seven Deep Learning Techniques for Unsupervised Anomaly Detection

The goal of anomaly detection is to examine a set of data to find unusual data items. Three of the main approaches are 1.) rule-based techniques, 2.) classification techniques that use labeled training data, and 3.) unsupervised techniques. Suppose some source … Continue reading

The Hellinger Distance Between Two Probability Distributions Using Python

A fairly common sub-problem when working with machine learning algorithms is to compute the distance between two probability distributions. For example, suppose distribution P = (0.36, 0.48, 0.16) and Q = (0.33, 0.33, 0.33). What is the difference between P … Continue reading
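As a minimal sketch, the Hellinger distance for discrete distributions can be computed with the standard formula H(P, Q) = sqrt(sum((sqrt(p_i) - sqrt(q_i))^2)) / sqrt(2), applied here to the P and Q values from the example above. The function name `hellinger` is an assumption made for illustration.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Sum of squared differences between the square roots of the probabilities.
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total) / math.sqrt(2.0)

P = (0.36, 0.48, 0.16)
Q = (0.33, 0.33, 0.33)

print(f"H(P, Q) = {hellinger(P, Q):.4f}")
print(f"H(Q, P) = {hellinger(Q, P):.4f}")   # symmetric, so the same value
print(f"H(P, P) = {hellinger(P, P):.4f}")   # identical distributions give 0
```

For true probability distributions the result lies between 0 (identical) and 1 (no overlap), and unlike some other measures it is symmetric in its arguments.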

The Worst Logistic Regression Graph Diagram on the Internet

Argh! I have to post on this topic. Strewn throughout the Internet is a graph that is supposed to explain what logistic regression is and how it works. I’ve seen this graph, and variations of it, for years and it … Continue reading
