Category Archives: Machine Learning

Finding Reliable Negatives For Positive and Unlabeled Learning (PUL) Datasets

Suppose you have a machine learning dataset for training, where only a few data items have a positive label (class = 1), but all the other data items are unlabeled and could be either negative (class = 0) or positive. … Continue reading

Posted in Machine Learning, PyTorch | Leave a comment

Computing the Distance Between Two Datasets Using Autoencoded Wasserstein Distance

A fairly common sub-problem in machine learning and data science scenarios is computing the distance (or difference or similarity) between two datasets. At first thought, this sounds easy, but in fact the problem is extremely difficult. If you try to compare … Continue reading

Posted in Machine Learning | 1 Comment

Researchers Explore Bayesian Neural Networks on Pure AI

I contributed to an article titled “Researchers Explore Bayesian Neural Networks” on the Pure AI web site. See https://pureai.com/articles/2021/09/07/bayesian-neural-networks.aspx. The agenda of the recently completed 2021 International Conference on Machine Learning (ICML) listed over 30 presentations related to the topic … Continue reading

Posted in Machine Learning | 1 Comment

A Quick Demo of the DBSCAN Clustering Algorithm

I was reading a research paper this morning and the paper used the DBSCAN (“density-based spatial clustering of applications with noise”) clustering algorithm. DBSCAN is somewhat similar to k-means clustering. Both work only with strictly numeric data. In k-means you … Continue reading

Posted in Machine Learning | Leave a comment

Differential Evolution Optimization in Visual Studio Magazine

I wrote an article titled “Differential Evolution Optimization” in the September 2021 edition of the Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2021/09/07/differential-evolution-optimization.aspx. The most common type of optimization for neural network training is some form of stochastic gradient descent (SGD). SGD … Continue reading

Posted in Machine Learning | Leave a comment

Example of Computing Kullback-Leibler Divergence for Continuous Distributions

In this post, I present an example of estimating the Kullback-Leibler (KL) divergence between two continuous distributions using the Monte Carlo technique. Whoa! Just stating the problem has a massive amount of information. The KL divergence is the key part … Continue reading

Posted in Machine Learning, PyTorch | Leave a comment
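
As a rough illustration of the Monte Carlo idea mentioned in the excerpt above, here is a minimal sketch that estimates KL(P || Q) for two univariate Gaussians by sampling from P and averaging log p(x) - log q(x). The choice of Gaussians, the parameter values, and all function names are assumptions made for this example and are not taken from the post. Gaussians are used only because a closed-form KL exists to check the estimate against.

```python
import math
import random


def normal_log_pdf(x, mean, std):
    # Log density of a univariate normal distribution at x.
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def kl_monte_carlo(p_mean, p_std, q_mean, q_std, n_samples=100_000, seed=0):
    # Estimate KL(P || Q) = E_{x ~ P}[log p(x) - log q(x)] by sampling from P.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(p_mean, p_std)
        total += normal_log_pdf(x, p_mean, p_std) - normal_log_pdf(x, q_mean, q_std)
    return total / n_samples


def kl_closed_form(p_mean, p_std, q_mean, q_std):
    # Exact KL(P || Q) for two univariate normals, used only as a sanity check.
    return (math.log(q_std / p_std)
            + (p_std ** 2 + (p_mean - q_mean) ** 2) / (2.0 * q_std ** 2)
            - 0.5)


# Hypothetical example distributions: P = N(0, 1), Q = N(1, 2).
estimate = kl_monte_carlo(0.0, 1.0, 1.0, 2.0)
exact = kl_closed_form(0.0, 1.0, 1.0, 2.0)
print(f"Monte Carlo estimate: {estimate:.4f}")
print(f"Closed form:          {exact:.4f}")
```

The estimate should approach the closed-form value as the number of samples grows. For distributions with no closed form, only the sampling part applies.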

The Wasserstein Distance Using C#

The Wasserstein distance has many different variations. In its simplest form the Wasserstein distance function measures the distance between two discrete probability distributions. For example, if: double[] P = new double[] { 0.6, 0.1, 0.1, 0.1, 0.1 }; double[] Q1 … Continue reading

Posted in Machine Learning | 1 Comment
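
For the simplest case described in the excerpt above (two discrete distributions over the same ordered bins), one way to sketch the computation is shown here in Python rather than C#. The sketch assumes unit spacing between adjacent bins and uses the cumulative-sum shortcut for one-dimensional distributions, which may not be the algorithm used in the post. The array P comes from the excerpt. The array q_example is a hypothetical stand-in, because the post's Q1 values are cut off.

```python
from itertools import accumulate


def wasserstein_1d(p, q):
    # 1-D Wasserstein distance between two discrete distributions defined
    # on the same ordered bins, assuming a distance of 1 between neighbors.
    # For this case the distance equals the sum of absolute differences
    # between the two cumulative distributions.
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


P = [0.6, 0.1, 0.1, 0.1, 0.1]          # from the excerpt
q_example = [0.1, 0.1, 0.1, 0.1, 0.6]  # hypothetical, not the post's Q1

print(wasserstein_1d(P, q_example))
```

Intuitively, the result is the amount of probability mass that has to be moved multiplied by how many bins it has to travel.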

Another Set of Beautiful Machine Learning Visualizations from Thorsten Kleppe

Thorsten Kleppe is a fellow machine learning enthusiast who creates beautiful ML visualizations. Thorsten sent me some of his latest work. Thorsten’s new visualizations are based on a logistic regression model applied to the MNIST dataset. The MNIST dataset contains … Continue reading

Posted in Machine Learning | 1 Comment

Wasserstein Distance Using C# and Python in Visual Studio Magazine

I wrote an article titled “Wasserstein Distance Using C# and Python” in the August 2021 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2021/08/16/wasserstein-distance.aspx. There are many different ways to measure the distance between two probability distributions. Some of the most … Continue reading

Posted in Machine Learning | Leave a comment

Comparing Wasserstein Distance with Kullback-Leibler Distance

There are many ways to calculate the distance between two probability distributions. Four of the most common are Kullback-Leibler (KL), Jensen-Shannon (JS), Hellinger (H), and Wasserstein (W). When I was in school, I learned that W was superior to KL, … Continue reading

Posted in Machine Learning | Leave a comment
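
To make the comparison in the excerpt above a little more concrete, here is a minimal sketch of three of the four measures it names (KL, JS and Hellinger) for discrete distributions. The function names and the example arrays are assumptions made for illustration and do not come from the post. The sketch uses natural logarithms and assumes q is nonzero wherever p is nonzero when computing KL.

```python
import math


def kl_divergence(p, q):
    # KL(P || Q) = sum of p_i * ln(p_i / q_i); terms with p_i == 0 contribute 0.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    # Jensen-Shannon divergence: average KL of p and q to their midpoint m.
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def hellinger(p, q):
    # Hellinger distance, bounded between 0 and 1.
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(0.5 * total)


# Hypothetical example distributions.
p = [0.6, 0.1, 0.1, 0.1, 0.1]
q = [0.2, 0.2, 0.2, 0.2, 0.2]

print("KL(p || q):", kl_divergence(p, q))
print("KL(q || p):", kl_divergence(q, p))  # differs: KL is not symmetric
print("JS(p, q):  ", js_divergence(p, q))
print("H(p, q):   ", hellinger(p, q))
```

One practical difference shows up right away: KL gives different values depending on argument order, while JS and Hellinger are symmetric.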