Category Archives: Scikit

Exploring the scikit Library SpectralClustering Affinity Matrix

I spend 30 minutes each morning before work looking at technical stuff. One morning I decided to dissect the scikit library SpectralClustering module. Spectral clustering is much more complicated than the two most common techniques, k-means clustering and DBSCAN clustering. … Continue reading

Posted in Scikit

Poisson Regression Using the scikit Library

Poisson regression is a relatively rare form of machine learning where the goal is to predict a single numeric value in situations where the data is approximately Poisson distributed. An example of Poisson data is the number of customers who … Continue reading

Posted in Scikit

“Regression Using scikit Kernel Ridge Regression” in Visual Studio Magazine

I wrote an article titled “Regression Using scikit Kernel Ridge Regression” in the July 2023 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2023/07/06/scikit-kernel-ridge-regression.aspx. A regression problem is one where the goal is to predict a single numeric value. For example, … Continue reading

Posted in Scikit

Kernel Ridge Regression and Gaussian Process Regression Are Almost The Same

I did a deep dive into the underlying mathematics of kernel ridge regression (KRR) and Gaussian process regression (GPR). I’d read that KRR and GPR are essentially equivalent but had never seen a demonstration. So I put one together. KRR … Continue reading

Posted in Machine Learning, Scikit | 2 Comments

Five Reasons Why I Rarely Use Decision Trees for Regression and One Reason Why I Do

Four reasons that I rarely use decision trees for regression are characteristics of trees in general and apply to regression and classification: 1. Decision trees are highly unstable — a small change in the training data creates a completely different … Continue reading

Posted in Scikit

“Binary Classification Using a scikit Neural Network” in Microsoft Visual Studio Magazine

I wrote an article titled “Binary Classification Using a scikit Neural Network” in the June 2023 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2023/06/15/scikit-neural-network.aspx. A binary classification problem is one where the goal is to predict the value of a … Continue reading

Posted in Scikit

Simple Unsupervised Anomaly Detection Using scikit k-Means Clustering

One Sunday morning, for no real reason I can think of, I decided to put together a demo of unsupervised anomaly detection using the scikit library k-means clustering module. The idea is simple. Take some source data and cluster it … Continue reading

Posted in Scikit

Combining a PyTorch Neural Network Regression Model with a scikit Gaussian Process Model

A few days ago I compared a scikit Gaussian process regression (GPR) model and a PyTorch neural network (NN) on regression (predict a single numeric value) problems. See https://jamesmccaffrey.wordpress.com/2023/06/27/showdown-gaussian-process-regression-vs-neural-network-regression/. I figured I’d take a look at combining … Continue reading

Posted in PyTorch, Scikit

“Gaussian Naive Bayes Classification Using the scikit Library” in Microsoft Visual Studio Magazine

I wrote an article titled “Gaussian Naive Bayes Classification Using the scikit Library” in the June 2023 edition of Microsoft Visual Studio Magazine. See https://visualstudiomagazine.com/articles/2023/05/31/gaussian-naive-bayes.aspx. Gaussian naive Bayes classification is a classical machine learning technique that can be used to … Continue reading

Posted in Scikit

Showdown: Gaussian Process Regression vs Neural Network Regression

The goal of a regression problem is to predict a single numeric value. For example, you might want to predict the price of a house based on its area in square feet, number of bedrooms, tax rate, and so … Continue reading

Posted in Machine Learning, PyTorch, Scikit