I was sitting at home one weekend, waiting for the rain to stop so I could walk my dogs. I have spoken at the PyData conference before, but I’m not speaking this year. I was scanning through the conference agenda and noticed that one of the talks is about the FLAML library (“Fast and Lightweight Auto ML”) for automatic ML. See https://microsoft.github.io/FLAML/. I figured I’d investigate.
There are many different AutoML libraries. I think of AutoML as an extension of automated hyperparameter search. In hyperparameter search, you choose a machine learning model, such as kernel ridge regression, that suits the prediction problem at hand. Then a hyperparameter search program automatically tries different settings to find the best values for the model’s hyperparameters.
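To make the idea of hyperparameter search concrete, here is a minimal sketch that grid-searches two hyperparameters of a kernel ridge regression model. Everything in it is an assumption made for illustration: the synthetic one-dimensional data, the RBF kernel, the tiny grid of alpha and gamma values, and the helper names (rbf, solve, krr_fit, krr_predict). It uses only the Python standard library and is not how FLAML or any particular library does its search.

```python
import math
import random

def rbf(a, b, gamma):
    """RBF kernel between two scalar inputs."""
    return math.exp(-gamma * (a - b) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krr_fit(X, y, alpha, gamma):
    """Kernel ridge regression: solve (K + alpha*I) w = y for the weights w."""
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def krr_predict(X_train, w, gamma, x):
    """Predict for one input as a weighted sum of kernel values."""
    return sum(wi * rbf(xi, x, gamma) for wi, xi in zip(w, X_train))

def mse(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)

# synthetic data (hypothetical): noisy sine curve, 40 train / 20 validation
random.seed(0)
xs = [random.uniform(0.0, 1.0) for _ in range(60)]
ys = [math.sin(6.0 * x) + random.gauss(0.0, 0.1) for x in xs]
X_tr, y_tr = xs[:40], ys[:40]
X_va, y_va = xs[40:], ys[40:]

# the model is fixed; only its hyperparameters are searched
results = {}
for alpha in (0.001, 0.1, 10.0):
    for gamma in (0.1, 1.0, 10.0):
        w = krr_fit(X_tr, y_tr, alpha, gamma)
        preds = [krr_predict(X_tr, w, gamma, x) for x in X_va]
        results[(alpha, gamma)] = mse(preds, y_va)

best = min(results, key=results.get)
print("best (alpha, gamma):", best)
print("validation MSE:", results[best])
```

The key point is that the model type never changes inside the loop. Only the hyperparameter values do.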
In AutoML, you don’t specify the model. You let the AutoML system try different models and different sets of relevant hyperparameters.
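A toy sketch of that idea is below. Instead of searching the hyperparameters of one model, the loop tries several model families, each with its own set of candidate hyperparameters, and keeps whichever combination has the lowest validation error. The three model families, the search_space dictionary, and the synthetic data are all hypothetical choices made for illustration. Real AutoML systems such as FLAML search much more cleverly (for example, under a time budget) than this exhaustive loop.

```python
import random

def fit_mean(X, y):
    """Baseline: always predict the training mean (no hyperparameters)."""
    m = sum(y) / len(y)
    return lambda x: m

def fit_linear(X, y):
    """One-dimensional least-squares line (no hyperparameters)."""
    n = len(X)
    mx, my = sum(X) / n, sum(y) / n
    sxx = sum((x - mx) ** 2 for x in X)
    sxy = sum((x - mx) * (t - my) for x, t in zip(X, y))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(X, y, k):
    """k-nearest-neighbors regression; k is the hyperparameter."""
    pairs = list(zip(X, y))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(t for _, t in nearest) / k
    return predict

def mse(model, X, y):
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)

# synthetic data (hypothetical): noisy quadratic, 40 train / 20 validation
random.seed(1)
xs = [random.uniform(-1.0, 1.0) for _ in range(60)]
ys = [x * x + random.gauss(0.0, 0.05) for x in xs]
X_tr, y_tr, X_va, y_va = xs[:40], ys[:40], xs[40:], ys[40:]

# each model family comes with its own hyperparameter candidates
search_space = {
    "mean": (fit_mean, [{}]),
    "linear": (fit_linear, [{}]),
    "knn": (fit_knn, [{"k": k} for k in (1, 3, 5, 9)]),
}

trials = []  # (model name, hyperparameters, validation MSE)
for name, (fit, candidates) in search_space.items():
    for params in candidates:
        model = fit(X_tr, y_tr, **params)
        trials.append((name, params, mse(model, X_va, y_va)))

best = min(trials, key=lambda t: t[2])
print("best model:", best[0], best[1])
print("validation MSE:", best[2])
```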
Installing FLAML was easy: just “pip install flaml”. That was nice. Trust me, packaging a library so that it installs cleanly with pip is no small task.
The FLAML documentation was quite good. The documentation has code examples instead of blah, blah, blah about architecture.
I used the FLAML documentation regression example as a template to tackle one of my standard regression problems. The goal is to predict a person’s income (divided by $100,000) from sex (male = 0, female = 1), age (divided by 100), State (Michigan = 100, Nebraska = 010, Oklahoma = 001), and political leaning (conservative = 100, moderate = 010, liberal = 001).
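The encoding described above can be sketched as a small helper. The function names (one_hot, encode_person, income_to_dollars) are hypothetical and are not part of the demo program. The column order (sex, age, three State columns, three politics columns) is an assumption that matches the eight predictor columns the demo passes to the model.

```python
STATES = ("michigan", "nebraska", "oklahoma")
POLITICS = ("conservative", "moderate", "liberal")

def one_hot(value, categories):
    """Return a one-hot list, e.g. 'nebraska' -> [0.0, 1.0, 0.0]."""
    if value not in categories:
        raise ValueError("unknown category: " + str(value))
    return [1.0 if value == c else 0.0 for c in categories]

def encode_person(sex, age, state, politics):
    """Encode raw values into the 8-value predictor vector."""
    sex_code = {"male": 0.0, "female": 1.0}[sex]  # male = 0, female = 1
    return ([sex_code, age / 100.0]               # age divided by 100
            + one_hot(state, STATES)
            + one_hot(politics, POLITICS))

def income_to_dollars(scaled_income):
    """Undo the divide-by-100,000 scaling applied to income."""
    return scaled_income * 100_000

x = encode_person("male", 24, "michigan", "liberal")
print(x)  # [0.0, 0.24, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0]
```

A predicted value of 0.5 from the model would then correspond to income_to_dollars(0.5), which is $50,000.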
My experiment demo worked the first time. Nice.
However, even though I was impressed with FLAML, I am not a fan of AutoML systems in general. In my opinion, AutoML systems operate at too high a level of abstraction. Machine learning is very tricky, and leaving it to black-box systems just doesn’t feel right to me, especially for the type of ML work I do.
Robots and androids are the epitome of human automation. Left: In “Metropolis” (1927), a robot named Maria is created by an evil scientist. A pretty good movie if you’re a fan of science fiction movie history. I give the movie a B grade. Center: In “Futureworld” (1976), there are many different kinds of robots including those for companionship. I liked the part where it is revealed that robots are designing new robots that are designing new robots. I give the movie a C grade. Right: In “Surrogates” (2009), almost all people sit in pods and live their lives through robots. I give the movie a B grade.
Demo code below. You can find the data at https://jamesmccaffrey.wordpress.com/2022/10/10/regression-people-income-using-pytorch-1-12-on-windows-10-11/.
# people_income_flaml.py
#
# Anaconda3-2022.10  Python 3.9.13
# Windows 10/11
# FLAML 1.2.0
# predict income from sex, age, State, politics

import numpy as np
from flaml import AutoML

# 0. prepare
print("\nPredict income using FLAML AutoML ")
np.random.seed(1)

# 1. load data
# columns: [0] sex, [1] age, [2-4] State, [5] income, [6-8] politics
print("\nLoading data into memory ")
train_file = ".\\Data\\people_train.txt"
train_xy = np.loadtxt(train_file, delimiter="\t",
  usecols=[0,1,2,3,4,5,6,7,8], comments="#",
  dtype=np.float32)
train_X = train_xy[:,[0,1,2,3,4,6,7,8]]  # predictors
train_y = train_xy[:,5].flatten()  # income; 1D required

# 2. Initialize an AutoML instance
print("\nCreating a FLAML object ")
automl = AutoML()
automl_settings = {
  "time_budget": 1,  # in seconds
  "metric": 'r2',
  "task": 'regression',
  "log_file_name": "people_income.log"
}

# 3. find and train model
print("\nFinding best model ")
automl.fit(X_train=train_X, y_train=train_y,
  **automl_settings)
print("Done ")

# 4. show best model found
best_model = automl.model.estimator
print("\nBest model: ")
print(best_model)

# 5. use model
print("\nPredict income for Male, 24, Michigan, liberal ")
# sex: male = 0, female = 1, so a male is encoded as 0
X = np.array([[0, 0.24, 1,0,0, 0,0,1]],
  dtype=np.float32)
pred_inc = automl.predict(X)
print(pred_inc)  # income divided by 100,000

print("\nEnd FLAML demo ")