“Researchers Suggest a Physics of AI Approach to Study Deep Learning Systems Like ChatGPT and Copilot” on the Pure AI Web Site

I contributed to an article titled “Researchers Suggest a ‘Physics of AI’ Approach to Study Deep Learning Systems Like ChatGPT and Copilot” in the March 2023 edition of the Pure AI web site. See https://pureai.com/articles/2023/03/01/physics-of-ai.aspx.

AI assistant applications such as the ChatGPT general information chatbot and the GitHub Copilot computer programming aide have entered mainstream usage. The reality is that exactly how these fantastically complex AI systems work is not fully understood. This is not good.

Researchers at Microsoft have suggested that the study of deep learning models such as GPT-3 and its associated applications should use an approach they call “the physics of AI.” The two key principles of the physics of AI paradigm are:

1.) Explore deep learning phenomena through relatively simple controlled experiments.
2.) Build theories based on simple math models that aren’t necessarily fully rigorous.

So, the physics of AI means using the experimental techniques that physicists have used since the 18th century, rather than applying the core concepts of physics, such as gravity and electromagnetism.

These graphs from the article help explain what pre-training and fine-tuning are.

The article describes a small, manageable, synthetic reasoning task called Learning Equality and Group Operations (LEGO). Suppose there are six people in a room and you tell a friend the following six facts:

Evan is the same sex as Bret.
Drew is the opposite sex of Finn.
Alex is the same sex as Drew.
Cris is male.
Bret is the opposite sex of Cris.
Finn is the same sex as Evan.

The goal is to figure out the sex/gender (male = -1 or female = +1) of each of the six people in the room. The problem is relatively easy for a human to solve but surprisingly difficult for a computer.

Expressed as a sequence of mathematical symbols, the input sentence is:

e = +b; d = -f; a = +d; c = -1; b = -c; f = +e.
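To make the task concrete, here is a minimal Python sketch of how a conventional program could resolve such a sentence by repeatedly sweeping the clauses. It is only an illustration, not the method used in the research, which trains a Transformer on sentences like this. The function name solve_lego and the parsing rules are my own assumptions, and the demo uses the male = -1 encoding stated above, so the fact "Cris is male" is written as c = -1.

```python
import re

def solve_lego(sentence):
    """Resolve sign-equality clauses such as 'e = +b; c = -1' to +1/-1 values.

    Hypothetical helper for illustration; it does not check for contradictions.
    """
    values = {}    # variable name -> resolved value (+1 or -1)
    pending = []   # (target, sign, source) clauses not yet resolved
    for clause in sentence.strip().rstrip(".").split(";"):
        clause = clause.strip()
        if not clause:
            continue
        match = re.fullmatch(r"([a-z])\s*=\s*([+-]?)\s*([a-z]|1)", clause)
        if match is None:
            raise ValueError(f"cannot parse clause: {clause!r}")
        target, sign_text, source = match.groups()
        sign = -1 if sign_text == "-" else 1
        if source == "1":
            values[target] = sign          # a grounded fact, e.g. c = -1
        else:
            pending.append((target, sign, source))

    # Sweep the unresolved clauses until a full pass makes no progress.
    progress = True
    while pending and progress:
        progress = False
        remaining = []
        for target, sign, source in pending:
            if source in values:
                values[target] = sign * values[source]
                progress = True
            elif target in values:
                # A sign of +1 or -1 is its own inverse, so the clause
                # can also be applied from right to left.
                values[source] = sign * values[target]
                progress = True
            else:
                remaining.append((target, sign, source))
        pending = remaining
    if pending:
        raise ValueError("some variables could not be resolved")
    return values

names = {"a": "Alex", "b": "Bret", "c": "Cris",
         "d": "Drew", "e": "Evan", "f": "Finn"}
result = solve_lego("e = +b; d = -f; a = +d; c = -1; b = -c; f = +e.")
for var in sorted(result):
    print(names[var], "male" if result[var] == -1 else "female")
```

With this encoding the sketch reports Alex, Cris and Drew as male and Bret, Evan and Finn as female. Notice that the clauses arrive out of order, so the program needs three sweeps to resolve the whole chain. That chained dependency is the kind of step-by-step reasoning the synthetic task is designed to probe.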

I am quoted in the article:

McCaffrey agrees with the physics of AI principle for exploring deep neural systems. “Transformer architecture AI systems are by far the most complex systems I’ve ever worked with,” he said. “Understanding these systems cannot be accomplished by trying to analyze intractable systems like the GPT-3 based ChatGPT application.”

McCaffrey added, “It’s not clear to me, or to any of the colleagues I’ve spoken with, how quickly the field of understanding deep neural models will advance. But we all agree that it’s crucial to understand how these increasingly common AI systems work.”

The “physics of AI” research paradigm refers to techniques from experimental physics rather than physics phenomena like gravity and electromagnetism. Here are three photos of family beach trips that don’t take child gravity into account.

