The Maximum Clique Problem

I’ve been looking at the graph maximum clique problem recently. The goal of the maximum clique problem is to find the largest collection of nodes (also called vertices) in a graph which are all connected (adjacent) to each other. In the graph at the bottom of this post, the maximum clique is the node set {0, 1, 3, 4} which has size four. The maximum clique problem appears in many important applications such as social network communication analysis where graph nodes are people and edges are messages or relationships of some kind.

It turns out that finding the maximum clique for graphs of even moderate size is one of the most challenging problems in computer science. The problem is NP-complete which means, roughly, that every possible answer must be examined. Suppose we have a graph with six nodes. First we’d try to see if all six nodes form a clique. There is Choose(6,6) = 1 way to do this. Next we’d examine all groups of five nodes at a time; Choose(6,5) = 6 ways. And so on, checking Choose(6,4) = 15, Choose(6,3) = 20, Choose(6,2) = 15, and Choose(6,1) = 6 possible solutions for a total of 63 checks. (For the maximum clique problem we can stop when we find the largest clique so let’s assume that on average we’d have to go through about one-half of the checks).

The total number of checks increases very quickly as the size of the graph, n, increases. For n = 10 there are 1,023 total combinations. For n = 20 there are 1,048,575 combinations. But for n = 1,000 there are







combinations. Even if you could perform one trillion checks per second it would take you 3.4 x 10^281 years which is insanely longer than the estimated age of the universe (about 1.0 x 10^10 = 14 billion years). Anyway I’m working on an advanced heuristic to estimate the maximum clique size for huge graphs and I’m making good progress.

This entry was posted in Software Test Automation. Bookmark the permalink.