Take a look at the screenshot below. I was working on a problem to analyze the shortest path between nodes in a huge graph and ended up looking at some really fascinating code. One of the most famous algorithms in computer science is Dijksta’s Shortest Path algorithm. The Wikipedia entry on the topic is pretty good for the most part but has a couple of huge omissions of explanation. The algorithm has these steps
. . . 7 Q := the set of all nodes in Graph ; // add all nodes 8 while Q is not empty: // main loop 9 u := vertex in Q with smallest distance in dist ; 10 if dist[u] = infinity: 11 break ; 12 end if ; 13 remove u from Q ; 14 for each neighbor v of u: . . . 19 decrease-key v in Q ; // Reorder v in the Queue . . .
The algorithm implies that Q is a queue. But notice that on line 9, you don’t necessarily remove the front item, so you must scan the entire Q to find an item with smallest distance value. So in this algorithm Q is actually a list of some sort. This would be horrible if the graph is huge and you’re doing multiple calls to shortest path. You can avoid the linear search if you make Q a priority queue ordered by the distance value (so the distance value is the priority). With a priority queue the smallest distance will be always be at the front. OK, but then on step 19 a value in the priority queue must be changed and so the priority queue would have to be reordered. This would be another linear search to find the item to change so you’re back where you started with regards to performance. You can get around this gotcha by making Q a random access priority queue, that is, a priority queue (where the front item always is the one with the smallest distance) but one where you can instantly look up the location within the queue of an item so you can change the priority (and then you’d have to reorder the queue but this is O(lg n) as long as you are using a binary heap for the underlying storage). To create a random access priority queue you can augment the priority queue data store (a binary heap) with a dictionary collection that holds the indexes in the heap of location of all items. Notice that this in turn implies that the items in the random access priority queue all have a unique ID of some sort. In the case of Dijkstra’s shortest path algorithm this unique ID would be a graph node ID. Yikes! Anyway, I coded up a random access priority queue for Dijkstra’s shortest path and it was extremely challenging and fun. Testing this beast was a fascinating problem in its own right. As if this isn’t enough, I hard-coded the random access priority queue assuming details about the graph (like the node ID is a long named nodeID and the distance is a double named distance). To make the random access priority queue generic so that it can operate on graphs defined differently opens up a whole new can of worms that lead to multiple interfaces! All in all, a really interesting problem. I’ll post the code sometime if I get the chance.
Here’s the non-generic version of the code (minus some WriteLine statements and details) that generated the screenshot below:
Console.WriteLine("Creating random access priority queue"); PriorityQueue pq = new PriorityQueue(); Console.WriteLine("\nAdding nodeID = 222 with distance = 2.0"); pq.Enqueue(new NodeAndDistance(222, 2.0)); pq.Enqueue(new NodeAndDistance(111, 1.0)); pq.Enqueue(new NodeAndDistance(444, 4.0)); pq.Enqueue(new NodeAndDistance(333, 3.0)); Console.WriteLine("Queue is:"); Console.WriteLine(pq.ToString()); Console.WriteLine("Dequeueing front item"); pq.Dequeue(); Console.WriteLine(pq.ToString()); Console.WriteLine("Changing distance of 333 to 9.0"); pq.SetDistance(333, 9.0); Console.WriteLine(pq.ToString());