Shortest Path Graph Analysis using a CLR Stored Procedure

I wrote an article “Shortest-Path Graph Analysis using a CLR Stored Procedure” that appears in the May 2013 issue of Microsoft’s MSDN Magazine. See There are many important uses of shortest path analyses. Imagine you have a huge amount of social network data. This data can be stored in a graph where the nodes are people and the connections between nodes represent associations like “friend-ness” or “has exchanged e-mail”. Shortest path analysis determines the shortest number of hops between any two people. This information could be used to identify people who are highly influential because many paths run through them.

Most traditional shortest path algorithms assume that the graph under analysis can be stored in memory, typically in a matrix. But large graphs (say greater than 100,000 nodes) that cannot fit into memory can be stored in a SQL database. One way to do shortest path analysis of a graph in a database is to write a stored procedure using the SQL language. But SQL is fairly slow and the code is quite unnatural because SQL is primarily set-based (e.g., the Select statement) rather than procedural (e.g., foreach loops).

In a Microsoft technology stack environment, an excellent alternative to shortest path analysis using the SQL language is to write a stored procedure using the C# language. Such stored procedures are called CLR (Common Language Runtime) procedures. CLR stored procedures are more complex than standard SQL stored procedures, but have much better performance.



This entry was posted in Software Test Automation. Bookmark the permalink.