## SQL Graph Shortest Path, Part I

An interesting problem I looked at lately is the problem of determining the shortest path between two nodes (aka vertexes) of a graph whose representation is stored in SQL tables. This turns out to be a moderately difficult problem. Suppose I have a conceptual graph as shown in the image below. There are 6 nodes. Node 0 has a directed connection to node 1, with weight 10. The connection from node 1 to node 0 has weight 18, and so on. Note that there is a connection from node 1 to node 4 (weight 12) but that there is no connection from node 4 to node 1. The first issue is to decide on a way to represent the graph as one or more SQL tables. This in itself is not a trivial problem, but at any rate I used the following SQL statements:

use master
go

if exists(select name from sys.sysdatabases where name=’dbDemoGraph’)
drop database dbDemoGraph
go

create database dbDemoGraph
go

use dbDemoGraph
go

create table tblNodes
(
nodeid int primary key,
userid bigint not null unique
)
go

create index idxTblNodesByColUserid on tblNodes(userid)
go

insert into tblNodes values(0,000000000) — node id, userid
insert into tblNodes values(1,111111111)
insert into tblNodes values(2,222222222)
insert into tblNodes values(3,333333333)
insert into tblNodes values(4,444444444)
insert into tblNodes values(5,555555555)
insert into tblNodes values(6,666666666)
select * from tblNodes order by nodeid
go

create table tblEdges
(
fromNode int not null, — a nodeid
toNode int not null, — a nodeid
edgeWeight int not null — some arbitrary measure
)
go

insert into tblEdges values(0,1,10) — from node 0 to node 1, weight 10
insert into tblEdges values(1,0,18) — from node 1 to node 0, weight 18
insert into tblEdges values(1,2,30) — from 1 to 2, weight 11
insert into tblEdges values(2,1,24) — etc.
insert into tblEdges values(1,3,20)
insert into tblEdges values(3,1,20)
insert into tblEdges values(1,4,12)
— cannot go from 4 to 1
— cannot go from 2 to 4
insert into tblEdges values(4,2,14)
insert into tblEdges values(2,5,16)
insert into tblEdges values(5,2,26)
go

I use two tables. The first table, tblNodes, has a 0-based node ID like the ones shown in the image. There is also a "User ID" value, which is a way to map the general graph to some specific data. In this situation I am imagining that each node represents a user of some sort. The second table, tblEdges, stores the relationships between nodes, including edge weights which could represent something like the number of times a user sends a text message to another user.
Next time I’ll describe a SQL stored procedure which accepts a start node ID, and an end node ID, and which determines the shortest path (based on edge weights) between the two.

Advertisements
This entry was posted in Software Test Automation. Bookmark the permalink.