## SQL Graph Shortest Path, Part I

An interesting problem I looked at lately is the problem of determining the shortest path between two nodes (aka vertexes) of a graph whose representation is stored in SQL tables. This turns out to be a moderately difficult problem. Suppose I have a conceptual graph as shown in the image below. There are 6 nodes. Node 0 has a directed connection to node 1, with weight 10. The connection from node 1 to node 0 has weight 18, and so on. Note that there is a connection from node 1 to node 4 (weight 12) but that there is no connection from node 4 to node 1. The first issue is to decide on a way to represent the graph as one or more SQL tables. This in itself is not a trivial problem, but at any rate I used the following SQL statements:

use master
go

if exists(select name from sys.sysdatabases where name=’dbDemoGraph’)
drop database dbDemoGraph
go

create database dbDemoGraph
go

use dbDemoGraph
go

create table tblNodes
(
nodeid int primary key,
userid bigint not null unique
)
go

create index idxTblNodesByColUserid on tblNodes(userid)
go

insert into tblNodes values(0,000000000) — node id, userid
insert into tblNodes values(1,111111111)
insert into tblNodes values(2,222222222)
insert into tblNodes values(3,333333333)
insert into tblNodes values(4,444444444)
insert into tblNodes values(5,555555555)
insert into tblNodes values(6,666666666)
select * from tblNodes order by nodeid
go

create table tblEdges
(
fromNode int not null, — a nodeid
toNode int not null, — a nodeid
edgeWeight int not null — some arbitrary measure
)
go

insert into tblEdges values(0,1,10) — from node 0 to node 1, weight 10
insert into tblEdges values(1,0,18) — from node 1 to node 0, weight 18
insert into tblEdges values(1,2,30) — from 1 to 2, weight 11
insert into tblEdges values(2,1,24) — etc.
insert into tblEdges values(1,3,20)
insert into tblEdges values(3,1,20)
insert into tblEdges values(1,4,12)
— cannot go from 4 to 1
— cannot go from 2 to 4
insert into tblEdges values(4,2,14)
insert into tblEdges values(2,5,16)
insert into tblEdges values(5,2,26)
go

I use two tables. The first table, tblNodes, has a 0-based node ID like the ones shown in the image. There is also a "User ID" value, which is a way to map the general graph to some specific data. In this situation I am imagining that each node represents a user of some sort. The second table, tblEdges, stores the relationships between nodes, including edge weights which could represent something like the number of times a user sends a text message to another user.
Next time I’ll describe a SQL stored procedure which accepts a start node ID, and an end node ID, and which determines the shortest path (based on edge weights) between the two.