Computing the Distance Between Two Zip Codes

Computing the distance between two zip codes is easy. And it’s difficult.

To compute the distance between two zip codes, you can find the latitude and longitude of each zip code, then compute the distance between the two lat-lon points.

The main difficulty is that a zip code can cover a very large geographical area and so the lat-lon location of a zip code is very fuzzy.
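To put a rough number on that fuzziness, here is a minimal sketch based on the triangle inequality. It assumes each zip code can be summarized by a hypothetical "radius", the farthest any address lies from the lat-lon point stored for that zip code. The function name and the radius values are made up for illustration and are not columns in the data file.

```python
def distance_bounds(centroid_km, radius1_km, radius2_km):
    # Great-circle distance obeys the triangle inequality, so the true
    # address-to-address distance can differ from the centroid-to-centroid
    # distance by at most the sum of the two (hypothetical) radii.
    slack = radius1_km + radius2_km
    low = max(0.0, centroid_km - slack)
    high = centroid_km + slack
    return (low, high)

# Two zip codes whose stored points are 10 km apart, with assumed
# radii of 3 km and 4 km.
print(distance_bounds(10.0, 3.0, 4.0))
```

When the two radii together exceed the centroid distance, the lower bound drops to zero, which is the situation where a zip-code-based distance tells you very little.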

I coded up a short demo using Python. First, I went to the U.S. Postal Service web site, looking for a database of zip codes. As is often the case with government web sites, the site was absolutely horrible. And no database was available.

Because of this, there are several companies that sell USPS data. I found a nice commercial site that offered a free version to download. The data was an Excel spreadsheet, which I saved as a tab-delimited text file. There were 15 columns, but I only needed the zip code in column [0] and the latitude and longitude in columns [12] and [13].

To compute the distance between a pair of lat-lon values, I used the haversine formula, which gives the great-circle distance between two points on a sphere and so takes the curvature of the Earth into account.
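For reference, the haversine formula can be written as below. The symbols are a notational assumption: \varphi is latitude and \lambda is longitude, both in radians, \Delta\varphi and \Delta\lambda are the differences between the two points, and r is the radius of the sphere. The intermediate quantity a and the radius r correspond to the variables of the same names in the demo code.

```latex
a = \sin^2\!\left(\frac{\Delta\varphi}{2}\right)
    + \cos\varphi_1 \, \cos\varphi_2 \, \sin^2\!\left(\frac{\Delta\lambda}{2}\right),
\qquad
d = 2 r \arcsin\!\left(\sqrt{a}\right)
```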

Good fun!

When I was a young man, I worked as an Assistant Cruise Director on ships of the Royal Viking Line. I traveled thousands of miles (as measured by haversine or any other formula).

Left: There were two assistant directors on each voyage. One of our jobs was to manage the entertainment. Here I am on one cruise with the other assistant, Peter, and two of the entertainers. They were very talented. Center: Royal Viking had cruises in all parts of the world and so I got to see over 40 countries. When the ship would dock at a port, the cruise staff were expected to go ashore to keep an eye on the passengers. Egypt was a highlight. Right: On every trip, one night was the formal Captain’s Dinner. I’m here with the captain and his wife on one such evening. From my tan, I’m pretty sure the photo is from a Mediterranean cruise. From the amount of hair on my head, I’m pretty sure this is a very old photo.

# data is free version of product from:

import numpy as np

# ---------------------------------------------------------

def haversine(lat1, lon1, lat2, lon2):
  # from
  lat1 = np.radians(lat1); lon1 = np.radians(lon1)
  lat2 = np.radians(lat2); lon2 = np.radians(lon2)

  dlon = lon2 - lon1 
  dlat = lat2 - lat1 
  a = np.sin(dlat/2.0)**2 + np.cos(lat1) * \
    np.cos(lat2) * np.sin(dlon/2.0)**2
  c = 2.0 * np.arcsin(np.sqrt(a)) 
  r = 6371.0         # approx. radius of Earth in km
  return c * r

# ---------------------------------------------------------

print("\nApproximate distance between two zip codes \n")

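table = dict()
# the with-statement closes the file even if a line fails to parse
with open(".\\zip_code_database.txt", "r") as fin:
  fin.readline()    # consume header
  for line in fin:  # load lookup table
    tokens = line.split('\t')
    zc = tokens[0]                # zip code, kept as a string
    lat = np.float32(tokens[12])  # latitude in degrees
    lon = np.float32(tokens[13])  # longitude in degrees
    table[zc] = (lat,lon)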

zc1 = "98029"
zc2 = "98052"
(lat1,lon1) = table[zc1]
(lat2,lon2) = table[zc2]
dist = haversine(lat1,lon1, lat2,lon2)

print("Distance between " + zc1 + " and " + zc2 + " is: ")
print("%0.2f km" % dist)

print("\nEnd demo ")
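A quick way to gain some confidence in a haversine function is to check it against distances that can be worked out by hand. The sketch below is a standalone version that uses only the math module, with a hypothetical name haversine_km, and assumes the same spherical Earth with a radius of 6371 km. On such a sphere, one degree of latitude along a meridian is r * pi / 180 and the distance from the equator to a pole is r * pi / 2.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, r=6371.0):
    # Same computation as the demo, written with the math module.
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + \
        math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2.0 * r * math.asin(math.sqrt(a))

same_point = haversine_km(47.5, -122.0, 47.5, -122.0)
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
equator_to_pole = haversine_km(0.0, 0.0, 90.0, 0.0)

print("%0.2f km" % same_point)
print("%0.2f km" % one_degree)
print("%0.2f km" % equator_to_pole)
```

The three printed values should agree with 0, r * pi / 180 and r * pi / 2 respectively, up to floating point rounding.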

