Using Dot Product as a Measure of Similarity

A topic that often pops up in machine learning is using the dot product function as a measure of similarity between two vectors. Briefly, the dot product of two vectors is the sum of the products of their corresponding elements. If v1 = (4, 3, 2) and v2 = (5, -2, 0) then dp(v1, v2) = (4 * 5) + (3 * -2) + (2 * 0) = 14. The dot product can range from -infinity to +infinity. Larger values indicate more similarity (unlike a distance function, where larger values indicate less similarity).
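The sum-of-products arithmetic above can be checked directly with NumPy:

```python
import numpy as np

v1 = np.array([4.0, 3.0, 2.0])
v2 = np.array([5.0, -2.0, 0.0])
dp = np.dot(v1, v2)  # (4 * 5) + (3 * -2) + (2 * 0)
print(dp)  # 14.0
```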

An alternative but mathematically equivalent definition: dp(v1, v2) = len(v1) * len(v2) * cos(angle), where angle is the angle between the two vectors. It’s not at all obvious that the two definitions give the same result. In machine learning, the sum-of-products definition is usually more useful.
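One way to convince yourself the two definitions agree is to try a case where the angle is known. Here v1 points 45 degrees from the x-axis and v2 points straight up, so the angle between them is 45 degrees:

```python
import math
import numpy as np

v1 = np.array([2.0, 2.0])  # 45 deg from x-axis
v2 = np.array([0.0, 3.0])  # 90 deg from x-axis, so 45 deg between them

dp_sum = np.dot(v1, v2)  # (2 * 0) + (2 * 3) = 6.0
dp_geo = np.linalg.norm(v1) * np.linalg.norm(v2) * math.cos(math.pi / 4.0)
print(dp_sum, dp_geo)  # both approximately 6.0
```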

Example:

v1 = np.array([2.0, 2.0])
v2 = np.array([3.0, 3.0])
dp = np.dot(v1, v2)  # (2 * 3) + (2 * 3)
print(dp)  # 12.0 : large (positive) value means similar

Example:

v1 = np.array([2.0, 2.0])
v2 = np.array([3.0, -3.0])
dp = np.dot(v1, v2)  # (2 * 3) + (2 * -3) 
print(dp)  # 0.0 for orthogonal vectors

Example:

v1 = np.array([2.0, 2.0])
v2 = np.array([-3.0, -3.0])
dp = np.dot(v1, v2)  # (2 * -3) + (2 * -3)
print(dp)  # -12.0 : small (negative) value means dissimilar

Details: The dot product is a specific type of “inner product” function. If the dot product of two vectors is 0, the two vectors are orthogonal (perpendicular) — sort of an intermediate similarity. The length of v = (a, b, c) is sqrt(a^2 + b^2 + c^2). If you normalize two vectors by dividing each by its length, the dot product (now called cosine similarity) will range from -1 to +1. Compared to Euclidean distance, dot product similarity is especially useful for vectors with high dimension.
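A quick sketch of the normalized version, using two arbitrary example vectors (the specific values here are just for illustration):

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([4.0, 5.0, 6.0])

u1 = v1 / np.linalg.norm(v1)  # scale each vector to unit length
u2 = v2 / np.linalg.norm(v2)
dp = np.dot(u1, u2)  # guaranteed to lie in [-1, +1]
print(dp)  # approximately 0.9746
```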

Weirdness: The dot product similarity is not always intuitive, and it definitely does not correspond to the distance between two vectors. If v1 = (2, 2) and v2 = (3, 3), notice that dp(v1, v1) = 8 and dp(v1, v2) = 12 — in other words, the dp-similarity between two identical vectors can be less than the dp-similarity between two completely different vectors.
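The weirdness is easy to reproduce:

```python
import numpy as np

v1 = np.array([2.0, 2.0])
v2 = np.array([3.0, 3.0])
print(np.dot(v1, v1))  # 8.0 : a vector compared with itself
print(np.dot(v1, v2))  # 12.0 : larger, even though v2 != v1
```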

Bottom line: When I use dot product as a measure of similarity, I try not to over-think it. Larger dp values mean more similar, smaller values mean less similar. Because dp(v1, v2) of two arbitrary vectors (meaning not necessarily having the same lengths) depends on both the lengths/magnitudes and the directions of v1 and v2, the interpretation of the dp similarity value isn’t as intuitive as a distance measure.



Three book covers by artists that have a similar style (to my eye anyway). The art is accompanied by some great text. Left: By artist Robert McGinnis (b. 1926). “Silver town – a magnet for tough men”. Nice, but would have been better if magnets attract silver. Center: By artist Paul Mann (b. 1955). “Can a monk on Park Avenue save his home from the wrecking ball?” I suspect this is not a question that has been asked very often. Right: By artist Robert Abbett (1926-2015). “When she crashed into his house, about all she wore was a guilty look.” I doubt Shakespeare could have done much better.


Demo code:

# dot_product_demo.py

import numpy as np
import torch as T
device = T.device('cpu')

v1 = np.array([2.0, 2.0])
v2 = np.array([3.0, 3.0])
dp = np.dot(v1, v2)
print(dp)  # 12.0 : large (positive) value means similar

t1 = T.tensor(v1).to(device)
t2 = T.tensor(v2).to(device)
dp = T.dot(t1, t2)
print(dp)  # tensor(12.0) : works for tensors too

v1 = np.array([2.0, 2.0])
v2 = np.array([3.0, -3.0])
dp = np.dot(v1, v2)
print(dp)  # 0.0 for orthogonal vectors

v1 = np.array([2.0, 2.0])
v2 = np.array([-3.0, -3.0])
dp = np.dot(v1, v2)  # (2 * -3) + (2 * -3)
print(dp)  # -12.0 : small (negative) value means dissimilar

# alternative definition of dot product is:
# dp(v1, v2) = len(v1) * len(v2) * cos(angle)
# len(v1) = sqrt(2^2 + 2^2) = sqrt(8) = 2.8284
# len(v2) = sqrt((-3)^2 + (-3)^2) = sqrt(18) = 4.2426
# cos(180 deg) = -1
# len(v1) * len(v2) * cos(180) = -12.0

v1 = v1 / np.linalg.norm(v1)  # normalized
v2 = v2 / np.linalg.norm(v2)
dp = np.dot(v1, v2)
print(dp)  # -1.0 : normalized dp in [-1, +1]
This entry was posted in Machine Learning.

3 Responses to Using Dot Product as a Measure of Similarity

  1. wholehope says:

    I had the same confusion about the Weirdness of dot product similarity. After some research, I wrote an article that tries to explain why it makes sense for some use cases. https://www.wwwinsights.com/ai/similarity-metrics-vector-databases/

  2. Very nice, and thorough, explanation.

  3. Subir Sengupta says:

    This article was really useful to understand how to use dot product for similarity.

    I am trying to understand how a query against a dataset would work (like a Vector DB). Would you have to do a dot product similarity comparison against every vector in the dataset? That would be really slow.
