The Intersection and Union of Two C# Lists

Recently I was doing some C# coding and needed efficient functions that return the intersection of two List collections (the items that are in both Lists) and the union of two List collections (items that are in either List or both Lists). This isn’t a particularly difficult problem but what really surprised me was the large number of different ways to approach the problem. Here is a simple but effective implementation of the Intersection method (without error-checking; and the annoying blog software may have stripped away some > characters):

static List<int> ListIntersection(List<int> A, List<int> B)
  List<int> result = new List<>();
  // make a Dictionary of the larger; fewer Adds)
  Dictionary<int, bool> d = new Dictionary<int, bool>();
  if (B.Count < A.Count) {
    for (int i = 0; i < A.Count; ++i)
      d.Add(A[i], true);

    for (int i = 0; i < B.Count; ++i)
      if (d.ContainsKey(B[i]) == true)
  else {
    for (int i = 0; i < B.Count; ++i)
      d.Add(B[i], true);

    for (int i = 0; i < A.Count; ++i)
      if (d.ContainsKey(A[i]) == true)
  return result;

The idea is to create a Dictionary of items in the larger List, so that lookups will be very fast, then walk through the smaller List, adding items to the result List that are in the Dictionary (and therefore are in both Lists). We want to walk through the smaller List to minimize the number of Add operations. Notice that we only care about the Dictionary keys, so I use a dummy bool value.

(Note: in my original post, I blundered and incorrectly called this the union function. Thanks to Matthew van Eerde and ‘polkduran’ for pointing out this mistake.)

This entry was posted in Software Test Automation. Bookmark the permalink.

4 Responses to The Intersection and Union of Two C# Lists

  1. mvaneerde says:

    To be pedantic this is the intersection of the two lists.

  2. polkduran says:

    Hi, I would make the same remark as mvaneerde.
    Also, I think it would be more elegant to use a HashSet than a Dictionary so you would not be forced to add a dummy value to your dictionary, I think a HasSet is as efficient as a Dictionary.
    LINQ provide extension methods to do this kind of operations (Intersect, Union ) but it is always interesting to know how to do it without.

    • ACK! I hate it when I make a dumb mistake. Yes, this is an intersection, not a union (items that are in either list or both). I actually created several list functions, including a union function, and had it on my brain.

      Normally when I make mistakes, they’re usually minor, and I don’t change the original post because I think a lot of times errors contain useful information. But in this case I’m going to edit the post because I don’t want some poor CS student finding this code and getting confused.

    • I was thinking about commenting on the LINQ way to do Intersection but didn’t. But since you mention it — I am not a fan of many parts of LINQ. With the work I do, LINQ often has significantly worse performance than raw C# implementations.

      As far as using a HashSet goes, I agree with you that, objectively HashSet probably makes more sense than a Dictionary. I ran some performance tests and found no difference between HashSet and Dictionary. For me, I just prefer using a Dictionary — not really sure why, it just suits me for some reason.

Comments are closed.