Selecting the K Largest Items in an Array

Suppose you have an array of integers (or objects which can be ordered). Some closely related problems are 1.) sort the array, 2.) find the kth largest item, 3.) find the k largest items (in order), and 4.) find the k largest items (not necessarily in order). Wikipedia has good discussions of these problems and so I decided to code up one of the Wikipedia algorithms which finds the k largest items in an array (not necessarily in order). The basic idea is to use a modified form of quicksort. The heart of quicksort is a function called partition which accepts a so-called pivot index and a pivot value (the value at location pivot index), and then places all values in the array which are less than the pivot value to the left of the pivot value, and places all values which are greater than the pivot value to the right of the pivot value. With a partition function in hand you can find the k largest items in an array by recursively repeatedly calling partition until the pivot index equals k. The algorithm, while short, is actually quite tricky, which is one of the reasons I coded it up — to make sure I understood exactly what was going on. The two key functions are:
 
static int partition(int[] array, int left, int right, int pivotIndex)
{
  int pivotValue = array[pivotIndex];
  int temp = array[pivotIndex];
  array[pivotIndex] = array[right];
  array[right] = temp;
  int storeIndex = left;
  for (int i = left; i <= right – 1; ++i )
  {
    if (array[i] <= pivotValue)
    {
      temp = array[i];
      array[i] = array[storeIndex];
      array[storeIndex] = temp;
      ++storeIndex;
    }
  }
  temp = array[storeIndex];
  array[storeIndex] = array[right];
  array[right] = temp;
  return storeIndex;
}
 
static void findFirstK(int[] array, int left, int right, int k)
{
  int pivotNewIndex = 0;
  if (right > left)
  {
    int pivotIndex = (left + right) / 2;
    pivotNewIndex = partition(array, left, right, pivotIndex);
    if (pivotNewIndex > k)
      findFirstK(array, left, pivotNewIndex – 1, k);
    else if (pivotNewIndex < k)
      findFirstK(array, pivotNewIndex + 1, right, k);
  }
}
 
See the screenshot below. Notice the 5 largest values have been placed in the 5 left-most cells of the array but are not sorted.
 
This entry was posted in Software Test Automation. Bookmark the permalink.