Selecting a Random Subset of a Text File

When working with text files, a common task (in my world anyway) is selecting a random subset of the lines in a file. For example, if a text file consisted of these eight test case lines:


you might want to select three random lines. A call to a method that does this could look like:

string inputFile = "testCases.txt";
string outputFile = "testCases-3.txt";
SelectRandomLines(inputFile, 3, outputFile);

and get:


There are many ways to do this. A simple approach is to first scan the input file to determine how many lines there are. Put each line number into an array in order. Shuffle the array values. Copy a subset of the shuffled line numbers into a second array. Sort the subset array. Then walk through the input file and, whenever the current line number matches the next line number in the subset array, write the current line to output. Here’s an implementation in C# with error-checking removed.

static void SelectRandomLines(string inputFile,
  int numberLinesToSelect, string outputFile)
{
  // scan input file to count its lines
  FileStream ifs = new FileStream(inputFile, FileMode.Open);
  StreamReader sr = new StreamReader(ifs);
  int ct = 0;
  while ((sr.ReadLine()) != null) { ++ct; }
  sr.Close(); ifs.Close();

  // zero-based line numbers, in order
  int[] lineNumbers = new int[ct];
  for (int i = 0; i < lineNumbers.Length; ++i) {
    lineNumbers[i] = i;
  }

  // Fisher-Yates shuffle
  // (fixed seed of 0, so every run picks the same lines)
  Random objRan = new Random(0);
  for (int i = 0; i < lineNumbers.Length; i++) {
    int r = objRan.Next(i, lineNumbers.Length);
    int temp = lineNumbers[r];
    lineNumbers[r] = lineNumbers[i];
    lineNumbers[i] = temp;
  }

  // select top n line numbers
  int[] linesToSelect = new int[numberLinesToSelect];
  for (int i = 0; i < linesToSelect.Length; ++i) {
    linesToSelect[i] = lineNumbers[i];
  }

  // sort so the selected lines can be matched in file order
  Array.Sort(linesToSelect);

  // select lines
  ifs = new FileStream(inputFile, FileMode.Open);
  sr = new StreamReader(ifs);
  FileStream ofs = new FileStream(outputFile, FileMode.Create);
  StreamWriter sw = new StreamWriter(ofs);
  string line = "";
  int ptrIntoArray = 0;
  int lineCounter = 0;

  while ((line = sr.ReadLine()) != null &&
    ptrIntoArray < linesToSelect.Length)
  {
    if (lineCounter == linesToSelect[ptrIntoArray]) {
      sw.WriteLine(line);   // current line was selected
      ++ptrIntoArray;       // look for the next selected line
    }
    ++lineCounter;
  }

  sw.Close(); ofs.Close();
  sr.Close(); ifs.Close();
} // SelectRandomLines()
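
The shuffle, select, and sort steps can also be pulled out into their own helper, which makes it easy to put back the argument checks left out above. What follows is only a sketch under a few assumptions: the class name LinePicker, the method name PickLineNumbers, and its parameters are hypothetical and not part of the original code. It also stops the Fisher-Yates pass after the first n positions, since only those positions are ever read.

```csharp
using System;

static class LinePicker
{
  // Hypothetical helper (not in the original code): returns n distinct
  // zero-based line numbers from the range [0, lineCount), ascending.
  public static int[] PickLineNumbers(int lineCount, int n, Random rng)
  {
    if (rng == null)
      throw new ArgumentNullException(nameof(rng));
    if (lineCount < 0)
      throw new ArgumentOutOfRangeException(nameof(lineCount));
    if (n < 0 || n > lineCount)
      throw new ArgumentOutOfRangeException(nameof(n));

    int[] lineNumbers = new int[lineCount];
    for (int i = 0; i < lineCount; ++i) {
      lineNumbers[i] = i;
    }

    // Partial Fisher-Yates: only the first n slots need to be settled.
    for (int i = 0; i < n; ++i) {
      int r = rng.Next(i, lineCount); // i <= r < lineCount
      int temp = lineNumbers[r];
      lineNumbers[r] = lineNumbers[i];
      lineNumbers[i] = temp;
    }

    // Copy out the chosen line numbers and put them in file order.
    int[] picked = new int[n];
    Array.Copy(lineNumbers, picked, n);
    Array.Sort(picked);
    return picked;
  }
}
```

A call such as LinePicker.PickLineNumbers(8, 3, new Random(0)) would return three sorted line numbers between 0 and 7. Passing the Random object in, rather than creating it inside the helper, leaves the choice of seed to the caller.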

This code is pretty crude but it comes in handy for me.
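
One way to make the file-handling part a little less crude is to let using blocks close the streams, so they are released even if an exception is thrown partway through. This is a minimal sketch, assuming the line numbers passed in are zero-based, sorted in ascending order, and free of duplicates. The names LineCopier and CopySelectedLines are made up for illustration.

```csharp
using System;
using System.IO;

static class LineCopier
{
  // Hypothetical helper (not in the original code): copies the lines whose
  // zero-based numbers appear in sortedLineNumbers to outputFile.
  public static void CopySelectedLines(string inputFile,
    int[] sortedLineNumbers, string outputFile)
  {
    // Both streams are disposed when the block exits, even on error.
    using (StreamReader sr = new StreamReader(inputFile))
    using (StreamWriter sw = new StreamWriter(outputFile))
    {
      string line;
      int ptrIntoArray = 0;
      int lineCounter = 0;

      // Stop as soon as every selected line has been written.
      while (ptrIntoArray < sortedLineNumbers.Length &&
        (line = sr.ReadLine()) != null)
      {
        if (lineCounter == sortedLineNumbers[ptrIntoArray]) {
          sw.WriteLine(line);
          ++ptrIntoArray;
        }
        ++lineCounter;
      }
    }
  }
}
```

Checking the array pointer before calling ReadLine means the loop does not read one extra line after the last selected line has been written.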
