R Language Vector Shuffling

A somewhat common task in machine learning programming is shuffling the order of the values in a vector. In the R language there are several ways to do this, but the two main options are to write a user-defined function that uses the Fisher-Yates mini-algorithm, or to use the built-in sample() function.


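The built-in sample() function can be used in two ways. If sample() is passed a vector, the result is a new vector with the same values in scrambled order. If sample() is passed a single integer n, the result is a vector that contains the values 1 to n in scrambled order. One consequence is that a numeric vector of length one, such as c(7), is treated as the integer case and gives a shuffle of 1 to 7 rather than the single value 7.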
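The two calling styles of sample() can be sketched as follows. The variable names (v, shuffled, idx, single, safe) are just illustrative, and the last lines show one possible way to guard against the length-one case by shuffling index positions instead of values.

```r
# Style 1: pass a vector, get the same values back in random order.
v <- c(10, 20, 30, 40, 50)
shuffled <- sample(v)

# Style 2: pass a single integer n, get the values 1..n in random order.
n <- 5
idx <- sample(n)

# A length-one numeric vector is treated like style 2, so sample(c(7))
# would shuffle 1..7. Shuffling positions with sample.int() avoids this.
single <- c(7)
safe <- single[sample.int(length(single))]
```

Shuffling indices rather than values is also handy when two vectors, for example features and labels, must be reordered in the same way.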

The Fisher-Yates algorithm is very short but quite tricky. A small implementation mistake can give you a function that seems to work correctly but in fact does not generate all possible results with equal probability.
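A minimal sketch of a Fisher-Yates shuffle in R might look like the code below. The function name fisher_yates() is my own, not something built into R, and the sketch assumes the common variant that walks from the last position down to the second.

```r
# fisher_yates() is a hypothetical helper name, not part of base R.
fisher_yates <- function(v) {
  n <- length(v)
  if (n < 2) {
    return(v)  # nothing to shuffle for empty or length-one input
  }
  for (i in n:2) {
    # Pick j uniformly from 1..i. Including i itself matters:
    # it allows the value at position i to stay where it is.
    j <- sample.int(i, 1)
    tmp <- v[i]
    v[i] <- v[j]
    v[j] <- tmp
  }
  v
}

result <- fisher_yates(1:10)
```

The classic mistake is to pick j from 1..n on every iteration instead of 1..i. That version still returns a scrambled vector, but it has n^n equally likely execution paths, which cannot be divided evenly among the n! possible orderings, so some orderings come up more often than others.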

For either approach, to get reproducible results you can call the set.seed() function with a fixed seed value before shuffling.
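As a small illustration of reproducibility, resetting the seed to the same value before each shuffle gives the same ordering. The seed value 42 is arbitrary.

```r
set.seed(42)          # arbitrary seed value
first <- sample(10)

set.seed(42)          # reset to the same seed
second <- sample(10)

same <- identical(first, second)  # TRUE: both shuffles match
```

The particular ordering produced for a given seed can differ between R versions, because the default sampling method changed in R 3.6.0, so results are reproducible on one setup but not necessarily across all of them.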
