(Warning: this might be a bit for "fun with math" than fun with perl, but I suspect the fraction of Perl geeks who are also math geeks is large.) Oops! It turns out that the simple shuffle idea that a few people have mentioned on the list isn't actually a random shuffle. In fact, the Cookbook mentions the fisher_yates shuffle as the Right Way to do it. Allow me to quote the salient paragraph (for those few of you who have been too consumed with False Laziness to buy the Cookbook): ---------------------------------------------------------------------- Shuffling is a surprisingly tricky process. It's easy to write a bad shuffle: sub naive_shuffle { # don't do this for (my $i = 0; $i < @_; $i++) { my $j = int rand @_; ($_[$i], $_[$j]) = ($_[$j], $_[$i]); } } This algorithm is biased; the lists' possible permutations don't all have the same probablity of being generated. The proof of this is simple: take the case where we're passed a 3-element list. We generated three random numbers, each of which can have three possible values, yielding 27 possible outcomes here. There are only 6 permutations of the 3-element list, though. Because 27 isn't evenly divisible by 6, some outcomes are more likely than others. The Fisher-Yates shuffle avoids this bias by changing the range of the random numbers it selects. ---------------------------------------------------------------------- (As a reminder, Fisher-Yates is just like the naive shuffle, except you loop *down* instead of up, and take rand($i+1) instead of rand(@_).) I thought about this a bit, and it's pretty cool. It turns out (if you waste some time doing shuffling in your head) that in a three element list, the first element has a 1/3 chance of ending up anywhere. However, the second element & third have chances of 1/3, 8/27 (.37037), and 10/27 (.2963) of landing in various locations. Meanwhile, the Fisher-Yates is algorithmically really cool. If we go back to 52 cards, it works like this: the second card has a 1/52 chance of being switched to position 52 in the first time through the loop. What's the chance of it being switched to position 51? Well, the chance that it wasn't already switched to 52 times the chance that 51 chooses to switch with it, or 51/52 * 1/51 = 1/52. And so on! This got me curious as to whether my Schwartzian method was truly random. So I wrote a program to find out. It does 10^6 shuffles and counts how often a list member (e.g. the second of the three members) shows up in each position. I got: Shuffle Name 0 1 2 -------------------------------------------------- Naive (bad) 0.3712 0.2960 0.3329 Fisher-Yates 0.3335 0.3334 0.3331 Schwartzian 0.3332 0.3332 0.3335 So it *looks* like the schwartzian method *is* random, and the naive method is in fact non-random. When I tried using larger sets, btw, I found that the standard deviation of the naive algorithm's probabilities away from 1/N is about 2-3 times the FY or S algorithms. -Amir ==== Want to unsubscribe from Fun With Perl? Well, if you insist... ==== Send email to <fwp-request@technofile.org> with message _body_ ==== unsubscribe