[Date Prev][Date Next][Thread Prev][Thread Next] [Search] [Date Index] [Thread Index]

[FWP] Multiple, unique, random




Here are a few nice algorithms from recent posts to comp.lang.perl.mics. 
Both are based on the Fisher-Yates shuffle, mentioned in perlfaq4 among
other places, though the resemblance may not be quite obvious. 

Let's start with integers.  This one picks a number of random non-negative
integers below a given limit so that they're guaranteed to be unique and
uniformly distributed.  Of course, you'd better not ask for more ints than
there are in the range.  The comments below are from a clpm discussion
between myself and Abigail.

  sub unique_random_ints {
      my ($n, $lim) = @_;
      my (@res, %buf);
      while (@res < $n) {
          my $rn = int rand $lim--;
          push @res, $buf{$rn} || $rn;
          $buf{$rn} = $buf{$lim} || $lim;
      }
      return @res;
  }

: The algorithm I presented is good for picking a sequence of unique,
: uniformly distributed elements from a finite set, even if the set is
: too large to be held in memory, in time proportional to the number of
: elements picked.  (That's assuming constant time hash access, which
: isn't really possible but is a reasonable first approximation.)
:
: It's not good for non-uniform selectors, and the problem it solves
: could be considered meaningless for (conceptually) infinite domains.
: It also requires prior knowledge of the size of the domain.


Now if that wasn't good enough, how about extending the random line
algorithm from perlfaq5 to pick multiple unique lines, still in O(n)?

  my @lines = (undef) x $n;
  my $last;
  while (<>) {
      my $used = 0;
      for my $i (0 .. $#lines) {
          ($last, $lines[$i]) = ($lines[$i], $used++ ? $last : $_)
              if rand($.-$i) < 1;
      }
  }

That _should_ yield a uniform distribution, but I haven't quite managed to
come up with a rigorous proof.  Even if it won't, it'll still be close.

-- 
Ilmari Karonen - http://www.sci.fi/~iltzu/
"And I think I've probably failed to express the thing I'm trying to say, but
 I'll send this version, and when people object to it I might figure out words
 I really needed to use to say it."  -- Lucy Kemnitzer in r.a.sf.c



==== Want to unsubscribe from Fun With Perl?  Well, if you insist...
==== Send email to <fwp-request@technofile.org> with message _body_
====   unsubscribe