[Date Prev][Date Next][Thread Prev][Thread Next] [Search] [Date Index] [Thread Index]

[FWP] Unsort/disperse a list

To: fwp@technofile.org
Subject: [FWP] Unsort/disperse a list
From: Tim Ayers <tayers@bridge.com>
Date: 10 Jan 2001 16:02:56 -0600
Delivered-To: irons-bumppo:net-mpfwp@bumppo.net

I've written a function that evenly disperses a list. For example, a
list like (A A B B C C) comes out as (A B C A B C). While the
functionality is not commonly useful, I thought I would pass it along
for critiques, suggestions, and improvements. It doesn't work well
when one of the groups has the majority of elements, e.g. (A B C C C C
C C C D), but otherwise it seems to do pretty well.

Before I wrote this I searched around and couldn't find anything
similar; only shuffle algorithms. I'm probably just using the wrong
terminology. Are there well-known algorithms to do this?

Hope you have a very nice day, :-)
Tim Ayers

P.S. It's not an entirely stable unsort.

--------------------------- disperse.pl ----------------------------------
#!perl -w

use strict;

sub disperse {
  my ($lst, $key_func) = @_;

  # the key function is optional. The default is to use the value as is.
  $key_func ||= sub { return shift; };

  my %buckets = ();
  # Divide everything into their respective "key-equal" buckets.
  my $total_els = 0;
  foreach my $val (@$lst) {
    my $k = $key_func->($val);
    $buckets{$k} = {freq=>undef, next_idx=>0, els=>[]}
      unless exists $buckets{$k};
    push(@{$buckets{$k}->{els}}, $val);
    $total_els++;
  }

  # Calculate the frequency of each bucket. The frequency is how often we
  # expect an element from this to appear in the dispersed list.
  $buckets{$_}->{freq} = $total_els/@{$buckets{$_}->{els}}
    for sort keys %buckets;

  # Now that we have everything set up, we really want the buckets in an
  # ordered list, because as we add elements we want to look at the buckets
  # in order of ascending 'freq'.
  my @buckets = sort {$a->{freq} <=> $b->{freq}} values %buckets;

  my @disp_lis = ();
  my $cur_idx = 0;
  # loop until we have added all the elements to the new list.

  # The next_idx field of the bucket keeps track of what index into @disp_lis
  # we expect the next element from the bucket to appear. Obviously this is
  # based on the frequency and the most recent index that has an element from
  # the bucket.
  while (_still_data(\@buckets)) {
    foreach my $b (@buckets) {
      next unless @{$b->{els}};
      if ($b->{next_idx} <= $cur_idx) {
        push(@disp_lis, shift(@{$b->{els}}));
        $b->{next_idx} = $cur_idx + $b->{freq};
        last;
      }
    }
    $cur_idx++;
  }
  return @disp_lis;
}

# Brute force function to check whether there are any more elements in
# any of the buckets.
sub _still_data {
  @{$_->{els}} && return 1 for (@{$_[0]});
}

while (<DATA>) {
  chop;
  my @lis = split /\s+/;
  my @dlis = disperse(\@lis, sub {my($k)=(pop=~/^(.)/);$k});
  print STDERR "@lis\n@dlis\n\n";
}

__DATA__
A1 A2 A3 B1 B2 B3 C1 C2 C3 D1 D2 D3
A1 A2 A3 B1 C1 C2 C3 C4 C5 C6 C7 C8 D1 D2
HA H2 H3 H4 H5 H6 H7 H8 H9 HJ HQ HK SA S2 S3 S4 S5 S6 S7 S8 S9 SJ SQ SK DA D2 D3 D4 D5 D6 D7 D8 D9 DJ DQ DK CA C2 C3 C4 C5 C6 C7 C8 C9 CJ CQ CK


==== Want to unsubscribe from Fun With Perl?  Well, if you insist...
==== Send email to <fwp-request@technofile.org> with message _body_
====   unsubscribe

Follow-Ups:
- Re: [FWP] Unsort/disperse a list
  - From: John Porter <jdporter@min.net>

Prev by Date: Re: [FWP] Fun with pack()
Next by Date: [FWP] Re: Fun with pack()
Prev by thread: [FWP] Re: Fun with pack()
Next by thread: Re: [FWP] Unsort/disperse a list
Navigation: Date Index | Thread Index | Search | Other lists at bumppo.net