
Benchmarking (was Re: [FWP] words words words)

To: fwp@technofile.org
Subject: Benchmarking (was Re: [FWP] words words words)
From: tayers@bridge.com
Date: 19 May 2000 14:26:50 -0500
In-Reply-To: Ronald J Kimball's message of "Fri, 19 May 2000 11:27:07 -0400"
References: <91E486361D4CD311B3140060089A882535CA47@hermes.hyperchip.com> <20000519112707.I569916@linguist.dartmouth.edu>

>>>>> "R" == Ronald J Kimball <rjk@linguist.dartmouth.edu> writes:

[ Ala Qumsieh program deleted but is shown below ]

R> This one is very similar to Tim's, but ends up twice as fast, probably
R> because it avoids grep and m//g.  Nice work.

I was not surprised that Ala's program was faster than mine for
exactly the reasons Ronald states, but for my personal edification I
thought I'd verify his claim of twice as fast and then see if mine
could be sped up by making the appropriate adjustments. My attempts at
benchmarking raised questions about the benchmarking process.

I started by using the Benchmark module. Here's the program I wrote:

  use Benchmark;
  open NULL, ">/dev/null" or die "Can't open NULL: $!";

  timethese(100,
   {
    ala => sub {
      open T, "threes" or die $!;
      chomp(my @three = <T>);
      my %three;
      @three{@three} = ();
      close T;
      open N, "nines" or die $!;
      while (<N>) {
        exists $three{$1} && exists $three{$2} && exists $three{$3} &&
          print NULL if /(...)(...)(...)/;
      }
    },
    steve => sub {
      open T, "threes";
      chomp(@t = <T>);
      undef $/;
      open N, "nines" ;
      $n = <N>;
      for (@t) {
        $n =~ s/$_(?=(?:...)*\n)/\L$_/g;
      }
      $n =~ s/^[a-z]*[A-Z][A-Za-z]+\n//gm;
      print NULL $n;
    },
    tim => sub {
      open T, "threes";
      chop(@t=<T>);
      @t{@t}=(1)x@t;
      push @ARGV, "nines";
      while (<>) {
        grep{!$t{$_}}/(...)/g or print NULL;
      }
    }
   }
           );
  __END__

I consistently get these results:
  Benchmark: timing 100 iterations of ala, steve, tim...
   ala: 48 wallclock secs (47.22 usr +  0.20 sys = 47.42 CPU) @  2.11/s (n=100)
   steve: 18 wallclock secs (16.90 usr +  1.32 sys = 18.22 CPU) @  5.49/s (n=100)
   tim: 51 wallclock secs (50.25 usr +  0.75 sys = 51.00 CPU) @  1.96/s (n=100)

Hmm. When I originally "benchmarked" my version versus Steven's, I
created two separate programs and used L<time (n)>, which showed that
Steven's version took about 5 seconds and mine took < 1 second. So I
created three programs:

  # ala.pl
  open T, "threes" or die $!;
  chomp(my @three = <T>);
  my %three;
  @three{@three} = ();

  open N, "nines" or die $!;
  while (<N>) {
    exists $three{$1} && exists $three{$2} && exists $three{$3} &&
      print if /(...)(...)(...)/;
  }
  __END__

  # steve.pl
  open T, "threes"; 
  chomp(@t = <T>);
  undef $/;
  open N, "nines";
  $n = <N>;
  for(@t) {
    $n =~ s/$_(?=(?:...)*\n)/\L$_/g;
  }
  $n =~ s/^[a-z]*[A-Z][A-Za-z]+\n//gm;
  print $n;
  __END__

  # tim.pl
  #!perl -n
  BEGIN {
    open T, "threes";
    chop(@t=<T>);
    @t{@t}=(1)x@t;
  }
  grep{!$t{$_}}/(...)/g or print;
  __END__

And I got the following results. (I ran each program several times to
confirm there was little variation, but only one run of each is shown.)

  % time perl ala.pl > /dev/null
  0.49user 0.00system 0:00.49elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (259major+60minor)pagefaults 0swaps

  % time perl steve.pl > /dev/null
  5.28user 0.96system 0:06.27elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (263major+51690minor)pagefaults 0swaps

  % time perl tim.pl nines > /dev/null
  0.83user 0.01system 0:00.88elapsed 95%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (265major+65minor)pagefaults 0swaps

So that reproduces what I originally saw and what Ronald saw. How come
so much difference between the results of Benchmark.pm and time (n)?
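
To put the two sets of numbers on the same footing: Benchmark reports
totals for all 100 iterations, while time reports a single run. Here is
a minimal sketch of that arithmetic, using only the figures quoted
above. The hash names and per_iteration() are made up for illustration.

```perl
use strict;
use warnings;

# Total CPU seconds reported by Benchmark for 100 iterations (from above).
my $iterations    = 100;
my %benchmark_cpu = (ala => 47.42, steve => 18.22, tim => 51.00);

# user + system seconds reported by time for one standalone run (from above).
my %time_cpu = (
    ala   => 0.49 + 0.00,
    steve => 5.28 + 0.96,
    tim   => 0.83 + 0.01,
);

# Hypothetical helper: average CPU seconds per iteration.
sub per_iteration {
    my ($total, $n) = @_;
    return $total / $n;
}

for my $name (sort keys %benchmark_cpu) {
    printf "%-6s Benchmark %.2fs/iteration   time %.2fs/run\n",
        $name,
        per_iteration($benchmark_cpu{$name}, $iterations),
        $time_cpu{$name};
}
```

Seen this way, ala's Benchmark figure (about 0.47s per iteration) is
close to its standalone run, while steve's (about 0.18s) is far below
its standalone 6.24s, so the mismatch is not uniform across the three.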

This all happened with perl, v5.6.0 built for i686-linux on a Red Hat
6.1 box.
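
One thing that may be worth ruling out (a guess on my part, not a
diagnosis): inside timethese all three subs run in one interpreter, so a
global change made by one iteration is still in effect for the next.
The undef $/ in the steve sub is such a change. The sketch below uses
hypothetical helper names and an in-memory file rather than threes and
nines, and only demonstrates the general Perl behavior.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration; none of these names come from
# the programs in this message.

# Count how many records <$fh> returns for the given text, using whatever
# $/ (input record separator) is currently in effect.
sub count_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return scalar @records;
}

# Slurp with a global change, as `undef $/;` does.
sub slurp_global {
    my ($text) = @_;
    undef $/;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!";
    my $all = <$fh>;
    close $fh;
    return $all;
}

# Slurp with the change confined to this call.
sub slurp_local {
    my ($text) = @_;
    local $/;
    open my $fh, '<', \$text or die "Can't open in-memory file: $!";
    my $all = <$fh>;
    close $fh;
    return $all;
}

my $data = "abc\ndef\nghi\n";

print count_records($data), "\n";   # 3: default $/ splits on newlines
slurp_local($data);
print count_records($data), "\n";   # 3: local $/ was restored on return
slurp_global($data);
print count_records($data), "\n";   # 1: $/ is still undef, whole text is one record

$/ = "\n";                          # put the default back
```

If that is what is happening here, writing local $/; in the sub would
confine the change to a single call.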

Hope you have a very nice day, :-)
Tim Ayers (tayers@bridge.com)

P.S. I apologize for the length of this message, but it all seemed
necessary to pose my question accurately. Thanks a lot! I'm already
improving; my answer was only bested by two-fold instead of ten-fold
or so. ;-)



Follow-Ups:
- Re: Benchmarking (was Re: [FWP] words words words)
  - From: Ronald J Kimball <rjk@linguist.dartmouth.edu>

References:
- RE: [FWP] words words words
  - From: Ala Qumsieh <aqumsieh@hyperchip.com>
- Re: [FWP] words words words
  - From: Ronald J Kimball <rjk@linguist.dartmouth.edu>
