[Date Prev][Date Next][Thread Prev][Thread Next] [Search] [Date Index] [Thread Index]

RE: [FWP] Words made from letters in a given phrase



> From: Ronald J Kimball [mailto:rjk@linguist.dartmouth.edu]
> Sent: Wednesday, August 04, 1999 14:12
> To: Eli Evans
> Cc: fwp@technofile.org
> Subject: Re: [FWP] Words made from letters in a given phrase
...
> I think my script will be hard to beat for efficiency.
>
> Ronald

How can I resist a challenge like that?

I replaced the likely biggest bottleneck, the first regex match, by an
eval'ed 'tr' operation.  The cost of the eval is amortized over the
whole run, while the 'tr' should be faster than the regex match.

In benchmarks of just those two on one data set, it seemed perhaps
one-third faster.  Please run it in your benchmark and see (assuming I
have the logic correct, of course).  The benchmark code is at the end of
this letter.

> #!/usr/local/bin/perl
>
> my $letters = shift;
>
> my %letters;
>
> $letters = lc $letters;              # convert to lowercase
> $letters =~ tr/a-z//cd;              # strip non-letter characters

my $trsub = eval "sub { \$word =~ tr/$letters\n//c }"; # ADDED
$@ and die $@;                                         # ADDED

>
> foreach (split(//, $letters)) {      # store letter counts
>     $letters{$_}++;
> }
>
> my($word);
> WORD:
> while (defined($word = <>)) {
>                                      # for each word in list

      next WORD if &$trsub;  # INSTEAD OF THE REGEX MATCH BELOW

>     chomp($word);
>
>     next WORD if ($word =~ /[^$letters]/o);
>                                      # verify letters used

# You could add the newline to the regex as I have to the 'tr', and do
the chomp after the match.

>     my %word;
>     foreach (split(//, $word)) {     # verify letter counts
>         $word{$_}++;
>         next WORD if ($word{$_} > $letters{$_});
>     }
>
>     print "$word\n";                 # success - print word
>
> } # WORD: while (defined($word = <DICT>))
>
>
> __END__


#!/usr/local/bin/perl -w
use strict;
use Benchmark;

my $letters = 'foobar';
my $word = "quuxquux\n";
my $regex = sub { chomp $word; $word =~ /[^$letters]/o };
my $trsub = eval "sub { \$word =~ tr/$letters\n//c }";
$@ and die $@;

timethese(1 << (shift || 0), {
  Regex => $regex,
  Tr    => $trsub,
});
__END__


It seems like fun, anyhow!

--
Larry Rosler
Hewlett-Packard Laboratories
http://www.hpl.hp.com/personal/Larry_Rosler/
lr@hpl.hp.com



==== Want to unsubscribe from Fun With Perl?  Well, if you insist...
==== Send email to <fwp-request@technofile.org> with message _body_
====   unsubscribe