
Re: [FWP] name-a-band



Andy Lester <andy@petdance.com> writes:
> I wrote up a little band name generator that sucks up /usr/dict/words and
> cranks out band names.  They turn out to be eerily feasible.

It's a fun idea.  I'm getting things like "Duskish Batwing", "The Overcut
Pointwise of Unstuffing" and "Semimarine Melanterite".  Somewhat surreal,
but I guess that's the point.

A couple of comments.

>     for ( 1..5 ) {

Aesthetics tell me that it's not nice to generate five random numbers if you
only want one.  I'll come back to this later.

>         push( @words, ucfirst @dict[ int(rand scalar @dict) + 1 ] );

If you're just getting a single array element, you should use the scalar
element $dict[INDEX] instead of the single-element array slice
@dict[INDEX] (though it happened not to matter in this case).

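Also, rand puts its argument in scalar context, so "rand @dict" returns a
random number that is at least 0 and less than the number of elements in
@dict.  An array subscript truncates that to an integer, so you need
neither the int() nor the scalar().  This gives us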

    push @words, ucfirst $dict[rand @dict];

which is easier to read and thus more Fun.
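
To see the short form in action, here is a minimal sketch.  The sample word
list and the helper name random_word are made up for illustration; they are
not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the dictionary; the real script reads a word list.
my @dict = qw(duskish batwing overcut pointwise unstuffing);

# rand puts its argument in scalar context, so "rand @dict" is a
# fractional number in [0, scalar(@dict)).  The array subscript truncates
# it to an integer, giving a valid index from 0 to $#dict.
sub random_word { return ucfirst $dict[rand @dict] }

print random_word(), "\n" for 1 .. 3;
```

As a side effect, dropping the "+ 1" means that index 0 can be chosen and
that the index can no longer run one past the last element.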

Now, can we do the task in one line?  Your code is 176 characters when
internal whitespace has been removed, or 182 characters if you use
/usr/dict/words instead of words.txt.  (This excludes the #! line, the "use
strict", the __END__, and the data.)  Let's go through your code again,
seeing where we can save some space.

> open( IN, "<words.txt" ) or die;
> my @dict = <IN>;
> close IN;
> chomp @dict;

This chunk gets the dictionary into @dict, one word per element, without
newlines.  Well, we will obviously save some space both here and below by
choosing smaller (single-character) variable names.  In this instance, I'm
going to use @_ for the dictionary.  Modifying @_ is usually poor style, but
it allows me to avoid my()-ing some other variable.  We can also do the
chomp and the diamond in one go: "chomp(@_ = <F>)".  (Even if we'd used @d
for the dictionary, "chomp(my @d = <F>)" would still have worked.)  We can
also drop the close, giving:

    open(F,"/usr/dict/words")||die;chomp(@_=<F>);

as the shrunken version of this chunk.  However, since we don't use @ARGV,
there's a sneaky trick we can play here:

    @ARGV="/usr/dict/words";chomp(@_=<>)||die;

Taking out the "||die" would save a further five characters at the expense
of error checking; I haven't gone that route.
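
For anyone who hasn't met the @ARGV trick before, here it is un-golfed, as a
rough sketch.  The temporary file stands in for /usr/dict/words so that it
runs anywhere, and the sub name read_words is my own invention.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for /usr/dict/words.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\ngamma\n";
close $fh or die "close: $!";

sub read_words {
    my ($file) = @_;
    # With @ARGV set, the empty diamond <> opens and reads the named file.
    local @ARGV = ($file);
    # chomp returns the number of characters it removed, so a missing or
    # empty file yields 0 and triggers the die.
    chomp(my @words = <>) or die "no words read from $file\n";
    return @words;
}

my @words = read_words($path);
print scalar(@words), " words, first is $words[0]\n";
```

The die here fires on the return value of chomp rather than on a failed
open, which is why the golfed version can get away without an explicit
open() call.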

The main part of your program loops over the data:

> while ( my $mask = <DATA> ) {
>     my @words;
>     for ( 1..5 ) {
>         push( @words, ucfirst @dict[ int(rand scalar @dict) + 1 ] );
>     }
>     printf( $mask, @words );
> }

Let's start by changing variable names again.  First of all, the @words
array.  We probably want to stick to an alphabetic name for this, because
the my() is helpful to reset it each time.  (Although I've just thought of
another trick; see at the end.)  I picked @w, which seemed to make sense.
But note that the $mask variable is a good candidate for $_, seeing as data
is read into it with a while loop over a diamond operator.  This gives us

    while (<DATA>) {
        my @w;
        for (....) {
            push @w, ucfirst $_[rand @_];
        }
        printf $_, @w;
    }

as the outline for this loop.

Next up is the ucfirst(): since this is just the implementation for \u in a
double-quotish string, let's use that instead:

    push @w, "\u$_[rand @_]"
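
A quick check that the two spellings agree, as a small sketch (the sub names
and sample words are made up for the comparison):

```perl
use strict;
use warnings;

# ucfirst() as a function call.
sub via_function { my ($word) = @_; return ucfirst $word }

# "\u" inside a double-quoted string uppercases the next character,
# which here is the first character of the interpolated value.
sub via_escape { my ($word) = @_; return "\u$word" }

for my $word (qw(semimarine melanterite)) {
    print via_function($word), " / ", via_escape($word), "\n";
}
```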

Now let's move on to the inner loop.  Since it only contains one expression,
it's a good candidate for a statement modifier:

    push @w, "\u$_[rand @_]" for ...

Then the question is what the loop condition should be.  We could stick to
your method, giving us

    push@w,"\u$_[rand@_]"for 1..5

with minimal internal whitespace.  But if we count how many random words are
really needed, we can actually remove one (!) character:

    push@w,"\u$_[rand@_]"for/%/g

Here, the "for" modifier puts a list context on its argument, so the
(degenerately-spelled) m//g operator returns a list of all the matches.
Then each match is ignored within the loop -- notice that the apparent
occurrence of $_ is actually a subscript on @_, not the loop variable.

Blindly using % characters from format strings in this way is usually a bad
idea: if the format contains a "%%", you get the wrong answer.  However, in
this case, we would merely drop back to getting too many random entries,
which is an acceptable form of breakage.
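
To make the counting and the "%%" caveat concrete, here is a small sketch.
The sub name percent_count and the sample masks are hypothetical; the
original DATA section isn't shown in this message.

```perl
use strict;
use warnings;

# Count the matches of /%/g against a format string.  The list assignment
# to () forces list context on the match, and assigning that to a scalar
# yields the number of matches.
sub percent_count {
    my ($format) = @_;
    my $count = () = $format =~ /%/g;
    return $count;
}

# The last mask contains a literal "%%", so it is overcounted.
for my $mask ("%s %s\n", "The %s of %s\n", "100%% %s\n") {
    printf "%d word(s) requested for: %s", percent_count($mask), $mask;
}
```

The third mask consumes only one argument in printf but is counted as
three, so the loop would fetch two more words than it needs.  The surplus
arguments are simply never used.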

So putting it all together, this gives us

    @ARGV="/usr/dict/words";chomp(@_=<>)||die;
    while(<DATA>){my@w;push@w,"\u$_[rand@_]"for/%/g;printf$_,@w}

as the whole thing.  That's 102 characters with minimal whitespace, or 96 if 
we change the file name.

The other trick that occurred to me while I was writing this message: we
don't actually need a nested loop at all.  If you undef $/ after reading the
dictionary, then you can get the entire DATA area into a scalar with a
single diamond operator.  If you read into $_, nothing else needs to be
changed.  However, to actually save space this way, you need to drop a
couple of extra characters.  Since we no longer need to reset the @w array
on each outer loop, the "my @w" is a good candidate.  This means that we need to
either get rid of "use strict", or use a global (punctuation) variable name.
I've taken the latter option, so we now get

    @ARGV="/usr/dict/words";chomp(@_=<>)||die;
    undef$/;$_=<DATA>;push@*,"\u$_[rand@_]"for/%/g;printf$_,@*

This comes down to 100 characters, or 94 with the smaller file name.  But
that's still longer than a standard line, and I'm out of ideas.  Anyone
else?
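
For reference, here is the slurp idea in un-golfed form, as a sketch under a
few assumptions: it reads the formats from an in-memory filehandle rather
than DATA, uses a lexical @w instead of a punctuation variable, and the sub
name fill_all and the sample data are made up.

```perl
use strict;
use warnings;

# Fill every % slot in a block of format lines with a single sprintf call.
sub fill_all {
    my ($formats, @dict) = @_;
    open my $fh, '<', \$formats or die "open: $!";
    local $/;                       # undefined record separator: slurp mode
    my $all = <$fh>;                # the whole "file" in one read
    my @w;
    # One random capitalised word per % found anywhere in the text.
    push @w, "\u$dict[rand @dict]" for $all =~ /%/g;
    return sprintf $all, @w;
}

print fill_all("%s %s\nThe %s of %s\n", qw(duskish batwing overcut));
```

Because all the formats are handled in one go, there is no per-line reset
of @w, which is what makes the "my" unnecessary in the golfed version.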

-- 
Aaron Crane   <aaron.crane@pobox.com>   <URL:http://pobox.com/~aaronc/>
