
Re: [MacPerl] "rewinding" a file



Quoth John Sprinter:

>I want to choose records from a file according to some criteria.
>Accumulate the chosen records; the number might be large (1000?).
>Sort the chosen records by a $keyfield
>Report the sorted results.
>
>Sounds like a Practical Extraction and Report problem.

The way I do this on my *n*x server is something like:

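        #!/usr/local/bin/perl

        # &criteria and &makekey are your own subs: the first says
        # whether a record is wanted, the second builds its sort key.

        @chosen = ();

        open (THUD, "< the.whole.collection")
                || die "can't open the.whole.collection: $!";
        while ($line = <THUD>) {
                chomp $line;
                if (&criteria ($line)) {
                        $prefix = &makekey ($line);
                        push (@chosen, "$prefix\t$line");
                }
        }
        close (THUD);

        # Each saved record starts with its key, so a plain string
        # sort orders the records by key.
        foreach $choice (sort @chosen) {
                ($key,$line) = split (/\t/, $choice);
                print "$line\n";
        }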

There are a couple of obvious issues:

1)  Depending on the number and length of saved lines, this may be
    a memory hog.  Your mileage may vary.  If the reported data is
    more compact than the original data line, you might preprocess
    each line as far as possible in the first loop, saving with
    each key only the data required for reporting and summary.

2)  The use of \t as the key/line separator above (in push and split)
    is only for illustration.  If the line itself can contain a tab,
    the split will cut it short, so one would want a separator that
    never appears in the line (or else a fancier splitter in the
    last loop, such as

                $choice =~ /^([^\t]+)\t(.*)$/

    which (to quote an earlier post) "is left as an exercise for the
    reader"!)
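
For anyone who would rather not do the exercise, here is one possible
sketch of the splitter in point 2.  It shows two ways to peel the key
off a saved record whose line contains tabs of its own: a split limited
to two fields, and a pattern match that takes everything before the
first tab as the key.  The sample record and the names $key2 and $line2
are made up for illustration; neither approach helps if the key itself
can contain a tab.

```perl
#!/usr/local/bin/perl

# Hypothetical saved record: the key is "smith", and the line
# that follows it contains tabs of its own.
$choice = "smith\tSmith, J.\t42\tToronto";

# Option 1: limit the split to two fields, so only the first
# tab is treated as the key/line separator.
($key, $line) = split (/\t/, $choice, 2);
print "split: key=$key line=$line\n";

# Option 2: a pattern match; the key is everything before the
# first tab, the line is everything after it.
if ($choice =~ /^([^\t]+)\t(.*)$/) {
        ($key2, $line2) = ($1, $2);
        print "match: key=$key2 line=$line2\n";
}
```

Both options leave the tabs inside the line untouched, so the
reporting loop prints the record exactly as it was read.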

Clearly a classic "space for time" tradeoff...
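
To tip the tradeoff back toward less space, point 1 above might look
something like the following sketch.  Everything specific in it is an
assumption for illustration: the colon-separated record layout, the
"amount over 100" test standing in for &criteria, and the lowercased
name standing in for &makekey.  The idea is only that each saved entry
holds the key plus the fields the report prints, not the whole line.

```perl
#!/usr/local/bin/perl

# Hypothetical records: name:city:amount:comment
@records = (
        "Smith:Toronto:250:a long comment we never report",
        "Jones:Boston:75:another long comment",
        "Adams:Denver:300:yet another long comment",
);

@chosen = ();
foreach $line (@records) {
        ($name, $city, $amount, $comment) = split (/:/, $line, 4);
        next unless $amount > 100;      # stand-in for &criteria
        # Save the key and only the two fields the report needs;
        # the city and the long comment are dropped here.
        push (@chosen, "\L$name\E\t$name\t$amount");
}

@report = ();
foreach $choice (sort @chosen) {
        ($key, $name, $amount) = split (/\t/, $choice, 3);
        push (@report, "$name $amount");
}
print join ("\n", @report), "\n";
```

With real data the saving depends on how much of each line the report
actually uses, so it is worth measuring before bothering.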

-----------------------------------------------------------------------
-jn-                                         (personal opinion du jour)
-----------------------------------------------------------------------