
Re: [MacPerl] Sorting lines in a file



on 5/20/99 8:51 AM, Chris Nandor wrote...

>At 19.19 -0400 1999.05.19, Scott Prince wrote:
>>#!perl
>>@lines = ("a\tb\tc",   # double quotes so \t is a real tab
>>          "e\ta\tz",
>>          "x\ty\ts",);
>>$sortby = '2'; # or whatever field you want (zero-based)
>>@listrefs = map { [$_, (split(/\t/, $_))[$sortby] ] } @lines;
>>@newrefs  = sort { $a->[1] cmp $b->[1] } @listrefs;
>>@lines     = map { $_->[0] } @newrefs;
>>foreach $l (@lines) { print qq!$l\n!; }
>>exit;
>
>That uses the great Schwartzian Transform (map sort map) technique, but it
>would likely be more efficient if you didn't assign to a new variable each
>step.
>
>@lines =  map { $_->[0] }
>          sort { $a->[1] cmp $b->[1] }
>          map { [$_, (split(/\t/, $_))[$sortby] ] }
>          @lines;

Nice :)
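
For anyone reading along in the archive, here is a minimal self-contained sketch of the chained version. The sample data and the @sorted name are only my own illustration, and the strings are double-quoted so that \t really is a tab:

```perl
#!perl
use strict;
use warnings;

# Sample tab-separated records (illustrative data only).
my @lines  = ("a\tb\tc", "e\ta\tz", "x\ty\ts");
my $sortby = 2;    # zero-based index of the field to sort on

# Read bottom to top:
#   1. pair each line with its sort key,
#   2. sort the pairs by key,
#   3. keep only the original line from each pair.
my @sorted = map  { $_->[0] }
             sort { $a->[1] cmp $b->[1] }
             map  { [ $_, (split /\t/, $_)[$sortby] ] }
             @lines;

print "$_\n" for @sorted;
```

The point of the transform is that split runs once per line, rather than twice per comparison inside the sort block.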

>Not a huge deal, but might get some sizable wins on larger data sets, as
>assigning a large list to an array can be a drag if you don't need to do it.

Just for fun one night, I timed various sort routines on a largish file 
and found that the fastest sorts were never the ones I would have 
guessed. I'll have to give yours a try. I also wonder about the memory 
requirements for a large data set. But the only indicator of memory 
requirements I know of is when Perl crashes. As a result, I have 
developed an aversion to doing this type of test. 
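
A rough sketch of how such a timing comparison might be set up with the standard Benchmark module. The generated data, the sub names (naive, staged, chained), and the iteration count are all assumptions of mine, not the tests I actually ran, and the numbers will differ from machine to machine:

```perl
#!perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sortby = 2;    # zero-based index of the field to sort on

# Build 2000 tab-separated lines with unique, scrambled keys
# in the third field (made-up data, for illustration only).
my @lines = map {
    join "\t", "row$_", "x", sprintf("%05d", ($_ * 7919) % 10007);
} 1 .. 2000;

# Splits both lines again on every comparison.
sub naive {
    my @out = sort {
        (split /\t/, $a)[$sortby] cmp (split /\t/, $b)[$sortby]
    } @lines;
    return \@out;
}

# map/sort/map with an intermediate array at each step.
sub staged {
    my @refs = map  { [ $_, (split /\t/, $_)[$sortby] ] } @lines;
    my @new  = sort { $a->[1] cmp $b->[1] } @refs;
    my @out  = map  { $_->[0] } @new;
    return \@out;
}

# The same steps chained, with no intermediate arrays.
sub chained {
    my @out = map  { $_->[0] }
              sort { $a->[1] cmp $b->[1] }
              map  { [ $_, (split /\t/, $_)[$sortby] ] }
              @lines;
    return \@out;
}

# Run each sub 50 times and print a comparison table.
cmpthese(50, {
    naive   => \&naive,
    staged  => \&staged,
    chained => \&chained,
});
```

This only measures time; it says nothing about memory use.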


Scott


>--
>Chris Nandor          mailto:pudge@pobox.com         http://pudge.net/
>%PGPKey = ('B76E72AD', [1024, '0824090B CE73CA10  1FF77F13 8180B6B6'])
