
Re: [MacPerl] sort a file



At 17.33 12/16/97, Peter Furmonavicius wrote:
>Hello.  Can someone give an example of how to sort a file (large) in
>MacPerl without having to read the entire file into storage?  Thanks.

Unfortunately, not really.  :)

Someone on #perl just gave me a good idea: split the file into smaller
files, sort each of those, then merge the sorted files.  So I played around
and came up with this.  Just set $file to your filename and $chunk to the
number of lines you want to read in at once.


#!perl -w
my($count1, $count2, $chunk, $file, @lines, $line, @fh, %fh);
$count1 = 0;
$count2 = 0;
$chunk = 5000; #lines to take in at once
$file = 'file1';

open(F, "<$file") || die($!);
while (defined($line=<F>)) {
  push @lines, $line;
  $count1++;
  print $count1,"\n" if ($count1 == $chunk);
  if ($count1 >= $chunk) {
    open(N, ">${file}_temp$count2") || die($!);
    print N sort @lines;
    close(N);
    $count1 = 0;
    $count2++;
    @lines=();
  }
}
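close(F);

# write out the final partial chunk (fewer than $chunk lines), if any
if (@lines) {
  open(N, ">${file}_temp$count2") || die($!);
  print N sort @lines;
  close(N);
  $count2++;
  @lines=();
}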

foreach (0 .. $count2-1) {
  push @fh, "${file}_temp$_";
  open($fh[$_], $fh[$_]) || die("$!: $fh[$_]");
}
foreach (@fh) {$fh{$_} = <$_>}

open(F, ">${file}_new") || die($!);
O: while (keys %fh) {
  I: foreach (sort {$fh{$a} cmp $fh{$b}} keys %fh) {
    print F $fh{$_};
    if (defined($line=<$_>)) {
      $fh{$_} = $line;
    } else {
      delete $fh{$_};
      print $_,"\n";
    }
    last I;
  }
}
close(F);

foreach (@fh) {
  close($_);
  unlink($_);
}

__END__
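
The merge step above re-sorts every pending line on each pass just to find the smallest one. As a rough sketch only, here is one way the same merge could be written with a simple linear scan for the minimum. The name merge_sorted_files is made up for illustration and is not part of the script above, and the code assumes a perl with lexical filehandles and three-argument open (5.6 or later), which may not match the MacPerl you are running.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the script above): merge already-sorted
# files into $out_name, keeping only one pending line per input in memory.
# Assumes perl 5.6+ for lexical filehandles and three-argument open.
sub merge_sorted_files {
    my ($out_name, @in_names) = @_;
    my (@handles, @heads);

    # Prime the merge with the first line of every non-empty input.
    for my $name (@in_names) {
        open(my $fh, '<', $name) or die "$name: $!";
        my $line = <$fh>;
        next unless defined $line;      # skip empty inputs
        push @handles, $fh;
        push @heads, $line;
    }

    open(my $out, '>', $out_name) or die "$out_name: $!";
    while (@heads) {
        # Linear scan for the smallest pending line (string comparison,
        # the same ordering as a plain sort).
        my $min = 0;
        for my $i (1 .. $#heads) {
            $min = $i if $heads[$i] lt $heads[$min];
        }
        print $out $heads[$min];

        # Refill from the file that supplied the line, or retire it.
        my $next = readline($handles[$min]);
        if (defined $next) {
            $heads[$min] = $next;
        } else {
            close($handles[$min]);
            splice(@handles, $min, 1);
            splice(@heads, $min, 1);
        }
    }
    close($out) or die "$out_name: $!";
    return;
}
```

Something like merge_sorted_files("${file}_new", @names), with @names holding the temp file names, could stand in for the merge loop. With only a handful of temp files the linear scan is cheap; with hundreds of them a heap would probably be the better choice.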


--
Chris Nandor               pudge@pobox.com           http://pudge.net/
%PGPKey=('B76E72AD',[1024,'0824 090B CE73 CA10  1FF7 7F13 8180 B6B6'])
#==                    MacPerl: Power and Ease                     ==#
#==    Publishing Date: Early 1998. http://www.ptf.com/macperl/    ==#