
Re: [MacPerl-AnyPerl] Re: seperate large file into smaller files



>thanks ronald and bruce
>actually both your suggestions took exactly the same amount of time on a 40mb
>file.
>i have an additional question i forgot last time.
>i'd like to get the line number for the very last character of the large file
>before actually doing the separating process - is there a special variable or
>
>> #!perl -w
>> my $file = "hit.txt";
>> my $lines= 0;
>> open BIG, $file or die "Can't open $file: $!\n";
>> open NEW, ">$lines.txt" or die "Can't open $lines.txt: $!\n";
>> while (<BIG>) {
>>      print NEW $_;
>>      next if $. % 1000;

Use the Shuck pod viewing application to check out the following...

From perlvar.pod...
 $.
    The current input line number for the last file handle
    from which you read (or performed a seek or tell on).
    An explicit close on a filehandle resets the line number.
    Because "<>" never does an explicit close, line numbers
    increase across ARGV files (but see examples under eof()).
    Localizing $. has the effect of also localizing Perl's notion
    of "the last read filehandle".  (Mnemonic: many programs use "."
    to mean the current line number.)
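
As a rough sketch of how that applies to the question (getting the last line number of the big file before splitting it): read the file through once and look at $. afterwards. The sub name count_lines is made up for illustration, and the sketch assumes that one extra pass over the file is acceptable.

```perl
#!perl -w
use strict;

# Hypothetical helper (not from the thread): returns the number of the
# last line in $path by reading the whole file once and then asking $.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Can't open $path: $!\n";
    while (defined(my $line = <$fh>)) { }   # read and discard every line
    my $total = $.;                         # line number of the last line read
    close $fh;                              # explicit close resets $.
    return $total;
}

# Only runs when a file name is given on the command line.
print count_lines($ARGV[0]), "\n" if @ARGV;
```

The value has to be copied out of $. before the close, since the explicit close resets the counter.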


From perlfunc.pod...
 close FILEHANDLE
    Closes the file or pipe associated with the file handle,
    returning TRUE only if stdio successfully flushes buffers
    and closes the system file descriptor.  <SNIP>
    You don't have to close FILEHANDLE if you are immediately
    going to do another open() on it, because open() will close
    it for you. (See open().)  However, an explicit close on an
    input file resets the line counter ($.), while the implicit
    close done by open() does not.
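
A small sketch of the difference described in that last sentence. The temporary five-line file and the sub name lines_after_reopen are made up for illustration; the script reads three lines, reopens the same handle with or without an explicit close, reads two more lines, and reports $.

```perl
#!perl -w
use strict;
use File::Temp qw(tempfile);

# Throwaway input file with 5 lines (hypothetical test data).
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "line $_\n" for 1 .. 5;
close $tmp;

# Hypothetical helper: reads 3 lines, reopens IN (closing it first only
# if $explicit_close is true), reads 2 more lines, and returns $.
sub lines_after_reopen {
    my ($explicit_close) = @_;
    my $line;
    open(IN, '<', $path) or die "Can't open $path: $!\n";
    $line = <IN> for 1 .. 3;          # $. is 3 here
    close IN if $explicit_close;      # explicit close resets the counter
    open(IN, '<', $path) or die "Can't open $path: $!\n";
    $line = <IN> for 1 .. 2;          # two more lines from the reopened file
    my $n = $.;
    close IN;                         # start the next call from a clean state
    return $n;
}

print "explicit close: ", lines_after_reopen(1), "\n";
print "implicit close: ", lines_after_reopen(0), "\n";
```

Going by the perlfunc text above, the first call should report 2 and the second 5, since only the explicit close resets the counter.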


>>      close NEW;
>>      $lines = $.;
>>      open NEW, ">$lines.txt" or die "Can't open $lines.txt: $!\n";
>> }
>>
>> __END__
>


>> #!perl
>>
>> use strict;
>>
>> my $in  = shift @ARGV;     # input file
>> my $out = $in;             # output file base name
>> my $suf = '000';           # initial suffix to add to base name
>>
>> my $max  = 1000;           # max number of lines per output file
>> my $curr = 0;              # number of lines read for current file
>>
>> open(IN, $in)                                # open input file
>>   or die "Unable to open $in: $!\n";
>>
>> open(OUT, '>' . $out . $suf++)               # open first output file
>>   or die "Unable to open $out: $!\n";
>>
>> while (<IN>) {                               # read one line
>>   if ($curr++ == $max) {                     # if at max

Here $curr counts the lines read for the current output file.


>>     open(OUT, '>' . $out . $suf++)           # open next output file
>>       or die "Unable to open $out: $!\n";
>>     $curr = 1;                               # reset current
>>   }
>>
>>   print OUT $_;                              # print this line
>> } # while (<IN>)
>>
>> __END__
>
>

David Seay
http://www.mastercall.com/g-s/



