
Re: [MacPerl] Searching a VERY large text file



>>Is there some way of compiling MacPerl scripts into machine language or
>>converting them into C/C++ routines that I could then read into
>>CodeWarrior... Maybe this would increase speed - then again maybe not??
>
>On Thu, 5 Feb 1998 10:36:55 +0000, Bart Lateur replied:
>
>I think that, if you read in the data via sysread/read in blocks of,
>say, 32k, and scan those, you might get a decent speed increase.
>
>But the code won't look as nice: you have to consider (and deal with)
>the possibility of missing a possible match on the edge between two
>blocks.
>
>	Bart.
>


I agree with Bart.  The following subroutine shows how I'd code the search.
The 'else' branch makes consecutive blocks overlap by roughly the length of
the search text, which avoids the problem of missing a possible match on
the edge between two blocks.


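sub search_file {
  # $pathname MUST BE SET BY THE CALLER BEFORE THIS SUBROUTINE IS CALLED
  $find_text="Put text to search here.";
  $length_to_read=32768; # READ 32K AT A TIME -- INCREASE IF YOU HAVE THE MEMORY
  $find_this=lc($find_text);
  $overlap=length($find_text) - 1; # BYTES CARRIED OVER BETWEEN BLOCKS
  $read_results="";  # CURRENT WINDOW: CARRIED-OVER TAIL PLUS THE NEWEST BLOCK
  $file_pos=0;       # FILE OFFSET OF THE FIRST BYTE IN $read_results
  open(CHECKFILE, $pathname) || die "Could not open $pathname.\n";
  binmode(CHECKFILE);
  # THE 4TH ARGUMENT OF read() IS AN OFFSET INTO THE BUFFER, NOT INTO THE FILE,
  # SO EACH NEW BLOCK IS APPENDED AFTER THE TAIL KEPT FROM THE PREVIOUS ONE
  while (read(CHECKFILE, $read_results, $length_to_read, length($read_results))){
     # index() ON LOWERCASED COPIES GIVES A LITERAL, CASE-INSENSITIVE MATCH
     $i=index(lc($read_results),$find_this);
     if ($i >= 0) {
        $found_it++;
        $found_loc= $file_pos + $i + 1;
        print "Found $find_text at byte $found_loc in... \"$pathname\" \n";
        last;
     } else {
        # KEEP ONLY THE LAST $overlap BYTES SO A MATCH SPANNING TWO BLOCKS IS SEEN
        $drop=length($read_results) - $overlap;
        if ($drop > 0) {
           $file_pos += $drop;
           $read_results=substr($read_results, $drop);
        }
     }
  }
  close(CHECKFILE);
}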
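
If you want to check the block-edge handling on its own, here is a minimal
sketch of the same idea as a function that takes its inputs as arguments and
returns the position instead of printing it. The name find_in_file, its
argument order, and the convention of returning 0 for "not found" are my own
assumptions for illustration; it uses sysread, as Bart suggested, and treats
the search text literally rather than as a pattern.

```perl
use strict;

# Hypothetical helper (name and interface are assumptions, not part of MacPerl):
# returns the 1-based byte position of the first case-insensitive match of
# $needle in the file at $path, or 0 if there is no match.
sub find_in_file {
    my ($path, $needle, $block_size) = @_;
    $block_size ||= 32768;
    my $want    = lc($needle);
    my $overlap = length($want) - 1;   # longest partial match a window can end with
    my $window  = "";                  # carried-over tail plus the newest block
    my $base    = 0;                   # file offset of the first byte of $window
    local *FH;
    open(FH, $path) || die "Could not open $path: $!\n";
    binmode(FH);
    # The 4th argument of sysread is an offset into $window, so this appends.
    while (sysread(FH, $window, $block_size, length($window))) {
        my $i = index(lc($window), $want);
        if ($i >= 0) {
            close(FH);
            return $base + $i + 1;
        }
        # Drop everything except the tail that could still start a match.
        my $drop = length($window) - $overlap;
        if ($drop > 0) {
            $base  += $drop;
            $window = substr($window, $drop);
        }
    }
    close(FH);
    return 0;
}
```

Calling it with a deliberately tiny block size, for example
find_in_file($pathname, "some text", 4), is an easy way to force the match to
straddle block edges and confirm that it is still found.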


David Seay
http://www.mastercall.com/g-s


