
Re: [FWP] Matching regexp's at a stream of data?



In article <20000802072049.A15975@neopoly.de>,
Sven Neuhaus <sn@neopoly.de> wrote:
> The problem I'm facing is finding a good way to match
> (possibly many) regexps against a stream. The stream may be very 
> large so I want to start matching before the end of the stream
> has been reached.
> To make this easier, every regexp has a parameter that specifies 
> the maximum number of bytes to be matched.
> 
> Any clever ideas how to code this in Perl in an elegant and
> fast manner?

Something roughly like this (presupposing a get_chars function
that requests a number of chars but may return fewer, and that
sets the eof flag passed to it either on the read that gets the
last char(s) of the input or on the (empty) read afterward):

$maxchars = 200;  # max chars used by any of the regexen
$buffer = '';
$eof = 0;

until ($eof && $buffer eq '') {
   $buffer .= get_chars($maxchars * 2 - length $buffer, $eof) unless $eof;

   if ($eof || length $buffer >= $maxchars) {

      $buffer =~ s/from1/to1/g;
      $buffer =~ s/from2/to2/g;
      # ...

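      # 4-arg substr removes what it returns: emit all but the last
      # $maxchars chars (at eof, up to $maxchars*2 chars per pass).
      put_chars(substr($buffer, 0, ($eof ? $maxchars*2 : -$maxchars), ''));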
   }
}

This is completely untested and obviously could use some beautification.
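
For anyone who wants to try the idea, here is a minimal sketch of the
same loop with the presupposed helpers filled in. Everything beyond the
loop itself is an assumption made for illustration: get_chars and
put_chars take a filehandle here, get_chars sets eof only on the empty
read after the last chars, filter_stream is a hypothetical wrapper name,
and the substitutions are passed in as code refs that edit $_[0] in
place. It also guards against the buffer shrinking below $maxchars
after a substitution.

```perl
use strict;
use warnings;

# Hypothetical helper: read up to $want chars from $fh. The third
# argument is the caller's eof flag, set through the @_ alias when
# a read returns nothing.
sub get_chars {
    my ($fh, $want) = @_;
    my $got = read($fh, my $chunk, $want);
    die "read failed: $!" unless defined $got;
    $_[2] = 1 if $got == 0;
    return $chunk;
}

# Hypothetical helper: write already-processed text to $fh.
sub put_chars {
    my ($fh, $text) = @_;
    print {$fh} $text;
}

# Hypothetical wrapper around the loop: @subs are code refs that
# modify their first argument in place.
sub filter_stream {
    my ($in, $out, $maxchars, @subs) = @_;
    my ($buffer, $eof) = ('', 0);

    until ($eof && $buffer eq '') {
        $buffer .= get_chars($in, $maxchars * 2 - length($buffer), $eof)
            unless $eof;
        next unless $eof || length($buffer) >= $maxchars;

        $_->($buffer) for @subs;

        # Hold back the last $maxchars chars so a match that straddles
        # the next read is still seen whole; at eof, flush everything.
        my $emit = length($buffer) - ($eof ? 0 : $maxchars);
        put_chars($out, substr($buffer, 0, $emit, '')) if $emit > 0;
    }
}

# Usage: filter an in-memory "stream", writing the result to STDOUT.
my $demo = "xx foo yy foo zz\n";
open my $demo_in, '<', \$demo or die "open failed: $!";
filter_stream($demo_in, \*STDOUT, 4, sub { $_[0] =~ s/foo/bar/g });
```

One thing to keep in mind with this approach: the held-back tail has
already been through the substitutions, so a pattern that can match its
own replacement text may fire again on the next pass.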
