
Re: [FWP] Matching regexp's at a stream of data?



On Wed, Aug 02, 2000 at 12:06:37AM -0700, Yitzchak Scott-Thoennes wrote:
> In article <20000802072049.A15975@neopoly.de>,
> Sven Neuhaus <sn@neopoly.de> wrote:
> > The problem I'm facing is finding a good way to match
> > (possibly many) regexps against a stream. The stream may be very 
> > large so I want to start matching before the end of the stream
> > has been reached.
> > To make this easier, every regexp has a parameter that specifies 
> > the maximum number of bytes to be matched.
> > 
> > Any clever ideas how to code this in perl in an elegant and
> > fast manner?
> 
> Something roughly like this: (presupposing a get_chars function
> that requests a number of chars but may return less than that,
> and that returns an eof flag either on the read that got the
> last char(s) of the input or on the (empty) read afterward).
> 
> $maxchars = 200;  # max chars used by any of the regexen
> $buffer = '';
> $eof = 0;
> 
> until ($eof && $buffer eq '') {
>    $buffer .= get_chars($maxchars * 2 - length $buffer, $eof) unless $eof;
> 
>    if ($eof || length $buffer >= $maxchars) {
> 
>       $buffer =~ s/from1/to1/g;
>       $buffer =~ s/from2/to2/g;
>       ...
> 
>       put_chars(substr($buffer, 0, ($eof ? $maxchars*2 : -$maxchars), ''));
>    }
> }
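
For anyone who wants to run the quoted approach, here is a minimal
self-contained sketch. The in-memory input, the single sample pattern,
and these particular bodies for get_chars and put_chars are assumptions
made up for illustration; the post above only presupposes that such
functions exist.

```perl
use strict;
use warnings;

# Assumed sample data, not from the thread.
my $maxchars = 5;                          # longest text any pattern may match
my $input    = "foo bar " x 5 . "foo";
my $output   = '';
open my $in, '<', \$input or die "open: $!";

# Hypothetical get_chars: returns up to $want chars and sets the
# caller's eof flag (second argument) through @_ aliasing.
sub get_chars {
    my ($want) = @_;
    return '' if $want <= 0;
    my $got = read($in, my $chunk, $want);
    die "read: $!" unless defined $got;
    $_[1] = eof($in) ? 1 : 0;
    return $chunk;
}

# Hypothetical put_chars: collects the processed text.
sub put_chars { $output .= $_[0] }

my ($buffer, $eof) = ('', 0);
until ($eof && $buffer eq '') {
    $buffer .= get_chars($maxchars * 2 - length($buffer), $eof) unless $eof;

    if ($eof || length($buffer) >= $maxchars) {
        $buffer =~ s/foo/FOO/g;

        # Hold back the last $maxchars chars, since a match may still
        # be incomplete there; at eof flush everything.
        my $keep = $eof ? 0 : $maxchars;
        put_chars(substr($buffer, 0, length($buffer) - $keep, ''));
    }
}

print "$output\n";
```

The held-back tail is scanned again on the next pass, so this only
behaves well when a replacement cannot itself be matched by one of the
patterns.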


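local $_;
sysread $fh => $_, $maxchars or return;       # Fill the initial window.
do {{s/^pattern1/replacement1/ and next;      # Note the anchor.
     s/^pattern2/replacement2/ and next;      # The doubled braces let
     ...                                      # "next" work in do-while.
}} while defined substr $_ => 0, 1, "" and    # Drop the first char, then
         (sysread $fh => $_, 1, length($_) or length);   # append one more.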
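
A runnable variation on the same sliding-window idea, as a rough
sketch only. It differs from the snippet above in a few labelled ways:
it uses read on an in-memory handle instead of sysread, it writes
replacements to an output string rather than substituting in place,
and the rules and sample text are invented for the example.

```perl
use strict;
use warnings;

# Assumed sample data and rules, not from the thread.
my $maxchars = 8;                        # longest text any pattern may match
my $input    = "a cat and a dog sat";
my $output   = '';
open my $fh, '<', \$input or die "open: $!";

my @rules = (
    [ qr/^cat/ => 'feline' ],            # every pattern is anchored
    [ qr/^dog/ => 'canine' ],
);

my $window = '';
while (1) {
    # Top the window up to $maxchars by appending at its end.
    my $want = $maxchars - length $window;
    if ($want > 0) {
        my $got = read($fh, $window, $want, length $window);
        die "read: $!" unless defined $got;
    }
    last if $window eq '';               # input exhausted and window drained

    # Try each rule at the front of the window; the first hit wins.
    my $matched = 0;
    for my $rule (@rules) {
        my ($re, $to) = @$rule;
        if ($window =~ s/$re//) {        # remove the matched text
            $output .= $to;              # and emit its replacement
            $matched = 1;
            last;
        }
    }

    # No rule matched here: pass one char through and slide on.
    $output .= substr($window, 0, 1, '') unless $matched;
}

print "$output\n";
```

Because every pattern is anchored and the window always holds up to
$maxchars chars, each stream position is tested exactly once. A pattern
that can match the empty string would loop forever in this sketch.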



Abigail
