
Re: [FWP] Matching regexp's at a stream of data?



On Thu, Aug 03, 2000 at 04:43:10AM +0200, Sven Neuhaus wrote:
> On Wed, Aug 02, 2000 at 08:54:19PM -0400, abigail@foad.org wrote:
> > local $_;
> > sysread $fh => $_, $maxchars or return;
> > do {s/^pattern1/replacement1/ and next;   # Note the anchor.
> >     s/^pattern2/replacement2/ and next;
> >     ...
> > } while defined substr $_ => 0, 1, "" and
> >        (sysread $fh => $_, length, 1 or length);
> 
> That's probably too slow, isn't it?
> 
> I was thinking when I have a max match size of, say, 400 bytes, the
> algorithm looks at 600 bytes then slides the window 200 bytes further.
> There must be some overlap or you will miss some matches. Testing 
> every byte is too slow, though (haven't benchmarked it, but I'd be
> fairly surprised if it weren't).


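What makes you think that? Sure, reading 200-byte chunks means less
I/O, but you lose the anchor in the regexes, which potentially makes
them a lot slower: an anchored regex is tried at one position only,
while an unanchored one may be tried at every offset in the window.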

I would be very surprised if either method turned out to be faster than
the other regardless of the actual data and regexes involved.
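
For comparison, here is a minimal sketch of the windowed approach Sven
describes, so the two can be benchmarked against real data. It is an
illustration, not code from this thread: the name scan_windowed and its
parameters are made up, it uses buffered read() instead of sysread so
that it also works on in-memory filehandles, and it assumes that no
match is longer than $max_match bytes and that the regex has no
lookahead or end-of-string anchors.

```perl
use strict;
use warnings;

# Hypothetical helper: return [offset, text] pairs for every
# non-overlapping match of $re in the stream behind $fh.
sub scan_windowed {
    my ($fh, $re, $max_match, $chunk) = @_;
    my ($buf, $base, @hits) = ('', 0);  # $base: stream offset of $buf's first byte

    while (1) {
        # Append the next chunk to whatever was kept from the last round.
        my $got = read($fh, $buf, $chunk, length $buf);
        die "read failed: $!" unless defined $got;
        my $eof = $got == 0;

        # A match starting at or after $limit could still be cut off by
        # the end of the buffer, so it is deferred to the next round.
        my $limit = $eof ? length $buf : length($buf) - $max_match + 1;
        $limit = 0 if $limit < 0;

        my $keep_from = $limit;
        pos($buf) = 0;
        while ($buf =~ /$re/g) {
            last if $-[0] >= $limit;
            push @hits, [$base + $-[0], substr($buf, $-[0], $+[0] - $-[0])];
            # Never rescan bytes that a reported match already consumed.
            $keep_from = $+[0] if $+[0] > $keep_from;
        }

        # Slide the window: drop everything before $keep_from.
        substr($buf, 0, $keep_from, '');
        $base += $keep_from;
        last if $eof;
    }
    return @hits;
}

# Usage with made-up data and a made-up pattern.
my $data = 'xxfoo123yyfoo4zz';
open my $fh, '<', \$data or die "open: $!";
printf "%d: %s\n", @$_ for scan_windowed($fh, qr/foo\d+/, 8, 5);
```

The overlap here is $max_match - 1 bytes rather than a fixed 200; any
smaller overlap risks missing a match that straddles two chunks.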



Abigail

==== Want to unsubscribe from Fun With Perl?  Well, if you insist...
==== Send email to <fwp-request@technofile.org> with message _body_
====   unsubscribe