At 11:44 PM -0800 2/17/98, Stephan Somogyi wrote:
>I have to churn through around a gigabyte of text. After having sorted
>out the basic algorithmic issues, I am now looking to optimize.
>
>One of the more time critical operations is the dual pattern-match
>that I perform on each line of text with the following:
>
>    unless (($curLine =~ /\[\d/) && ($curLine =~ /\"$\/\Z/))
>
>Since I really need to know only whether these patterns are in the
>string at all, I would prefer to use ?? rather than // patterns since,
>if I replace the // delimiters with ?? delimiters, I see a 14%
>performance improvement; unfortunately, the line of code also stops
>working thereafter. Just to make sure, I printed out $& after the line
>and it seemed to be matching the latter expression but not returning
>true.
>
>After some digging around, I discovered that ? patterns have the
>irksome need for a reset after every use. When I added the necessary
>resets, I found myself at 14% worse than when I started, ie 28% slower
>than with the ? patterns.

Put the /o modifier on the second regexp so it won't have to be
recompiled every time just in case you've changed $/:

    ($curLine =~ /\"$\/\Z/o)

(Unless you really are changing $/, that is. Even then, it might be
faster to have several regexps, one for each possible value.) This will
be a big win.

Are you sure this is the only thing you're changing? I can't see any
reason ?? should be faster than // (in fact, it should be slower).

If the second pattern matches less often, put it first so you can exit
the && chain sooner.

It might be faster to combine them into one regexp if you know the
order in which the two patterns appear:

    unless ($curLine =~ /\[\d.*\"$\/\Z/o)

 - Tim

Tim Dierks - timd@consensus.com - www.consensus.com
Director of Engineering - Consensus Development
Developer of SSL Plus: SSL 3.0 Integration Suite
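
The suggestions above (compile once with /o, test the rarer pattern first, or combine both into one regexp) could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample lines and the $sep variable are made up for the example, $/ is assumed to stay at its default for the whole run, and the record separator is spelled out through $sep rather than written inline in the pattern.

```perl
use strict;
use warnings;

# Hypothetical sample lines, each with its trailing newline as read from a file.
my @lines = (
    qq{see [1] "quoted"\n},   # both patterns present
    qq{see [x] "quoted"\n},   # no "[digit"
    qq{see [2] unquoted\n},   # no closing quote before the newline
);

# Assumption: $/ keeps its default value for the whole run, so /o is safe.
my $sep = quotemeta $/;

for my $curLine (@lines) {
    # Two separate tests; the one expected to fail more often goes first.
    my $separate = ($curLine =~ /"$sep\Z/o && $curLine =~ /\[\d/) ? 1 : 0;

    # Single combined test; valid only if "[digit" precedes the final quote.
    my $combined = ($curLine =~ /\[\d.*"$sep\Z/o) ? 1 : 0;

    print "$separate $combined\n";
}
```

Both forms should agree on every line as long as the bracketed digit always appears before the closing quote; which one is actually faster on a gigabyte of input is something to measure rather than assume.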