
Re: [FWP] innermost first parsing



On Aug 2, Randal L. Schwartz said:

>I think this is probably just as workable, and more reusable:
>
>    while ($text =~ s/
>      BEGIN
>        ((
>          (?!BEGIN|END) # not looking at BEGIN or END
>          . # inch along
>        )*)
>      END
>    /SOMETHING/xs) {
>      "do something with $1";
>    }

I know about this, but specifically avoided it because of the inch-along
factor.  I was trying the unrolling-the-loop technique (from MRE,
"Mastering Regular Expressions").

Sorry about the misleading comment, too...

As for Abigail's comment, I'm sure I could work something together for
input such as:

AAB 1 AAB 2 CCD 3 CCD

  m{
    AAB
    (
      [^AC]*
      (?:
        (?:
          A+ (?! (?<= AA ) B )
          |
          C+ (?! (?<= CC ) D )
        )
        [^AC]*
      )*
    )
    CCD
  }x

I've not tested it yet, but it seems workable, since my FIRST regex can
really be written as:

  m{
    BEGIN
    (
      [^BE]*
      (?:
        (?:
          B+ (?! (?<= B ) EGIN )
          |
          E+ (?! (?<= E ) ND )
        )
        [^BE]*
      )*
    )
    END
  }x
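
One way to try the AAB/CCD pattern on the sample line from earlier in
this message is sketched below.  It is only a sketch: it assumes the /x
flag and writes the lookbehinds as positive, (?<= AA ) and (?<= CC ), so
that a run of A's or C's is refused only when it would complete a
delimiter.  The <...> replacement marker and the @seen array are made up
for the illustration.

```perl
use strict;
use warnings;

# Sample line from the message.
my $text = 'AAB 1 AAB 2 CCD 3 CCD';

# Unrolled loop: "normal" characters are [^AC]; the "special" case is a
# run of A's or C's that does not complete AAB or CCD.
# Assumption: positive lookbehind and the /x flag.
my $innermost = qr{
    AAB
    (
      [^AC]*
      (?:
        (?:
          A+ (?! (?<= AA ) B )
          |
          C+ (?! (?<= CC ) D )
        )
        [^AC]*
      )*
    )
    CCD
}x;

# Replace the innermost block first, then keep going outward.
my @seen;
while ($text =~ s/$innermost/<$1>/) {
    push @seen, $1;
}

print "captured: [$_]\n" for @seen;   # " 2 " first, then " 1 < 2 > 3 "
print "$text\n";                      # < 1 < 2 > 3 >
```

The first pass picks up " 2 " because the attempt starting at the outer
AAB cannot get past the inner AAB, so the engine moves on to the inner
block.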

So, as for Abigail's comment, a pattern of this shape is easy to
generate for other delimiters.  I guess benchmarks are in order.  I
don't like the dot-at-a-time method, since it seems slow to me.
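
A benchmark along these lines could be set up with the core Benchmark
module.  This is a rough sketch, not a measurement: the filler text and
the names $inch and $unrolled are invented for the example, and the
unrolled pattern assumes the /x flag and positive lookbehinds.  Results
will depend on the input and the perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical input: nested blocks padded with text full of B's and E's.
my $filler = 'some BE text with Bs and Es ' x 50;
my $text   = "BEGIN $filler BEGIN inner $filler END $filler END";

# The inch-along pattern from the quoted message.
my $inch = qr{
    BEGIN
    ((?: (?!BEGIN|END) . )*)
    END
}xs;

# The unrolled pattern (assumption: positive lookbehind, /x flag).
my $unrolled = qr{
    BEGIN
    (
      [^BE]*
      (?:
        (?: B+ (?! (?<= B ) EGIN ) | E+ (?! (?<= E ) ND ) )
        [^BE]*
      )*
    )
    END
}x;

# Both patterns should capture the same innermost block.
my ($by_inch)     = $text =~ $inch;
my ($by_unrolled) = $text =~ $unrolled;
print "same capture\n" if $by_inch eq $by_unrolled;

# Run each match for at least one CPU second and print a comparison table.
cmpthese(-1, {
    inch_along => sub { my ($m) = $text =~ $inch },
    unrolled   => sub { my ($m) = $text =~ $unrolled },
});
```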

-- 
Jeff "japhy" Pinyan     japhy@pobox.com     http://www.pobox.com/~japhy/
PerlMonth - An Online Perl Magazine            http://www.perlmonth.com/
The Perl Archive - Articles, Forums, etc.    http://www.perlarchive.com/
CPAN - #1 Perl Resource  (my id:  PINYAN)        http://search.cpan.org/

