Re: [FWP] Interpreting a columnar text file.



>>>>> "Chaim" == Chaim Frenkel <chaimf@pobox.com> writes:

Chaim> Once the regexp matches and a substitution is done, isn't there an
Chaim> implicit \G in there? Can the regexp backtrack behind the replacement?

No, it's a *drift forward* that I'm worried about.

Suppose I want to take a series of words and spaces and bracket each
word, tossing spaces:

    $_ = "this is a series of     words";
    s/\s*(\w+)/[$1]/g;
    print;

    [this][is][a][series][of][words]

But that's a bad regex, because I didn't think about erroneous data:

    $_ = "this is, a series of words!";
    s/\s*(\w+)/[$1]/g;
    print;

    [this][is],[a][series][of][words]!

See?  The stray comma and exclamation point are still in there.  Perl
was free to skip ahead and find the later matches, even though
unanticipated data was present.  By adding \G:

    $_ = "this is, a series of words!";
    s/\G\s*(\w+)/[$1]/g;
    print;

    [this][is], a series of words!

I now get far fewer substitutions than before (the matching stops dead
at the comma), making it much more obvious that something is wrong
with the data.
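
To put a number on "far fewer", here is a rough sketch that leans on
the fact that s///g returns the number of substitutions it made.  The
two sub names are made up for this illustration; the patterns are the
same ones as above.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only: each returns the number
# of substitutions made, plus the modified copy of the text.
sub bracket_drifting {
    my ($text) = @_;
    # No \G: the regex may skip over junk to find the next word.
    my $count = ($text =~ s/\s*(\w+)/[$1]/g) || 0;
    return ($count, $text);
}

sub bracket_anchored {
    my ($text) = @_;
    # With \G: each match must start exactly where the last one ended.
    my $count = ($text =~ s/\G\s*(\w+)/[$1]/g) || 0;
    return ($count, $text);
}

my $input = "this is, a series of words!";

my ($n_drift,  $drifted)  = bracket_drifting($input);
my ($n_anchor, $anchored) = bracket_anchored($input);

print "without \\G: $n_drift substitutions: $drifted\n";   # 6
print "with \\G:    $n_anchor substitutions: $anchored\n"; # 2
```

Six versus two: the anchored version gives up as soon as it hits the
comma instead of quietly carrying on.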

I use \G whenever I want to make sure that a /g match accounts for
*EVERY* single character in the string, rather than drifting forward
to pick and choose what it wants.
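
If you want the program to complain rather than just stop early, one
possible approach (a sketch, not the only way to do it) is to walk the
string with m/\G.../gc and then check how far pos() got.  The sub name
and the error message here are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: bracket each word, but die if the input holds
# anything other than words and whitespace.
sub bracket_words_strict {
    my ($text) = @_;
    my $out = '';

    # /c keeps pos() in place when a match fails, so we can inspect it.
    while ($text =~ /\G\s*(\w+)/gc) {
        $out .= "[$1]";
    }

    # Whatever is left after the last good match must be whitespace only.
    my $stopped = pos($text) // 0;
    my $rest    = substr($text, $stopped);
    die "unexpected data at offset $stopped\n" if $rest =~ /\S/;

    return $out;
}

print bracket_words_strict("this is a series of     words"), "\n";

# The bad input dies at the comma instead of being half-processed.
my $result = eval { bracket_words_strict("this is, a series of words!") };
print "rejected: $@" unless defined $result;
```

The clean string comes back fully bracketed; the one with the comma is
rejected at offset 7, right after "this is".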

-- 
Name: Randal L. Schwartz / Stonehenge Consulting Services (503)777-0095
Keywords: Perl training, UNIX[tm] consulting, video production, skiing, flying
Email: <merlyn@stonehenge.com> Snail: (Call) PGP-Key: (finger merlyn@teleport.com)
Web: <A HREF="http://www.stonehenge.com/merlyn/">My Home Page!</A>
Quote: "I'm telling you, if I could have five lines in my .sig, I would!" -- me
