
Re: [MacPerl-AnyPerl] Regular Expression Problem



Quoting Richard Gordon <maccgi@bellsouth.net>:

> I need to develop a regular expression that will embed html links in 
> some text that contains extensive cross references. For example:
> 
> See also blue; green; red.
> 
> needs to become:
> 
> See also <A HREF="/cgi-bin/myscript.pl?blue">blue</A>; <A 
> HREF="/cgi-bin/myscript.pl?green">green</A>; <A 
> HREF="/cgi-bin/myscript.pl?red">red</A>.
> 
> Finding the references and dealing with the first one within each 
> group is no problem, but I am not clear about how to handle all of 
> them before moving on to find the next instance of See also. The 
> ground rules are that the passages will always start with See also 
> followed by a space and at least one reference which could have 
> multiple words. If there are multiple references, they are delimited 
> with a semi-colon and, in any case, a period terminates the entire 
> string. The way that the text file is structured, everything will be 
> on one physical line.
> 
> My thought was to first do a match for /See also [^\.]+\.+?/ and then 
> try to do something with the results with s/$&// , but it isn't clear 
> to me how to approach this? Would I need to make multiple passes or 
> what? Thanks for any help you may be able to offer.
> 
I just realized why my replies aren't showing up on the list: I was using the
wrong reply type in IMP on Linux. Anyway, I hope folks have been getting their
individual replies.

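As to the above, for more general edification, my suggestion is to use the /e 
modifier on the substitution, which evaluates the replacement side as Perl
code, so it can call a sub that builds the links: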

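#!perl -w
use strict;

# assumption: the name of the text file is given on the command line
my $file = shift or die "usage: $0 textfile\n";
open(INF, $file) or die "Can't open $file: $!";
$/ = undef;          # slurp mode: read the whole file in one go
my $text = <INF>;
close INF;

# [^.]+ stops at the first period, so every "See also" passage is
# handled on its own even when several share one physical line.
$text =~ s/See\salso\s([^.]+)\./new_refs($1).'.'/ge;
print $text;

sub new_refs {
   my $oldstr = shift;

   # split on the semi-colon plus any whitespace that follows it
   my @links = split(/;\s*/, $oldstr);
   my @anchors;
   foreach my $link (@links) {
      push @anchors, "<A HREF=\"/cgi-bin/myscript.pl?$link\">$link</A>";
   }
   # put the "; " delimiters back between the links
   return "See also " . join("; ", @anchors);
}

__END__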

This is rough and ready, but it works. Arved
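
For anyone who wants to try the /e idea without a data file, here is a minimal
sketch that runs on a made-up line holding two "See also" passages. The sample
text, the sub name link_refs, and the %20 escaping of spaces in multi-word
references are my assumptions, not part of the original question; adjust them
to whatever myscript.pl actually expects.

```perl
#!perl -w
use strict;

# Turn the text between "See also " and the closing period into links.
# link_refs is a hypothetical helper name used only for this sketch.
sub link_refs {
    my ($refs) = @_;
    my @anchors;
    foreach my $ref (split /;\s*/, $refs) {
        # assumption: spaces in multi-word references are escaped as %20
        (my $query = $ref) =~ s/ /%20/g;
        push @anchors, qq{<A HREF="/cgi-bin/myscript.pl?$query">$ref</A>};
    }
    return join '; ', @anchors;
}

# made-up sample line with two separate "See also" passages
my $line = 'Colors. See also blue; green; red. Shapes. See also round peg.';

# /g finds every passage, /e runs the replacement as Perl code
(my $html = $line) =~ s/See also ([^.]+)\./'See also ' . link_refs($1) . '.'/ge;
print "$html\n";
```

Because /g carries on after the text it has just replaced, the periods inside
"myscript.pl" in the new links are never rescanned, so one pass is enough.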

