Re: [MacPerl-AnyPerl] Regular Expression Problem
At 16:50 -0400 4/29/1999, Ronald J. Kimball wrote:
A more efficient regex would use a negated character class with greedy
matching, as in:
s/See also ([^.]+)\./ whatever /eg;
Ronald
Campaign to Stamp Out Needless Use of Non-Greedy Matching
:)
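To make the comparison concrete, here is a minimal sketch that runs both styles side by side. The sample string is made up for illustration and is not taken from the dictionary file. On input like this the two substitutions give the same result; the negated class just gets there without testing for the period after every character. One real difference to keep in mind: without /s, "." does not match a newline, while [^.] does, which matters once a whole file is slurped into one string.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical sample text, not from the real dictionary file.
my $sample = "See also alpha; beta. See also gamma.";

# Non-greedy: .+? grows one character at a time, checking for "\." each step.
(my $lazy = $sample) =~ s/See also (.+?)\./[$1]/g;

# Negated class: [^.]+ takes everything up to the next period in one go.
(my $negated = $sample) =~ s/See also ([^.]+)\./[$1]/g;

# Both lines print: [alpha; beta] [gamma]
print "$lazy\n$negated\n";
```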
I did run into a greed-related problem, along with a few others. What I wound up with follows. I don't know about efficiency: it won't even run on my Mac because it exhausts memory, but it blows through a 5 MB file on Solaris in less than 2 seconds and seems to do exactly what I wanted:
#!/usr/bin/perl -w
######
# Declare variables:
$/ = undef ;   # slurp mode: <IN> reads the whole file as one string
$indir = "/home/sbl-home/EPubs/d/" ;
$outdir = "/home/sbl-home/EPubs/d/" ;
$infile = "dictionary2.txt" ;
$outfile = "dictionary.txt" ;
######
# Begin program:
$input = $indir.$infile;
$output = $outdir.$outfile;
open(IN, "<$input") ||
    die "Can't open $input: $!\n";
$text = <IN>;
close(IN);
open(OUT, ">$output") ||
    die "Can't open $output: $!\n";
select(OUT);
$text =~ s/<I>See\salso<\/I>\s([^.]+)\./&see_also_refs($1).'.'/ge;
$text =~ s/<I>See<\/I>\s([^.]+)\./&see_refs($1).'.'/ge;
print $text;
######
# Subroutines
sub see_also_refs {
    my $oldstr = shift;
    my @words = split(/;\s?/, $oldstr);
    my $newstr = "<I>See also</I> ";
    foreach my $word (@words) {
        my $href1 = "term=";
        my $href2 = "&case=i";
        my $href = "$href1$word$href2";
        # Percent-encode everything except the characters listed in the class.
        $href =~ s/([^=&a-zA-Z0-9_\-.,])/uc sprintf("%%%02x",ord($1))/eg;
        $newstr .= '<A HREF="/cgi-bin/SBL/searchdict.pl?'.$href.'">'.$word.'</A>; ';
    }
    # Each link appends "; ", so strip both characters after the last one.
    $newstr =~ s/; $//;
    $newstr;
}
sub see_refs {
    my $oldstr = shift;
    my @words = split(/;\s?/, $oldstr);
    my $newstr = "<I>See</I> ";
    foreach my $word (@words) {
        my $href1 = "term=";
        my $href2 = "&case=i";
        my $href = "$href1$word$href2";
        # Percent-encode everything except the characters listed in the class.
        $href =~ s/([^=&a-zA-Z0-9_\-.,])/uc sprintf("%%%02x",ord($1))/eg;
        $newstr .= '<A HREF="/cgi-bin/SBL/searchdict.pl?'.$href.'">'.$word.'</A>; ';
    }
    # Each link appends "; ", so strip both characters after the last one.
    $newstr =~ s/; $//;
    $newstr;
}
######
# End program
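Since the two subroutines differ only in their label, they could probably be folded into one. What follows is only a sketch under a couple of assumptions: make_refs is a hypothetical name that does not appear in the script, and the sample terms are invented. It escapes just the term rather than the whole query string, so an "&" or "=" inside a term would get encoded too, and it uses join so the "; " separator lands only between links.

```perl
#!/usr/bin/perl -w
use strict;

# make_refs is a hypothetical helper, not part of the script above.
# $label is "See also" or "See"; $oldstr is the semicolon-separated term list.
sub make_refs {
    my ($label, $oldstr) = @_;
    my @links;
    foreach my $word (split /;\s?/, $oldstr) {
        # Escape only the term, so "&" or "=" in it cannot corrupt the query.
        (my $term = $word) =~ s/([^a-zA-Z0-9_\-.,])/sprintf("%%%02X", ord($1))/eg;
        push @links,
            qq{<A HREF="/cgi-bin/SBL/searchdict.pl?term=$term&case=i">$word</A>};
    }
    # join puts "; " between links only, so no trailing cleanup is needed.
    return "<I>$label</I> " . join('; ', @links);
}

# Invented sample terms, for illustration only.
print make_refs('See also', 'Dead Sea Scrolls; Qumran'), "\n";
```

The two substitutions would then call make_refs('See also', $1) and make_refs('See', $1) in place of the separate subroutines.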
Richard Gordon
--------------------
Gordon Consulting & Design
Database Design/Scripting Languages
mailto://maccgi@bellsouth.net
770.565.8267