
Re: [MacPerl] poets mix problems



Thanks again. I don't know if you got my first response; the server did not
seem to accept any messages yesterday.

Michael

> From: Arved Sandstrom <Arved_37@chebucto.ns.ca>
> Date: Sat, 26 Feb 2000 23:30:28 -0400
> To: miku <miku@onlinehome.de>
> Cc: Mac Perl <macperl@macperl.org>
> Subject: Re: [MacPerl] poets mix problems
> 
> The 'perlre' and 'perlop' manpages are the places to start. Chapter 6 in the
> Perl Cookbook is also excellent.
> 
> For example, Recipe 6.2 in the latter, "Matching Letters", is probably close
> to what you actually want. I don't wish to plagiarize, but in brief, you use
> the locale pragma ('use locale'; read 'perllocale'), and employ an RE like
> 
> $string =~ /^[^\W\d_]+$/
> 
> \w matches an alphabetic, a digit, or underscore (_). \W is everything else.
> So to get alphabetics, we specify that we do NOT want the "everything else",
> or digits, or the underscore. The _first_ ^ and the $ anchor the match to
> the beginning and end of the string.
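As a minimal sketch of the letters-only pattern above: the helper name
is_all_letters is made up for illustration, and the sample words are plain
ASCII so the result does not depend on any locale setting.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string consists of one or more letters only.
sub is_all_letters {
    my ($string) = @_;
    # [^\W\d_] = a \w character that is neither a digit nor an underscore
    return $string =~ /^[^\W\d_]+$/ ? 1 : 0;
}

for my $word ('poet', 'poet42', 'snake_case', '') {
    printf "%-14s %s\n", "'$word'",
        is_all_letters($word) ? 'letters only' : 'rejected';
}
```

Note that $ also matches just before a trailing newline, so "poet\n" would be
accepted; use \z instead of $ if that matters.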
> 
> I'm not sure I understand why you would want a pattern that matches every
> character in your character set. :-)
> 
> In terms of extracting following strings, if you have 'wholetext' in
> variable $wholetext, and 'starting_string' in $starting_string, then the
> statement
> 
> ($matched) = ($wholetext =~ /\Q$starting_string\E(.*)/);
> 
> will return everything after 'starting_string' in variable $matched. The
> \Q...\E quotes any metacharacters (such as ".") in $starting_string, so it
> is matched as plain text. I have deliberately left the match greedy; you
> might wish to change that. Note that "." does not match a newline unless
> you add the 's' modifier. If you want to retrieve the following match for
> every occurrence of 'starting_string', use the 'g' modifier.
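A rough sketch of the extraction with the 'g' modifier, using \Q...\E (see
'perlre') so that a "." in the starting string is taken literally. The sample
text and the variable names $rest and @next_chars are invented for
illustration.

```perl
use strict;
use warnings;

# Made-up sample data; 'axb=' is there to show what an unquoted '.' would hit.
my $wholetext       = 'a.b=1; axb=2; a.b=3;';
my $starting_string = 'a.b=';

# Everything after the first literal occurrence (greedy, /s lets . match newlines).
my ($rest) = $wholetext =~ /\Q$starting_string\E(.*)/s;

# With /g in list context: the single character after each literal occurrence.
my @next_chars = $wholetext =~ /\Q$starting_string\E(.)/gs;

print "rest:       $rest\n";
print "next chars: @next_chars\n";
```

Without the \Q...\E, the "." would act as a wildcard and 'axb=' would be
counted as an occurrence too.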
> 
> Theoretically, proper use of 'locale' and POSIX will account for Unicode,
> and your REs should work. Mind you, some concepts like \b for a word
> boundary may not exist for a given script.
> 
> Arved Sandstrom
> 
> At 10:47 PM 2/26/00 +0200, miku wrote:
> 
> [ Brutal snippage ]
> 
>> More to the point, my problem is: if I do a pattern search, non-letter
>> non-number characters like "." within the starting string might be
>> interpreted as wildcards or other embedded options/commands
>> (meta-characters). How can I make the search interpret its pattern as a
>> plain-text string that might contain *all* 256 characters of my character
>> set? And how can I construe and apply a mapping function that extracts all
>> single characters following the starting string in "wholetext"? And, for
>> future expansion: assuming a unicode input: how to deal with, for example,
>> the large variety of Chinese characters? In which ways would the script
>> probably have to be adapted?
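One way to search for a plain-text string without involving the regex engine
at all is the built-in index() together with substr(). This is only a sketch;
the helper name chars_after and the sample sentence are assumptions made up
for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: the character directly following each literal
# occurrence of $needle in $haystack. No character in $needle is special.
sub chars_after {
    my ($haystack, $needle) = @_;
    my @found;
    return @found if $needle eq '';
    my $pos = 0;
    while (($pos = index($haystack, $needle, $pos)) >= 0) {
        my $next = $pos + length $needle;
        push @found, substr($haystack, $next, 1) if $next < length $haystack;
        $pos++;    # advance by one so overlapping occurrences are found too
    }
    return @found;
}

my @followers = chars_after('the cat. the dog. then', 'the');
print join('|', @followers), "\n";
```

Because index() compares characters literally, a starting string such as
'a.b' or '(*' needs no quoting here.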
>> 
> 
> 
> 
> # ===== Want to unsubscribe from this list?
> # ===== Send mail with body "unsubscribe" to macperl-request@macperl.org
> 
> 

