Thanks again. I don't know if you got my first response; the server did not
seem to accept any messages yesterday.

Michael

> From: Arved Sandstrom <Arved_37@chebucto.ns.ca>
> Date: Sat, 26 Feb 2000 23:30:28 -0400
> To: miku <miku@onlinehome.de>
> Cc: Mac Perl <macperl@macperl.org>
> Subject: Re: [MacPerl] poets mix problems
>
> The 'perlre' and 'perlop' manpages are the places to start. Chapter 6 in the
> Perl Cookbook is also excellent.
>
> For example Recipe 6.2 in the latter, "Matching Letters", is probably close
> to what you actually want. I don't wish to plagiarize, but in brief, you use
> the locale package (read 'perllocale'), and employ a RE like
>
> $string =~ /^[^\W\d_]+$/
>
> \w matches an alphabetic, a digit, or underscore (_). \W is everything else.
> So to get alphabetics, we specify that we do NOT want the "everything else",
> or digits, or the underscore. The _first_ ^ and the $ anchor the match to
> the beginning and end of the string.
>
> I'm not sure I understand why you would want a pattern that matches every
> character in your character set. :-)
>
> In terms of extracting following strings, if you have 'wholetext' in
> variable $wholetext, and 'starting_string' in $starting_string, then the RE
>
> ($matched) = ($wholetext =~ /$starting_string(.*)/);
>
> will return everything after 'starting_string' in variable $matched. I have
> deliberately left the match greedy; you might wish to change that, or if you
> want to retrieve each following match for all occurrences of
> 'starting_string', then you use the 'g' modifier.
>
> Theoretically proper use of 'locale' and POSIX will account for Unicode, and
> your RE's should work. Mind you, some concepts like \b for a word boundary
> may not exist for a given script.
>
> Arved Sandstrom
>
> At 10:47 PM 2/26/00 +0200, miku wrote:
>
> [ Brutal snippage ]
>
>> More to the point, my problem is: if I do a pattern search, non-letter
>> non-number characters like "." within the starting string might be
>> interpreted as wildcards or other embedded options/commands
>> (meta-characters). How can I make the search interpret its pattern as a
>> plain-text string that might contain *all* 256 characters of my character
>> set? And how can I construct and apply a mapping function that extracts all
>> single characters following the starting string in "wholetext"? And, for
>> future expansion: assuming a unicode input: how to deal with, for example,
>> the large variety of Chinese characters? In which ways would the script
>> probably have to be adapted?
>
> # ===== Want to unsubscribe from this list?
> # ===== Send mail with body "unsubscribe" to macperl-request@macperl.org
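
A note on the metacharacter question quoted above: one common way to make
Perl treat the starting string as plain text is to wrap it in \Q...\E (the
regex form of the built-in quotemeta), so that a "." in it matches only a
literal dot. The following is a minimal sketch rather than anything posted in
the thread; the helper names text_after and chars_after and the sample string
are made up for illustration, and it assumes single-byte text.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the text that follows
# the first literal occurrence of $start in $text, or undef if absent.
sub text_after {
    my ($text, $start) = @_;
    # \Q...\E quotes every regex metacharacter in $start, so "." matches
    # only a literal dot. The /s flag lets "." in (.*) cross newlines.
    my ($matched) = $text =~ /\Q$start\E(.*)/s;
    return $matched;
}

# Hypothetical helper: collect the single character that follows each
# occurrence of $start in $text (the 'g' modifier in list context).
sub chars_after {
    my ($text, $start) = @_;
    my @chars = $text =~ /\Q$start\E(.)/gs;
    return @chars;
}

# Made-up sample data: "axb" must NOT match the starting string "a.b".
my $wholetext = "a.b1 axb2 a.b3";
print text_after($wholetext, "a.b"), "\n";              # 1 axb2 a.b3
print join(",", chars_after($wholetext, "a.b")), "\n";  # 1,3
```

Without the \Q...\E, the pattern a.b would also match "axb", so the second
call would pick up the "2" as well.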