The 'perlre' and 'perlop' manpages are the places to start. Chapter 6 of the Perl Cookbook is also excellent. For example, Recipe 6.2 there, "Matching Letters", is probably close to what you actually want. I don't wish to plagiarize, but in brief, you enable the locale pragma with 'use locale' (read 'perllocale'), and employ a RE like

    $string =~ /^[^\W\d_]+$/

\w matches a letter, a digit, or underscore (_). \W is everything else. So to get alphabetics, we specify that we do NOT want the "everything else", or digits, or the underscore. The _first_ ^ and the $ anchor the match to the beginning and end of the string.

I'm not sure I understand why you would want a pattern that matches every character in your character set. :-)

In terms of extracting following strings, if you have 'wholetext' in variable $wholetext, and 'starting_string' in $starting_string, then the RE

    ($matched) = ($wholetext =~ /\Q$starting_string\E(.*)/);

will return everything after 'starting_string', up to the end of that line, in variable $matched. The \Q...\E pair quotes any metacharacters (such as ".") in $starting_string, so that it is matched as plain text rather than as a pattern. I have deliberately left the match greedy; you might wish to change that, or, if you want to retrieve each following match for all occurrences of 'starting_string', use the 'g' modifier.

Theoretically, proper use of 'locale' and POSIX will account for Unicode, and your REs should work. Mind you, some concepts like \b for a word boundary may not exist for a given script.

Arved Sandstrom

At 10:47 PM 2/26/00 +0200, miku wrote:

[ Brutal snippage ]

>More to the point, my problem is: if I do a pattern search, non-letter
>non-number characters like "." within the starting string might be
>interpreted as wildcards or other embedded options/commands
>(meta-characters). How can I make the search interpret its pattern as a
>plain-text string that might contain *all* 256 characters of my character
>set? And how can I construe and apply a mapping function that extracts all
>single characters following the starting string in "wholetext"?
>And, for future expansion: assuming a unicode input: how to deal with,
>for example, the large variety of Chinese characters? In which ways
>would the script probably have to be adapted?
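
A minimal sketch for the metacharacter question quoted above, offered as one possible approach rather than a definitive answer. It uses \Q...\E so that the starting string is matched as plain text, and the 'g' modifier to collect the single character following each occurrence. The sub names (text_after, chars_after, is_all_letters) and the sample data are made up for illustration. The /s modifier is an added assumption so that "." also matches newlines; drop it if matches should stop at the end of a line.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything after the first literal
# occurrence of $start in $text, or undef if $start is not found.
sub text_after {
    my ($text, $start) = @_;
    # \Q...\E quotes regex metacharacters, so "." matches only a dot.
    # /s lets "." in (.*) match newlines as well (an assumption here).
    my ($matched) = $text =~ /\Q$start\E(.*)/s;
    return $matched;
}

# Hypothetical helper: return the single character that follows each
# literal occurrence of $start in $text.
sub chars_after {
    my ($text, $start) = @_;
    # In list context, /g returns the captured group of every match.
    my @chars = $text =~ /\Q$start\E(.)/gs;
    return @chars;
}

# Letters only: \w minus digits and underscore.
# Add 'use locale' if letters should follow the current locale.
sub is_all_letters {
    my ($string) = @_;
    return $string =~ /^[^\W\d_]+$/ ? 1 : 0;
}

# Sample data, invented for this example.
my $wholetext = "a.b=1; axb=2; a.b=3";
my $start     = "a.b=";

print text_after($wholetext, $start), "\n";              # prints: 1; axb=2; a.b=3
print join(",", chars_after($wholetext, $start)), "\n";  # prints: 1,3
```

Without \Q...\E the "." in "a.b=" would act as a wildcard, so "axb=" would match too and the second line would print 1,2,3. The built-in quotemeta function does the same quoting if you would rather prepare the string before building the pattern.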