
Re: [MacPerl] The MacPerl Pages and book (update)



At 17:27 -0800 29/10/1997, Peter Prymmer wrote:
>I was awaiting either the "MacPerl Oddities" or perhaps the "Idioms And
>Programming Paradigms" chapter to mention something that I hadn't realized
>until this past August, when I heard Ken Lunde's presentation at the Perl
>Conference.  In it he remarked that MacPerl's \w regexp meta-character will
>match some of the high-bit characters in the Mac extended ASCII code page.
>This is perhaps worth careful emphasis, not only because of its difference
>from Unix but also in light of the use of the ISO-Latin-1 charset in CGI
>scripts, where character set rendering issues become blurry.

>P.S. a test script that exhibits this behavior is simple to construct: put
>some regular chars and option-chars into a $scalar, then examine the array
>returned by a split(/\w/,$scalar) and see where the funny chars lie.  e.g.
>
>print(join(">",split(/\w/,"string with funny chars")));
>
>(where I have avoided actual 8-bit chars to prevent accidental MIME-ification
>of this email.)

How about:
$string_with_funny_chars = join("",map {chr} (32 .. 255) ) ;
print $string_with_funny_chars,"\n";
print join("*",split(/\w/,$string_with_funny_chars)),"\n";

(that shouldn't be prone to MIME-garbling, except the "=", perhaps.)

With MacPerl 514b2 this gives stars only in place of [0-9A-Za-z_], the same
as on IRIX. \w doesn't match any character with ord() >= 128.
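
Reading stars out of a 224-character line is error-prone, so here is a variation on the snippet above that prints the matching code points directly. It is only a sketch: word_code_points is a name I made up for illustration, and what it reports for 128..255 depends on the perl build and platform it runs on.

```perl
use strict;
use warnings;

# Hypothetical helper: every code point in 32..255 whose
# one-character string matches \w on this perl.
sub word_code_points {
    return grep { chr($_) =~ /\w/ } 32 .. 255;
}

my @matches = word_code_points();
my @high    = grep { $_ >= 128 } @matches;

printf "%d code points match \\w\n", scalar @matches;
printf "matching code points >= 128: %s\n",
    @high ? join(" ", @high) : "(none)";
```

If the second line lists anything, those are the high-bit characters that \w accepts on that platform.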

However, on IRIX, if I do (stolen from perldoc POSIX):
use locale;
use POSIX;
POSIX::setlocale( &POSIX::LC_ALL, "es_AR.ISO8859-1" );

then all the ISO-8859-1 accented letters become stars. Obviously this
doesn't work with MacPerl, as the Mac doesn't use ISO-8859-1, and the
locale pragma is not supported with MacPerl, it seems.
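
For anyone who wants to repeat the Unix-side experiment, a slightly more defensive sketch follows. It checks whether setlocale() actually succeeded before counting, since the locale name from my example may not be installed on other systems. high_word_count is a made-up helper name, and the count you get depends entirely on which locales the system provides.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);

# Hypothetical helper: how many code points in 128..255 match \w
# under whatever locale is current when it is called.
sub high_word_count {
    use locale;    # lexically scoped: \w follows LC_CTYPE inside this sub
    return scalar grep { chr($_) =~ /\w/ } 128 .. 255;
}

my $wanted = "es_AR.ISO8859-1";    # assumption: may not exist on your system
my $got    = setlocale(LC_ALL, $wanted);

if (defined $got) {
    print "locale set to $got\n";
} else {
    print "could not set $wanted, keeping the current locale\n";
}

printf "%d code points >= 128 match \\w\n", high_word_count();
```

Without the defined() check, a failed setlocale() goes unnoticed and the result looks as if the locale had no effect on \w.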

So I don't quite follow what you mentioned about Ken Lunde's presentation.
As I see it, the only difference is that on Unix you can use locale to
change the behaviour of \w, whereas with MacPerl you cannot. Not that this
wouldn't be desirable, mind you.

I suppose it would require implementing a locale pragma and some changes to
POSIX to permit stuff like:
POSIX::setlocale( &POSIX::LC_ALL, "de_CH.MacCharSet" );


-Lasse


