Re: [MacPerl] Recognizing Unix Line Breaks

At 6.51 97/5/5, Lasse Hillerøe Petersen wrote:
>I think this is actually a general problem: Perl handles bytes, not
>characters, so a similar problem exists with for instance Unicode
>characters. How does Perl deal with two-byte characters? (I know this is
>not MacPerl specific, but it is certainly relevant.)

It handles multibyte characters byte by byte.  A two-byte character is two
separate entities to Perl.  Jeffrey Friedl -- who wrote the regex book for
O'Reilly and lives (used to live?) in Japan, where multibyte characters are
normal -- wrote a piece in the most recent issue of The Perl Journal
concerning regexes and multibyte characters.
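
To make "byte by byte" concrete, here is a minimal sketch of what any
byte-oriented tool sees when handed two-byte text. It is written in Python
with byte strings rather than Perl, purely as an illustration; the choice of
Shift-JIS and the sample text are my own assumptions, not anything taken from
Friedl's article.

```python
import re

# Assumption: Shift-JIS as the sample two-byte encoding.
# The text is the two characters U+65E5 U+672C ("Japan").
text = "\u65e5\u672c"
raw = text.encode("shift_jis")

# A byte-oriented view counts four separate entities, not two characters.
byte_count = len(raw)
char_count = len(raw.decode("shift_jis"))

# A byte-oriented regex "." matches one byte, i.e. half of a character.
first_dot = re.match(rb".", raw, re.DOTALL).group(0)

# The trailing byte of the second character is 0x7B, the ASCII code for "{",
# so a byte-wise search reports a brace that is not in the text at all.
false_brace = b"{" in raw
real_brace = "{" in text

print(byte_count, char_count, len(first_dot), false_brace, real_brace)
```

The false match on "{" is the kind of trap that makes byte-wise regexes
awkward on such text: the pattern has to know where character boundaries
fall, or it can match in the middle of a character.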

Chris Nandor                 pudge@pobox.com                 http://pudge.net/
%PGPKey=('B76E72AD',[1024,'08 24 09 0B CE 73 CA 10  1F F7 7F 13 81 80 B6 B6'])
To me, clowns aren't funny. In fact, they're kinda scary. I've wondered where
this started, and I think it goes back to the time I went to the circus and a
clown killed my dad.
                --Jack Handey
