
Re: [MacPerl] bug in MIME::QuotedPrint ?



At 18.56 +0100 2000.02.03, Axel Rose wrote:
>I can follow your explanation.
>But consider an application I distribute to many Mac users in my company,
>a MacPerl standalone which pops up a standard dialog where the user may
>type in a short message.
>They use the standard font in a standard dialog.
>I can't tell them to use different fonts, and they will never accept
>different keys for the same special character, simply because that
>little "ü" is printed on a physical key of the keyboard.

It does not change the fact that ord() is quite portable.  It just has no
comprehension whatsoever of character sets.  It only knows bytes, not how
they are represented.


>>Window 1 (ProFont):
>>  print ord 'ü';  # this is option-u, u (u with umlaut)
>>Returns:
>>  159
>>
>>Window 2 (ProFontISOLatin1):
>>  print ord 'ü';  # this is option-shift-z (u with umlaut)
>>Returns:
>>  252
>
>ord() gives different results depending on font - very strange.

No, it is expected.  If it were to try to understand the character set your
font uses, that would be strange.  :)

>This only leads to the question of whether the keyboard input, the
>rendered character or something else is considered the reference.

It is the underlying byte.  When you have a hunk of text, what is stored
in memory, whether in RAM or on disk, is not rendered characters but
bytes.  ord() just tells you the decimal value of a byte, regardless of
how it is rendered on your screen.
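
A minimal sketch of that point, in plain Perl.  The one-byte strings are
built with chr() from the two values quoted above, so the result does not
depend on any font or on how this message happens to be displayed.  The
variable names are only illustrative.

```perl
use strict;
use warnings;

# Build one-byte strings from explicit values, so nothing here depends on
# the encoding of the source file or on the font used to view it.
my $byte_a = chr(159);   # the byte reported for (opt-u, u) above
my $byte_b = chr(252);   # the byte reported for (opt-shift-z) above

# ord() reports the stored value; it never asks how the byte is drawn.
printf "%d %d\n", ord($byte_a), ord($byte_b);   # prints "159 252"
```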

And again, Mac keyboard sequences like (opt-u, u) and (opt-shift-z)
always produce the same bytes, regardless of font: (opt-u, u) always
produces a byte with the decimal value 159, and (opt-shift-z) always
produces 252.  The font then renders each byte as it sees fit.  In
MacRoman, 159 is the value for u with an umlaut, so a MacRoman font shows
that character for the first key sequence.  In ISO Latin 1, the value for
u with an umlaut is 252, so an ISOLatin1 font shows the same character
for the second key sequence.
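
To see the two mappings side by side, here is a rough sketch that reads
each byte under both character sets.  It assumes a Perl that ships the
Encode module, which is not something the MacPerl discussed in this thread
had, and codepoint_for() is a made-up helper name.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: interpret one byte value under a named character
# set and return the Unicode code point it maps to.
sub codepoint_for {
    my ($charset, $byte_value) = @_;
    my $octets = chr($byte_value);       # a one-byte string
    return ord(decode($charset, $octets));
}

for my $byte (159, 252) {
    for my $charset ('MacRoman', 'iso-8859-1') {
        printf "byte %3d read as %-10s -> U+%04X\n",
            $byte, $charset, codepoint_for($charset, $byte);
    }
}
```

With these tables, byte 159 read as MacRoman and byte 252 read as
ISO-8859-1 both come out as U+00FC, the u with an umlaut.  The same bytes
read under the other character set map to something else entirely.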

Summary: keyboard sequences produce bytes, not characters.  ord() gets the
value of bytes, not characters.  Fonts use character sets to determine
which bytes are mapped to which characters.  Keyboards and ord() and most
other things do not know anything about character sets, even when they say
they do.  They just know bytes.
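
Since the thread started with MIME::QuotedPrint, here is one possible way
to act on that summary: if the receiving side expects ISO-8859-1, the
program can convert the bytes itself before encoding them.  This is only a
sketch under assumptions.  It assumes the dialog hands back MacRoman bytes,
it assumes a Perl with the Encode module and a MIME::QuotedPrint recent
enough to accept an $eol argument, and qp_from_macroman() is a made-up
name.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::QuotedPrint qw(encode_qp);

# Hypothetical helper: treat the input as MacRoman bytes, re-encode them as
# ISO-8859-1 bytes, then quoted-printable encode the result.
sub qp_from_macroman {
    my ($mac_octets) = @_;
    my $text          = decode('MacRoman', $mac_octets);
    my $latin1_octets = encode('iso-8859-1', $text);
    return encode_qp($latin1_octets, '');   # '' as $eol: no soft line breaks
}

print encode_qp(chr(159), ''), "\n";      # QP of the raw Mac byte
print qp_from_macroman(chr(159)), "\n";   # QP after converting to ISO-8859-1
```

encode_qp() itself only ever sees bytes, so the first line reflects the
Mac value (9F) and the second the ISO value (FC).  Characters that have no
ISO-8859-1 equivalent would need separate handling, which this sketch does
not attempt.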

-- 
Chris Nandor          mailto:pudge@pobox.com         http://pudge.net/
%PGPKey = ('B76E72AD', [1024, '0824090B CE73CA10  1FF77F13 8180B6B6'])
