
Re: [MacPerl] bug in MIME::QuotedPrint ?



summary:
MIME does not really care about character interpretation between platforms.
A solution is possible, though, by converting from the Mac to the ISO character set.

My simple and, I think, worthwhile idea is to have a mechanism for
sending non-US-ASCII characters through MacPerl-generated email.

At 8:15 -0500 on 02.02.2000, Paul Schinder wrote:
>At 11:31 AM +0100 2/2/00, Axel Rose wrote:
>>...
>>
>>A German umlaut "ä" happens to be 0xe4 in the ISO-8859-1 charset
>>(also Latin-1, Windows) but is 0x8a in the Mac charset and
>>certainly something weird in EBCDIC.
>>
>>But the whole point of MIME encoding is to transport such characters
>>between different systems.
>
>The whole point of MIME is to safely transport bytes with the high bit set through 7-bit transport mechanisms such as email.  So far as I know, there is no concept of transporting "characters between different systems".  It's a way of making sure that *bytes* arrive intact.

RFC 2045 says
   The term "character set" is used in MIME to refer to a method of
   converting a sequence of octets into a sequence of characters.  Note
   that unconditional and unambiguous conversion in the other direction
   is not required, in that not all characters may be representable by a
   given character set and a character set may provide more than one
   sequence of octets to represent a particular sequence of characters.

So please forgive me for saying
>>.. an "ä" has to be encoded into "=E4"
I thought that MIME did care about such things, because that was the
impression I got from my daily use of Mac email programs. By using MIME
they do just what I want: transparent use of 8-bit characters between
platforms.

After reading more of the RFC I found
Content-Type: text/plain; charset=iso-8859-1
but I couldn't find a charset definition for mac-roman.
This suggests a possible explanation:

8-bit texts are converted to the ISO charset first and encoded afterwards.

That gives me the solution I need.
I will convert my mail texts to the ISO charset and then encode them with
the MIME module. No change to existing modules is necessary.
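
A minimal sketch of that two-step approach, assuming only the German
special characters need mapping. The helper name mac_to_iso and its
seven-entry table are illustrative and not part of any module; a complete
solution would need the full Mac Roman to ISO-8859-1 table.

```perl
#!perl -w
use strict;
use MIME::QuotedPrint;

# Hypothetical helper (not part of MIME::QuotedPrint): translate the
# German special characters from their Mac Roman byte values to the
# ISO-8859-1 ones. All other bytes pass through unchanged.
#   Mac Roman:   Ä=80 Ö=85 Ü=86 ä=8A ö=9A ü=9F ß=A7
#   ISO-8859-1:  Ä=C4 Ö=D6 Ü=DC ä=E4 ö=F6 ü=FC ß=DF
sub mac_to_iso {
    my ($text) = @_;
    $text =~ tr/\x80\x85\x86\x8A\x9A\x9F\xA7/\xC4\xD6\xDC\xE4\xF6\xFC\xDF/;
    return $text;
}

# "Grüße" as Mac Roman bytes, written with escapes so the script does
# not depend on the encoding of the source file itself.
my $mac_text = "Gr\x9F\xA7e\n";

# Step 1: convert to the ISO charset. Step 2: encode for transport.
my $encoded = encode_qp( mac_to_iso($mac_text) );
print $encoded;    # prints "Gr=FC=DFe"
```

The mail itself would then be sent with the headers
Content-Type: text/plain; charset=iso-8859-1 and
Content-Transfer-Encoding: quoted-printable, so that the receiving side
knows how to interpret the decoded bytes.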


>I thought both MIME and HTML require the use of a specific character set

For traditional HTML I would say no; e.g. a ß should be the
same everywhere, just like german_double_s in PostScript.

>>MacPerl's ord() function is not portable with 8bit characters.
>>I have no idea whether this is known nor do I have a workaround yet.
>
>Nonsense.  MacPerl's ord() function is doing exactly what you've told it to do.  It has no idea how a certain byte appears on your screen. Nor *can* it know, since on a Mac you can so easily change fonts.

I disagree. ord() gives different results on different platforms
for the same character. My point was about portability.

>  If this really troubles you, start using an ISO font on your Mac.  IIRC several have been mentioned on this list in the past.

I would like to know how to do this on 8.x systems.
PostScript fonts may have their own encoding vector, true. But this
affects printing only. A keyboard input will produce the
same code every time.

(Don't tell me about MacOSX Server systems: they refuse to accept
even a single quote from a German keyboard :(()



Axel


---------------------------------------------------------------------------
>> >#!perl -w
>> >
>> >use MIME::QuotedPrint;
>> >
>> >$string = "ä";
>> >$estring = encode_qp( $string );
>> >print "encode_qp( $string ) = $estring\n";
>> >
>> >__END__
>> >
>> >Result is
>> >encode_qp( ä ) = =8A
>> >
>> >Running under Linux I get the proper result
>> >encode_qp( ä ) = =E4
>> >
>> >
>> >Looking into the source code I find
>> >...
>> >$res =~ s/([^ \t\n!-<>-~])/sprintf("=%02X", ord($1))/eg;  # rule #2,#3
>> >$res =~ s/([ \t]+)$/
>> >	join('', map { sprintf("=%02X", ord($_)) }
>> >		split('', $1)
>> >      )/egm;                        # rule #3 (encode whitespace at eol)
>>... 

----------------------------------------------------------------------
Axel Rose, Springer & Jacoby Digital GmbH & Co. KG, mailto:rose@sj.com
  pub PGP key 1024/A21CB825 E0E4 BC69 E001 96E9  2EFD 86CA 9CA1 AAC5
  "... denn alles, was entsteht, ist wert, daß es zugrunde geht ..."
