
Re: [MacPerl] bug in MIME::QuotedPrint ?



At 11:31 AM +0100 2/2/00, Axel Rose wrote:
>Thank you Ronald and Paul for responding.
>
>You are right from your point of view.
>Nearly every 8bit character has a different representation on
>different systems with different character sets.
>
>A German umlaut "ä" happens to be 0xe4 in the ISO-8859-1 charset
>(also Latin-1, Windows) but is 0x8a in the Mac charset and
>certainly something weird in EBCDIC.
>
>But the whole point of MIME encoding is to transport such characters
>between different systems.

The whole point of MIME is to safely transport bytes with the high 
bit set through 7-bit transport mechanisms such as email.  So far as 
I know, there is no concept of transporting "characters between 
different systems".  It's a way of making sure that *bytes* arrive 
intact.

But if I ever read the RFCs, it was a long time ago, so I may be wrong.
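
To illustrate the byte-level view, here is a minimal sketch (the variable
names are mine, and it assumes MIME::QuotedPrint is installed). It encodes
the two byte values under discussion and decodes them again. The encoder
never asks which character set the byte came from.

```perl
#!perl -w
use strict;
use MIME::QuotedPrint;

# Two different byte values that both display as a-umlaut,
# depending on which character set the viewer assumes.
my $mac_byte    = "\x8A";   # a-umlaut in the Mac character set
my $latin1_byte = "\xE4";   # a-umlaut in ISO-8859-1

# encode_qp() turns each high-bit byte into "=" plus two hex digits.
my $mac_qp    = encode_qp($mac_byte);
my $latin1_qp = encode_qp($latin1_byte);

# decode_qp() gives back exactly the byte that went in.
for my $pair ([$mac_byte, $mac_qp], [$latin1_byte, $latin1_qp]) {
    my ($byte, $qp) = @$pair;
    my $decoded = decode_qp($qp);
    printf "byte 0x%02X round-trips to byte 0x%02X\n",
        ord($byte), ord($decoded);
}
```

Depending on the module version, the encoded string may carry a trailing
soft line break, so the sketch prints only the byte values.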

>
>As far as I understand, to stick with my example, an "ä" has to be
>encoded into "=E4" (same as the HTML entity "ä").

To know that for sure, you're going to have to read the appropriate 
RFCs.  I thought both MIME and HTML require the use of a specific 
character set, one of the ISO sets.  If the RFCs for MIME require 
character set conversion, then bring it up with the author of 
MIME::QuotedPrint.  Providing the author with a patch would be the 
fastest way to fix this, if indeed it needs to be fixed.  I'm no 
expert on this stuff, but my recollection is that there's nothing to 
fix here.
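
If you do want "=E4" to come out on the Mac, the conversion would have to
happen before the string reaches encode_qp().  A rough sketch, assuming a
hand-built table that covers only the German letters (mac_to_latin1 is a
hypothetical helper, not part of MIME::QuotedPrint):

```perl
#!perl -w
use strict;
use MIME::QuotedPrint;

# Hypothetical helper: map a few Mac charset bytes to ISO-8859-1.
# Only the German letters are covered; every other byte above 0x7F
# passes through unchanged and would still be wrong on the far end.
sub mac_to_latin1 {
    my ($text) = @_;
    # Left:  A-uml O-uml U-uml a-uml o-uml u-uml sharp-s (Mac charset)
    # Right: the same letters in ISO-8859-1
    $text =~ tr/\x80\x85\x86\x8A\x9A\x9F\xA7/\xC4\xD6\xDC\xE4\xF6\xFC\xDF/;
    return $text;
}

my $mac_string   = "\x8A";                         # a-umlaut as the Mac stores it
my $qp_from_mac  = encode_qp($mac_string);                 # begins with =8A
my $qp_converted = encode_qp(mac_to_latin1($mac_string));  # begins with =E4

print "without conversion: $qp_from_mac\n";
print "with conversion:    $qp_converted\n";
```

A complete solution would need a full 256-entry table, plus a decision
about Mac characters that have no ISO-8859-1 equivalent.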

>As in HTML it is the receiver's task to decode into his local
>charset.
>
>If the MacPerl MIME::QuotedPrint doesn't transform an "ä" into
>a "=E4" I consider this a major malfunction.
>
>Another test case - I send the =8A to myself, reading with Mac
>Eudora, I see a "..." character, called a dieresis, option-.
>
>MacPerl's ord() function is not portable with 8bit characters.
>I have no idea whether this is known nor do I have a workaround yet.


Nonsense.  MacPerl's ord() function is doing exactly what you've told 
it to do.  It has no idea how a certain byte appears on your screen. 
Nor *can* it know, since on a Mac you can so easily change fonts.  If 
this really troubles you, start using an ISO font on your Mac.  IIRC 
several have been mentioned on this list in the past.
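
A small sketch of what ord() actually sees (the hash and its labels are my
own, for illustration only).  It is handed a byte and returns that byte's
numeric value.  Which glyph a font draws for that value never enters
into it.

```perl
#!perl -w
use strict;

# The same letter, a-umlaut, as stored under two character sets.
my %a_umlaut = (
    'Mac charset' => "\x8A",
    'ISO-8859-1'  => "\xE4",
);

# ord() reports the stored byte value and nothing else.
my %code;
for my $charset (sort keys %a_umlaut) {
    $code{$charset} = ord($a_umlaut{$charset});
    printf "%-12s ord = %d (0x%02X)\n",
        $charset, $code{$charset}, $code{$charset};
}
```

On any platform this reports 228 for the ISO-8859-1 byte and 138 for the
Mac byte, so the difference you saw comes from the bytes your editor
saved, not from ord().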

>
>
>Regards
>
>
>Axel
>
>
>  >#!perl -w
>  >
>  >use MIME::QuotedPrint;
>  >
>  >$string = "ä";
>  >$estring = encode_qp( $string );
>  >print "encode_qp( $string ) = $estring\n";
>  >
>  >__END__
>  >
>  >Result is
>  >encode_qp( ä ) = =8A
>  >
>  >Running under Linux I get the proper result
>  >encode_qp( ä ) = =E4
>  >
>  >
>  >Looking into the source code I find
>  >...
>  >$res =~ s/([^ \t\n!-<>-~])/sprintf("=%02X", ord($1))/eg;  # rule #2,#3
>  >$res =~ s/([ \t]+)$/
>  >	join('', map { sprintf("=%02X", ord($_)) }
>  >		split('', $1)
>  >      )/egm;                        # rule #3 (encode whitespace at eol)
>...
>
>----------------------------------------------------------------------
>Axel Rose, Springer & Jacoby Digital GmbH & Co. KG, mailto:rose@sj.com
>   pub PGP key 1024/A21CB825 E0E4 BC69 E001 96E9  2EFD 86CA 9CA1 AAC5
>   "... denn alles, was entsteht, ist wert, daß es zugrunde geht ..."

--
Paul Schinder
schinder@pobox.com
