
Re: [MacPerl] UTF8 conversion



At 10:13 am +0000 21/5/00, Bart Lateur wrote:

>>On Sun, 14 May 2000 20:57:56 -0300, Arved Sandstrom wrote:
>>
>>At 01:32 PM 5/14/00 -0400, M. Christian Hanson wrote:
>>>I am taking the output of the XML parser in macperl and want to push
>>>it into a text processor that is really itching for Latin-1 not the
>>>UTF8 that the XML parser hands me.  Anybody have any advice?
>
>>You can check the archives for this list, for one. This just came up
>>recently. Bart Lateur, if I recall correctly, has looked at this.

>Says the guy who ported the Unicode::String module to MacPerl. See
><http://pudge.net/cgi-bin/mmp.plx>.
>
>It's time that I put my code where my mouth is. I had only mentioned
>that I got a solution, but I'm still having big problems with finalizing
>it. There are so many ways to do that, and none looks ideal. Worse: the
>code gets 5 times more complicated than the bare essence of it.


I know it's not very perly, but why not take advantage of Apple's
built-in Text Encoding Converter and the TEC PPC osax and just do the
whole thing with Apple Events?  This makes every conversion possible,
including Chinese, Japanese, etc.  The alternative is to use tables,
as I do in Frontier when converting a limited number of special
characters to UTF-8 for rendering sites, but that covers only the
characters in the table and would be impractical as a general solution.
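
To make the table approach concrete, here is a minimal sketch. It is
written in Python purely for illustration (nothing in this thread uses
Python), and the names MAC_TO_UTF8 and mac_to_utf8 are hypothetical. The
table covers only four Mac Roman characters.

```python
# Hypothetical lookup table: a few Mac Roman byte values mapped to the
# UTF-8 byte sequences for the same characters.
MAC_TO_UTF8 = {
    0x8C: b"\xc3\xa5",  # a with ring above, U+00E5
    0xA7: b"\xc3\x9f",  # sharp s, U+00DF
    0x8D: b"\xc3\xa7",  # c with cedilla, U+00E7
    0x90: b"\xc3\xaa",  # e with circumflex, U+00EA
}


def mac_to_utf8(data: bytes) -> bytes:
    """Convert Mac Roman bytes to UTF-8 using only the table above."""
    out = bytearray()
    for b in data:
        if b < 0x80:
            out.append(b)            # ASCII is identical in both encodings
        elif b in MAC_TO_UTF8:
            out += MAC_TO_UTF8[b]    # known special character
        else:
            out += b"?"              # anything else is not covered
    return bytes(out)
```

Anything outside the table falls through to "?", which is why this
does not scale to a general converter.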

The TEC PPC osax
<http://www.bekkoame.or.jp/~iimori/sw/TECOSAX.html>
makes a breeze of the whole thing.

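on reencode(s, x, y)
   -- TECConvertText is the command supplied by the TEC PPC osax
   TECConvertText s fromCode x toCode y
end reencode
-- m: a short Mac Roman sample (the literal did not survive archiving):
-- å ß ç, one character with no Latin-1 equivalent, then ê
set m to ""
set l1 to reencode(m, "macintosh", "iso-8859-1")
-- "åßç?ê"
set utf7 to reencode(l1, "iso-8859-1", "UNICODE-2-0-UTF-7")
-- +AOUA3wDn?+AOo-
set utf8 to reencode(m, "macintosh", "UTF-8")
-- ߒLj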
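
For anyone without the osax, the UTF-7 result above can be cross-checked
with any Unicode-aware codec library. A minimal sketch in Python, which is
an assumption of this note and not something used in the thread; the
function name utf8_to_latin1 is hypothetical. It decodes the UTF-7 string
shown above and also goes in the direction the original question asked
about, UTF-8 to Latin-1, replacing anything Latin-1 cannot represent
with "?".

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Decode UTF-8 bytes and re-encode them as ISO-8859-1.

    Characters with no Latin-1 equivalent are replaced by b"?".
    """
    return data.decode("utf-8").encode("iso-8859-1", errors="replace")


# Decode the UTF-7 string quoted in the message.
text = b"+AOUA3wDn?+AOo-".decode("utf-7")

# Re-encode the same text as UTF-8, then convert that to Latin-1.
as_utf8 = text.encode("utf-8")
as_latin1 = utf8_to_latin1(as_utf8)

print(repr(text))
print(as_utf8)
print(as_latin1)
```

The errors="replace" handler is a design choice for this sketch: it
keeps the conversion from failing on characters outside Latin-1, at the
cost of silently losing them.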

JD

