
[MacPerl-Modules] XML::Parser: support for native Mac (Roman) character set



I have experimented with the XML::Parser module today. Much to my
surprise, it looks like it does NOT support the Mac's native character
set.

I'm going to try to fix that. Here are some links that I've collected
today; together they look like enough to get it working in minimal time.

A) There's the module XML::Encoding, see
<http://search.cpan.org/search?dist=XML-Encoding>. This distribution
contains the XML files used to create the encoding files that were
included with XML::Parser. The format looks simple enough; see e.g. the
file "windows-1250.xml".

B) There's the directory with conversion tables from the Mac character
sets to Unicode: <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>.
The "standard" Mac character set for the western world is in
"ROMAN.TXT".

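To illustrate how little work the table itself is, here is a minimal
sketch (in Python, standard library only) that reads lines in the layout
the Unicode vendor mapping files appear to use: a byte value, a tab, a
Unicode code point, and a trailing "#" comment. The sample lines, the
function names parse_mapping and decode, and the exact column layout are
assumptions for illustration; they are not copied from the real
"ROMAN.TXT", which should be checked before relying on this.

```python
import re

# Hypothetical excerpt in the assumed layout "0xNN<TAB>0xNNNN<TAB># name".
# These lines are illustrative and were not read from the real ROMAN.TXT.
SAMPLE = """\
# comment lines start with '#'
0x41\t0x0041\t# LATIN CAPITAL LETTER A
0x80\t0x00C4\t# LATIN CAPITAL LETTER A WITH DIAERESIS
0xA5\t0x2022\t# BULLET
"""

# One byte value followed by exactly one code point; anything else is skipped.
LINE = re.compile(r"^0x([0-9A-Fa-f]{2})\s+0x([0-9A-Fa-f]{4,6})(?=\s|$)")


def parse_mapping(text):
    """Return a dict mapping each byte value to its Unicode code point."""
    table = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            table[int(match.group(1), 16)] = int(match.group(2), 16)
    return table


def decode(data, table):
    """Decode bytes through the table; unmapped bytes become U+FFFD."""
    return "".join(chr(table.get(byte, 0xFFFD)) for byte in data)


table = parse_mapping(SAMPLE)
text = decode(b"A\x80\xa5", table)
# Print code points rather than the characters, to stay terminal-safe.
print([hex(ord(ch)) for ch in text])
```

Turning such a table into an XML::Encoding source file would then be a
matter of writing one entry per byte in whatever format
"windows-1250.xml" uses.
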
Now, some questions:

 - Provided I get the encoding table right, would there be any interest
in distributing it? Would there be any reason to include it in the
XML::Parser package, possibly only in the Mac version?

 - What would be the proper name for the table? I'm thinking of
"Mac-Roman" for the Mac.

 - Most importantly: once the table is in place, can I use it to decode
XML files into ordinary Mac text? What syntax would the module use for
that? Can I encode XML files so that they're flagged as using
Macintosh-specific text, not just ASCII?

P.S. Oddly enough, no encoding table is included for the basic Windows
character set either. That set is an extension of ISO-Latin-1, and its
codepage is 1252, not 1250.
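
The difference between the three sets can be checked with a short
sketch. This uses Python's built-in codecs purely as a convenient
reference; the variable names are made up for the example, and none of
this involves XML::Parser itself.

```python
# Bytes 0xA0-0xFF decode identically under ISO-Latin-1 and codepage 1252.
same_upper_half = all(
    bytes([b]).decode("latin-1") == bytes([b]).decode("cp1252")
    for b in range(0xA0, 0x100)
)

# In the 0x80-0x9F range, 1252 adds printable characters where
# Latin-1 only has control codes.
quote_1252 = bytes([0x93]).decode("cp1252")     # U+201C, left double quote
quote_latin1 = bytes([0x93]).decode("latin-1")  # U+0093, a control code

# Codepage 1250 (Central European) disagrees with 1252 even on letters.
e8_1250 = bytes([0xE8]).decode("cp1250")  # U+010D, c with caron
e8_1252 = bytes([0xE8]).decode("cp1252")  # U+00E8, e with grave

print(same_upper_half)
print(hex(ord(quote_1252)), hex(ord(quote_latin1)))
print(hex(ord(e8_1250)), hex(ord(e8_1252)))
```

So a "windows-1250" table is no substitute for a 1252 one.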

-- 
	Bart.
