
Re: [MacPerl] HTML Entities decoding



At 16:52 -0500 1999.09.14, Matthew Langford wrote:
>I have an html file which I am parsing.  It has the HTML entity —
>which Netscape shows as "--", an em dash.  When I use
>HTML::Entities::decode, I get an o with a forward-slash top (love those
>technical descriptions, don't ya?).  I think it's the same letter as typing
>option-e, o on the Mac.

That is an o with an accent aigu on it (as opposed to the other direction,
which is an accent grave ... I forget the English words for them, I just
remember the French :).

>Can someone explain what is wrong?  It seems if Netscape knows how to
>translate properly, HTML::Entities should be able to, as well.  Is there
>another step that I'm missing?
>
>I would be grateful for the instruction.

It has to do with character encodings.  I'd bet that HTML::Entities is
using ISO Latin 1.  Most Mac fonts use MacRoman.  You'd need a
MacRoman<->Latin1 translation table.
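
As a rough sketch of that translation step (not something from the
original exchange): the example below assumes a Perl with the Encode
module available, which is newer than the MacPerl of this thread. The
helper names latin1_to_macroman and macroman_to_latin1 are made up for
illustration. It also shows why a stray byte 0x97 would display as an
o-acute when a Mac font interprets it as MacRoman.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the thread): re-encode Latin-1 bytes as MacRoman.
sub latin1_to_macroman {
    my ($bytes) = @_;
    my $text = decode('iso-8859-1', $bytes);    # bytes -> characters
    return encode('MacRoman', $text);           # characters -> MacRoman bytes
}

# Hypothetical helper: the reverse direction.
sub macroman_to_latin1 {
    my ($bytes) = @_;
    my $text = decode('MacRoman', $bytes);
    return encode('iso-8859-1', $text);
}

# Latin-1 stores e-acute as byte 0xE9; MacRoman stores it as 0x8E.
printf "0x%02X\n", ord(latin1_to_macroman("\xE9"));    # prints 0x8E

# Byte 0x97 read as MacRoman is o-acute (U+00F3).
my $byte = "\x97";
printf "U+%04X\n", ord(decode('MacRoman', $byte));     # prints U+00F3
```

Characters that exist in only one of the two encodings cannot round-trip,
so a real table would need a policy for those (Encode substitutes a
placeholder character by default).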

-- 
Chris Nandor          mailto:pudge@pobox.com         http://pudge.net/
%PGPKey = ('B76E72AD', [1024, '0824090B CE73CA10  1FF77F13 8180B6B6'])
