Re: [MacPerl] HTML Character Conversion



At 16.52 -0500 1998.12.20, Matt Henderson wrote:
>Hmmm..., well, I gave it a shot:
>
>     use HTML::Entities;
>     $output_line = encode_entities($input_line);
>
>This code translated the character ù to the code &#157;, while the BBEdit
>Format tool translates the same character to the code &#249;.
>
>Only the latter code, &#249;, renders the proper character in Netscape.
>The former code, &#157;, the one returned by HTML::Entities, renders as
>no specific character in Netscape.
>
>Interestingly, 157 is the ASCII decimal code for ù, while 249 is the HTML
>decimal code for the same character. I'm just learning Perl, so I
>couldn't fully follow the Entities.pm code that well, but it seems that
>the encoding uses the chr() function, which I suppose returns the decimal
>ASCII code for the character. How could it be modified to return the HTML
>code? (And why are they different?)

For the reasons explained previously in this thread.  There are three
different character sets we are dealing with.  One is HTML, one is
MacRoman, one is ISO (Latin-1).  HTML uses the same numbers as ISO for
these characters: ù in ISO is decimal 249, and in HTML is &#249;.  In
MacRoman, it is 157.  To see the difference, if you have an ISO font, type
option-backtick-u (157) and option-shift-. (249) and switch between the
ISO and MacRoman fonts.
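
To make the 157-versus-249 difference concrete, here is a small sketch that
reads the same byte under both character sets.  It assumes the Encode
module, which ships with Perl 5.8 and later; the variable names are only
for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $byte = "\x9D";    # the single byte with decimal value 157

# Interpret that byte under each character set and report the
# resulting character's code number.
my $as_mac = ord( decode('MacRoman',   $byte) );   # u-grave
my $as_iso = ord( decode('iso-8859-1', $byte) );   # a control code

print "read as MacRoman: $as_mac\n";   # 249, the number HTML wants
print "read as ISO:      $as_iso\n";   # 157, what encode_entities saw
```

The byte never changes; only the table used to interpret it does.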

What's needed is a Mac<->ISO translation or Mac<->HTML translation table.
It is not difficult to build, but it has to be written out by hand once;
after that it can be reused over and over.
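
A minimal sketch of such a table, going in the Mac->HTML direction only.
The names %mac_to_iso and mac_to_html are made up for this example, and the
table is deliberately partial: it lists a handful of accented letters,
where a real one would need an entry for every byte from 128 to 255.

```perl
use strict;
use warnings;

# Partial MacRoman -> ISO code map (hypothetical; extend by hand).
my %mac_to_iso = (
    0x8A => 228,   # a-umlaut
    0x8E => 233,   # e-acute
    0x8F => 232,   # e-grave
    0x9D => 249,   # u-grave (MacRoman 157)
    0x9F => 252,   # u-umlaut
);

# Replace each high-bit byte with a numeric HTML entity.
# Bytes missing from the table are left untouched.
sub mac_to_html {
    my ($text) = @_;
    $text =~ s{([\x80-\xFF])}{
        my $iso = $mac_to_iso{ ord $1 };
        defined $iso ? "&#$iso;" : $1;
    }ge;
    return $text;
}

print mac_to_html("o\x9D"), "\n";   # prints "o&#249;"
```

Leaving unmapped bytes alone makes gaps in the table easy to spot in the
output; one could just as well have the sub warn about them.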

--
Chris Nandor          mailto:pudge@pobox.com         http://pudge.net/
%PGPKey = ('B76E72AD', [1024, '0824090B CE73CA10  1FF77F13 8180B6B6'])
