At 16:52 -0500 9/14/99, Matthew Langford wrote:
>I have an html file which I am parsing. It has the HTML entity &#151;
>which Netscape shows as "--", an em dash. When I use
>HTML::Entities::decode, I get an o with a forward-slash top (love those
>technical descriptions, don't ya?). I think it's the same letter as typing
>option-e, o on the Mac.
>
>Can someone explain what is wrong? It seems if Netscape knows how to
>translate properly, HTML::Entities should be able to, as well. Is there
>another step that I'm missing?

Reference: HTML: The Definitive Guide, July 1996 edition. Used because it was handy.

&#151; is "Nonstandard" (it is meant as an em dash, with &#150; being an en dash).

In HTML 4, en-dash is &ndash; (or, better, &#8211;) and em-dash is &mdash; (or, better, &#8212;). I don't know how far back those go. See <http://w3c.org/TR/REC-html40/sgml/entities.html> and search for "dash".

The numeric entities from 128 through 159 remain nonstandard (the easy thing to do is just convert them to the byte of that value, and since they are nonstandard, that can't be "wrong," just not very useful). That is what happened here: byte 151 is an em dash in the Windows character set, but it is o-acute (option-e, o) in the Mac character set.

--John

--
John Baxter   jwblist@olympus.net   Port Ludlow, WA, USA
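
A rough sketch of one way around this, not anything HTML::Entities does for you: before decoding, rewrite the numeric references 128 through 159 into the standard ones. It assumes the page author meant Windows-1252 bytes (a guess, since these references are nonstandard), and it needs a Perl with the Encode module, so not the Perl of 1999. The name fix_c1_entities is made up for this example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: rewrite &#128; .. &#159; as standard numeric
# references, ASSUMING the number was meant as a Windows-1252 byte.
sub fix_c1_entities {
    my ($html) = @_;
    $html =~ s{&#(\d+);}{
        my $n = $1;
        ($n >= 128 && $n <= 159)
            # Read the byte as Windows-1252, then emit its Unicode code point.
            # Bytes that cp1252 leaves undefined (such as 129) come out as
            # U+FFFD, the replacement character.
            ? sprintf('&#%d;', ord(decode('cp1252', chr($n))))
            # Leave every other numeric reference exactly as it was.
            : "&#$n;"
    }ge;
    return $html;
}

print fix_c1_entities('a&#151;b&#150;c'), "\n";   # prints a&#8212;b&#8211;c
```

The result can then be handed to HTML::Entities::decode_entities as usual. Rewriting to standard references, rather than straight to bytes, keeps the outcome independent of the character set of the machine doing the parsing.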