At 14:26 11/04/96 -0400, "Stephane Jose" <jose.stephane@uqam.ca> wrote:
>I am setting up a cgi with MacPerl that allows me to return a web page of
>info on a particular city by browsing a text based database (tab separated
>fields, return separated records). I request the name of that city from a
>form. Nothing fancy. My script works fine when I request unaccented data
>(ie. 'Verdun' or 'Yamaska'). But when I submit a city name with accented
>chars the mess begins.
>
>Is there a way to deal consistently with accented data submitted from a
>form, independently from the platform from which it was sent?

I've not seen any replies to this post, so I thought I'd jump in.

Perl does *not* have any trouble parsing accented characters. The problem
here is that only the standard ASCII set is platform independent. That is:
from space, chr(32), to chr(126). Below that are the control characters.
These are mostly portable (including tab, chr(9)), with one important
difference: line terminations, "\n". This means different things on Unix:
chr(10), Mac: chr(13), PC: chr(13)+chr(10). But you can easily work around
that.

A bigger difference is those accented characters you're talking about.
These are *not* part of the standard ASCII set, and have codes between 128
and 255. I know of 4 platforms: Mac, PC DOS (OEM), PC Windows (ANSI), Unix
(probably ANSI as well). Each has its own "standard". In fact, in Perl you
can easily convert from one platform to another using a single command like

    tr/\200-\377/ .... /;

where the ....'s are replaced by a list of 128 characters, the translation
table. If anyone's interested, I can post my tables for DOS>MAC and
ANSI>MAC.

But this isn't too relevant here. If I understand correctly, you want to
return data in HTML form to the user? HTML has its own standard way of
dealing with this. You need to use special HTML code strings instead of
accented characters.
As an example, an "é" (that's an "e" with an "accent aigu" on it) must be
included as "&eacute;". So you need a translation table, and a lot of
lines like this:

    s/é/&eacute;/g;

You can probably get a table on the net, in documentation about HTML.
That's better than in a book, because you can simply incorporate it into
your script (maybe write a Perl script?). I would think it's best to use a
table like:

    %htmlised = ('é', '&eacute;', ...);

and

    foreach $key (keys(%htmlised)) {
        s/$key/$htmlised{$key}/g;
    }

The table %htmlised could well be built from a text table, with one line
for every translation: key and value on one line, with a tab between them.

Now that you know *what* to do, the question is: when? You could store the
table of cities as it is now, and convert every line on the fly, *every
time* an HTML document is generated. Or, my suggestion: create an HTML-ised
version of the table once, and simply incorporate the results into your
HTML document as you create it, without any further conversion. The
disadvantage is that you can only see the full table of cities as it
should look by using a web browser. So keep an "original" that you can
edit, and an HTML-ised version to be used by your client program.

Bart
---
Embracing the KIS principle: Keep It Simple
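
A minimal sketch of the table-driven conversion described above, not the poster's actual script. It assumes the data is in ISO-8859-1 (Latin-1), so the byte values on the left of each pair would differ for Mac or DOS data. The names $table_text and htmlise are hypothetical, and the table holds only a few illustrative entries. Instead of looping over the keys, it replaces each high-bit character in a single pass:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical translation table: one "character<TAB>entity" pair per line.
# Assumption: Latin-1 byte values; a real table would have many more rows
# and could be read from a file instead of being built in the script.
my $table_text = join '',
    "\xE0\t&agrave;\n",
    "\xE2\t&acirc;\n",
    "\xE7\t&ccedil;\n",
    "\xE8\t&egrave;\n",
    "\xE9\t&eacute;\n",
    "\xEA\t&ecirc;\n",
    "\xF4\t&ocirc;\n";

# Build the lookup hash from the text table.
my %htmlised;
for my $line (split /\n/, $table_text) {
    my ($char, $entity) = split /\t/, $line;
    $htmlised{$char} = $entity;
}

# Replace every character in the range 128-255 that has a table entry;
# characters without an entry are left untouched.
sub htmlise {
    my ($text) = @_;
    $text =~ s/([\x80-\xFF])/exists($htmlised{$1}) ? $htmlised{$1} : $1/ge;
    return $text;
}

print htmlise("Montr\xE9al"), "\n";
print htmlise("Trois-Rivi\xE8res"), "\n";
```

Running the same routine once over every record of the cities file would produce the HTML-ised copy suggested above, while the original stays editable.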