
Re: [MacPerl] Character translation problem (resend)



[I should learn to preafrood my postings before I send them; sigh!]

> That one's over my head.

OK, a bit of background.  The original ASCII character set had only
128 characters, corresponding to the values 0 - 127 (00 - 7F in hex;
0000 0000 - 0111 1111 in binary).  The 8th (high-order) bit of each
byte, valued 128 (80 in hex; 1000 0000 in binary), was typically used for
"parity checking", a way to detect errors in transmission.
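
To make the bit arithmetic concrete, here is a minimal sketch in Python
(not Perl; it is only an illustration).  It assumes "even" parity, which
is just one of the conventions that were in use, and the function names
are made up for this example:

```python
def high_bit_set(byte):
    """Return True if the 8th (high-order) bit, valued 128 (0x80), is on."""
    return (byte & 0x80) != 0


def add_even_parity(code):
    """Return a byte whose low 7 bits are the ASCII code and whose 8th bit
    is set only when needed to make the total number of 1 bits even.
    (Even parity is an assumption here; odd parity was also used.)"""
    if not 0 <= code <= 0x7F:
        raise ValueError("not a 7-bit ASCII code")
    ones = bin(code).count("1")
    return (code | 0x80) if ones % 2 else code


def strip_parity(byte):
    """Drop the 8th bit, leaving the original 7-bit ASCII code."""
    return byte & 0x7F


for ch in "AC":
    code = ord(ch)
    sent = add_even_parity(code)
    print(f"{ch}: {code:02x} {code:08b} -> {sent:02x} {sent:08b}")
```

"A" (41 hex) already has an even number of 1 bits, so it goes out
unchanged; "C" (43 hex) has three, so the 8th bit is turned on and the
byte becomes c3.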

Here is a listing of this ASCII character set, generated by typing
"man ascii" on my FreeBSD machine:

...
    The hexadecimal set:

    00 nul   01 soh   02 stx   03 etx   04 eot   05 enq   06 ack   07 bel
    08 bs    09 ht    0a nl    0b vt    0c np    0d cr    0e so    0f si
    10 dle   11 dc1   12 dc2   13 dc3   14 dc4   15 nak   16 syn   17 etb
    18 can   19 em    1a sub   1b esc   1c fs    1d gs    1e rs    1f us
    20 sp    21  !    22  "    23  #    24  $    25  %    26  &    27  '
    28  (    29  )    2a  *    2b  +    2c  ,    2d  -    2e  .    2f  /
    30  0    31  1    32  2    33  3    34  4    35  5    36  6    37  7
    38  8    39  9    3a  :    3b  ;    3c  <    3d  =    3e  >    3f  ?
    40  @    41  A    42  B    43  C    44  D    45  E    46  F    47  G
    48  H    49  I    4a  J    4b  K    4c  L    4d  M    4e  N    4f  O
    50  P    51  Q    52  R    53  S    54  T    55  U    56  V    57  W
    58  X    59  Y    5a  Z    5b  [    5c  \    5d  ]    5e  ^    5f  _
    60  `    61  a    62  b    63  c    64  d    65  e    66  f    67  g
    68  h    69  i    6a  j    6b  k    6c  l    6d  m    6e  n    6f  o
    70  p    71  q    72  r    73  s    74  t    75  u    76  v    77  w
    78  x    79  y    7a  z    7b  {    7c  |    7d  }    7e  ~    7f del
...

Somewhat later, computer vendors decided to use full 8-bit bytes, using
a 9th bit (if need be) for parity.  This freed up the 8th bit for use
as data, so some vendors started encoding oddball characters in the
range 128 - 255 (80 - FF in hex; 1000 0000 - 1111 1111 in binary).

Sadly, these vendors didn't _agree_ with each other about the character
assignments, so the assorted (Apple, Microsoft, etc.) encodings are NOT
cross-compatible, by and large.
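
As a rough illustration of the disagreement, the Python sketch below
decodes the same byte values under three encodings.  I am assuming Mac
Roman and Windows-1252 as stand-ins for "Apple" and "Microsoft", with
Latin-1 thrown in for comparison; the codec names are Python's, and
decode_each is a name made up for this example:

```python
# Codec names as Python spells them; the choice of these three is an assumption.
CODECS = ["mac_roman", "cp1252", "latin_1"]


def decode_each(byte_value):
    """Decode one byte under each codec; return {codec name: character}."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in CODECS}


# 27 and 62 are plain ASCII; d5 and 92 fall in the 128 - 255 range.
for value in (0x27, 0x62, 0xD5, 0x92):
    decoded = decode_each(value)
    # Print Unicode code points rather than glyphs, so the output does not
    # depend on the terminal's own encoding.
    cells = "  ".join(f"{name}=U+{ord(ch):04X}" for name, ch in decoded.items())
    print(f"{value:02x}: {cells}")
```

The values below 80 hex come out the same everywhere.  The curly
apostrophe (U+2019) is d5 in Mac Roman but 92 in Windows-1252, and each
of those bytes means something else entirely in the other encoding.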

So much for background.  You say you got the same results I got, as:

> e2: ’
> 27: '
> 62: b

IF you are getting exactly this, you should be able to use the code from
the $x2 case and clean up your data.  How exactly does this fail?

-r
--
Rich Morin:          rdm@cfcl.com, +1 650-873-7841, http://www.ptf.com/~rdm
Prime Time Freeware: info@ptf.com, +1 408-433-9662, http://www.ptf.com
MacPerl: http://www.macperl.com,       http://www.ptf.com/ptf/products/MPPE
MkLinux: http://www.mklinux.apple.com, http://www.ptf.com/ptf/products/MKLP
