
Re: [MacPerl] hi-bit characters in regex's



On Sat, 11 Mar 2000 19:40 - Kevin van Haaren <kevinv@hockey.net> wrote:

Snip........
>Anyway, to make a long story short, I discovered that both Mac and
>Windows allow high-order characters in filenames (discovered this
>on the Husker Du album, the u's have those 2 little dots over them).
>Does anyone know how I can test for these characters in a regex?  Is
>there a standard octal code for these characters (I always thought
>they were font specific)?
>
>The only characters I know for sure that work on both sides are the
>u's with 2 dots and the o with the ' over it.  My apologies for not
>knowing the proper names for these characters, I'm betraying my poor
>education (hey, I took Latin in high school, sorry).

For characters specified in octal (Mac OS character set values):

ó = octal code 227
ò = octal code 230
ü = octal code 237

Use the octal escape \num:
a backslash followed by a two- or three-digit octal number matches the
character with the specified value.
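
A minimal sketch of the octal escape in use. The file name here is made up for illustration and is assumed to hold raw Mac OS bytes, so treat it as a starting point rather than a tested recipe:

```perl
use strict;

# Hypothetical file name built from raw bytes: "Husker D" plus the
# u-umlaut byte (octal 237 in the Mac OS character set).
my $name = "Husker D" . chr(0237);

# \237 inside the pattern matches the single byte with octal value 237.
my $has_u_umlaut = ($name =~ /\237/) ? 1 : 0;

# A character class covers all three characters listed above.
my $has_any = ($name =~ /[\227\230\237]/) ? 1 : 0;

print "u-umlaut: $has_u_umlaut, any listed: $has_any\n";
```

Keep all three digits in the escape, so the digits cannot be mistaken for a backreference or run into a following digit.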

For characters specified in hexadecimal (Mac OS character set values):

ó = hex code 97
ò = hex code 98
ü = hex code 9f

Use the hexadecimal escape \xnum:
a backslashed x followed by one or two hexadecimal digits matches the
character with that hexadecimal value.
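
If the goal is to test for any high-bit character rather than one in particular, a range of hex escapes should do it. This is only a sketch: has_high_bit is a made-up helper name and the file names are hypothetical, assumed to be raw 8-bit strings:

```perl
use strict;

# Returns 1 if the string contains any byte in the range 0x80-0xFF, else 0.
# has_high_bit is a hypothetical helper, not something MacPerl provides.
sub has_high_bit {
    my ($name) = @_;
    return ($name =~ /[\x80-\xff]/) ? 1 : 0;
}

# Hypothetical file names; \x9f is the hex code given above for u-umlaut.
my @names = ("Husker D\x9f", "Plain Name");

foreach my $name (@names) {
    if ($name =~ /([\x80-\xff])/) {
        # Report the first high-bit byte found, as two hex digits.
        printf "%s: found hex code %02x\n", $name, ord($1);
    } else {
        print "$name: plain ASCII\n";
    }
}
```

The same range works whichever 8-bit character set the names are in, since it only looks at the byte value and not at what the byte means.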

For your Husker Dü album you might also consider the Latin-1 (ISO-8859-1)
encoding popular on the web, where ü is \xfc rather than \x9f.
To match any of the lowercase u's: [u\xf9-\xfc]
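
Something along these lines would apply that character class to a whole name. It is a sketch only: is_husker_du is a hypothetical helper, and the names are assumed to arrive as Latin-1 bytes:

```perl
use strict;

# is_husker_du is a hypothetical helper; names are assumed to be Latin-1 bytes.
sub is_husker_du {
    my ($name) = @_;
    # Each u may be plain, or carry a grave (\xf9), acute (\xfa),
    # circumflex (\xfb), or two dots (\xfc).
    return ($name =~ /^H[u\xf9-\xfc]sker D[u\xf9-\xfc]$/) ? 1 : 0;
}

foreach my $name ("Husker Du", "H\xfcsker D\xfc", "Husker Da") {
    my $result = is_husker_du($name) ? "match" : "no match";
    print "$name: $result\n";
}
```

The class only covers lowercase letters, so an all-caps name would need its own range.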

HTH

David in Maine



--
David                   Northwestern Wilderness Of Maine Personal Essays
<eldorado@ime.net>      <http://w3.ime.net/~eldorado>
--


