In article <Pine.SOL.3.96.1001129151442.8368A-100000@simpukka>,
Ilmari Karonen <iltzu@sci.fi> wrote:
>
> On Tue, 28 Nov 2000, Yitzchak Scott-Thoennes wrote:
>
> Knuth's _Art of Computer Programming,_ volume 3 details a hideously
> complicated set of rules used by libraries to produce an intuitive (for
> some values of the word, anyway) ordering of book titles.  Implementing
> that in Perl could be interesting, if I can only find a copy..

Yuck.

> > $srt =~ tr/0-9a-z\xe9/a-jA-ZE/; # uc & sort nums after letters
> > # MORE -- xlat more latin1 chars
>
> s/[\xc6\xe6]/AE/g; s/\xdf/SS/g;
> tr/\xc0-\xdf/AAAAAA CEEEEIIIIDNOOOOO*OUUUUYT /;
> tr/\xe0-\xff/AAAAAA CEEEEIIIIDNOOOOO:OUUUUYTY/;

Thanks.  I was hoping someone would take up the gauntlet.

> But note that Finns (and presumably Swedes) would consider the order
> produced by my code broken, since it treats A, Ä and Å as equivalent.
> Around here, the alphabet ends "VWXYZÅÄÖ", and anyone wanting to look up,
> say, "Åland" in a multipart encyclopedia would reach for the last volume.

I'm happy with this as is.

> > $srt =~ s/\b'(?=[ST]\b)//g; # remove apostrophes
>
> I think that'll miss some apostrophes.  Perhaps you've already thought
> about this and decided they're better treated as spaces, but personally
> I'd broaden the regex a bit..

I think there was a FOO'N'BAR or something similar that I wanted as
three separate words.
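
For anyone following along, here is a minimal sketch of how the quoted pieces might fit together into one sort-key routine. The name sort_key, the order of the steps, and the final step that turns any leftover apostrophe into a space are my assumptions, not something spelled out in the thread. Input is assumed to be a Latin-1 byte string.

```perl
use strict;
use warnings;

# Hypothetical helper: combines the transliterations quoted in the thread.
# The step order and the last step are assumptions made for illustration.
sub sort_key {
    my ($srt) = @_;

    # Uppercase letters, map digits to a-j so numbers sort after letters.
    $srt =~ tr/0-9a-z\xe9/a-jA-ZE/;

    # Expand ligatures first, then fold the remaining Latin-1 letters.
    $srt =~ s/[\xc6\xe6]/AE/g;
    $srt =~ s/\xdf/SS/g;
    $srt =~ tr/\xc0-\xdf/AAAAAA CEEEEIIIIDNOOOOO*OUUUUYT /;
    $srt =~ tr/\xe0-\xff/AAAAAA CEEEEIIIIDNOOOOO:OUUUUYTY/;

    # Drop the apostrophe only before a word-final S or T (DOG'S, DON'T).
    $srt =~ s/\b'(?=[ST]\b)//g;

    # Assumption: any apostrophe still left acts as a word separator.
    $srt =~ tr/'/ /;

    return $srt;
}

print sort_key("The Dog's Dinner"), "\n";   # THE DOGS DINNER
print sort_key("FOO'N'BAR"), "\n";          # FOO N BAR
print sort_key("\xc5land"), "\n";           # ALAND
print sort_key("2001"), "\n";               # caab
```

Note that "\xc5land" comes out as ALAND and so sorts among the A's, which is exactly the behaviour a Finnish reader would call broken. FOO'N'BAR survives as three words because neither apostrophe is followed by a lone S or T.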