On Tue, 28 Nov 2000, Yitzchak Scott-Thoennes wrote:

> I just wrote a script to help my wife keep her bookmarks sorted. In
> the process, I found that how she was sorting them by hand wasn't
> anything like a simple C<cmp>. By trial and error, I came up with the
> following sort key calculation to mimic her idea of a "natural" sort
> order.

Knuth's _Art of Computer Programming_, volume 3, details a hideously
complicated set of rules used by libraries to produce an intuitive (for
some values of the word, anyway) ordering of book titles. Implementing
that in Perl could be interesting, if I can only find a copy.

> $srt =~ tr/0-9a-z\xe9/a-jA-ZE/; # uc & sort nums after letters
> # MORE -- xlat more latin1 chars

    s/[\xc6\xe6]/AE/g;    # AE ligature, both cases
    s/\xdf/SS/g;          # sharp s
    tr/\xc0-\xdf/AAAAAA CEEEEIIIIDNOOOOO*OUUUUYT /;   # upper-case accented
    tr/\xe0-\xff/AAAAAA CEEEEIIIIDNOOOOO:OUUUUYTY/;   # lower-case accented

But note that Finns (and presumably Swedes) would consider the order
produced by my code broken, since it treats A, Ä and Å as equivalent.
Around here, the alphabet ends "VWXYZÅÄÖ", and anyone wanting to look
up, say, "Åland" in a multivolume encyclopedia would reach for the last
volume.

> $srt =~ s/\b'(?=[ST]\b)//g; # remove apostrophes

I think that'll miss some apostrophes, such as the ones in "we'll" or
"they're". Perhaps you've already thought about this and decided
they're better treated as spaces, but personally I'd broaden the regex
a bit.

--
Ilmari Karonen - http://www.sci.fi/~iltzu/
"Giant fruitflies, on the other hand, are weird by anybody's standards."
    -- Holly E. Ordway in rec.arts.sf.composition

==== Want to unsubscribe from Fun With Perl?  Well, if you insist...
==== Send email to <fwp-request@technofile.org> with message _body_
====   unsubscribe
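
To make the two suggestions in the message concrete, here is a rough
sketch. It is not code from either poster. The names finnish_key and
strip_apostrophes are made up for illustration, the titles are assumed
to be Latin-1 byte strings, and the key trick assumes that '[', '\' and
']' never appear in a title. The broadened apostrophe pattern is only
one guess at what "a bit" broader might mean.

```perl
use strict;
use warnings;

# Hypothetical helper: build a sort key in which A-ring, A-umlaut and
# O-umlaut sort after Z, in that order, instead of being folded to A/O.
sub finnish_key {
    my ($srt) = @_;
    $srt =~ tr/a-z/A-Z/;    # fold ASCII case
    # Map each letter (both cases) to one of the three ASCII characters
    # that follow 'Z': '[' (0x5b), '\' (0x5c) and ']' (0x5d).
    $srt =~ tr/\xc5\xe5\xc4\xe4\xd6\xf6/\x5b\x5b\x5c\x5c\x5d\x5d/;
    return $srt;
}

# Hypothetical helper: remove an apostrophe that is followed by a one-
# or two-letter word ending ('S, 'T, 'D, 'LL, 'RE, 'VE, ...).
# Expects text that has already been upper-cased.
sub strip_apostrophes {
    my ($srt) = @_;
    $srt =~ s/\b'(?=[A-Z]{1,2}\b)//g;
    return $srt;
}

my @titles = ("Zimbabwe", "\xc5land", "Aland", "\xd6sterbotten",
              "\xc4ht\xe4ri");
my @sorted = sort { finnish_key($a) cmp finnish_key($b) } @titles;
# @sorted is now: Aland, Zimbabwe, \xc5land, \xc4ht\xe4ri, \xd6sterbotten
```

Because the keys are plain byte strings, an ordinary C<cmp> is enough
and no locale support is needed. An apostrophe inside a longer name,
as in "O'REILLY", is left alone by this pattern.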