Ilmari Karonen <iltzu@sci.fi> writes:

> On Tue, 28 Nov 2000, Yitzchak Scott-Thoennes wrote:
> > I just wrote a script to help my wife keep her bookmarks sorted. In
> > the process, I found that how she was sorting them by hand wasn't
> > anything like a simple C<cmp>. By trial and error, I came up with the
> > following sort key calculation to mimic her idea of a "natural" sort
> > order.
>
> Knuth's _Art of Computer Programming,_ volume 3 details a hideously
> complicated set of rules used by libraries to produce an intuitive (for
> some values of the word, anyway) ordering of book titles. Implementing
> that in Perl could be interesting, if I can only find a copy..

I think that'd require strong AI, wouldn't it? Some of the rules are
splendid: '1066 et tout la' would get sorted as 'mille et soixante six
ans et tout la', whereas '1066 and all that' would be sorted as 'ten
sixty six and all that'. And, just to make things even more confusing,
something like '1066 years of solitude' would get sorted as 'one
thousand and sixty six years of solitude'.

And that's just the numbers. Then you've got the rules for weird stuff
like: 'Tom Jones' gets sorted with the Ts, whereas the hypothetical
'Tom Jones, the non fictional character, a biography' would get sorted
under 'Jones, Tom'... It all gets very scary.

Of course, if you're actually sorting bibliographic data then you will
hopefully have more to go on than just the title and author, and you
could (say) sort biographies with different rules from novels, but
even so, it gets painful.

Sadly I don't actually have Knuth to hand (it's at home), but the full
list is quite scary. ISTR that his response to the problem is to not
even try to solve it.

Solving it would be *good* though.

-- 
Piers