
Re: [FWP] sorting text in human-order



In article <Pine.SOL.3.96.1001129151442.8368A-100000@simpukka>,
Ilmari Karonen <iltzu@sci.fi> wrote:
> 
> On Tue, 28 Nov 2000, Yitzchak Scott-Thoennes wrote:
> 
> Knuth's _Art of Computer Programming,_ volume 3 details a hideously
> complicated set of rules used by libraries to produce an intuitive (for
> some values of the word, anyway) ordering of book titles.  Implementing
> that in Perl could be interesting, if I can only find a copy..

Yuck.

> >       $srt =~ tr/0-9a-z\xe9/a-jA-ZE/;  # uc & sort nums after letters
> >                                        # MORE -- xlat more latin1 chars
> 
>   s/[\xc6\xe6]/AE/g; s/\xdf/SS/g;
>   tr/\xc0-\xdf/AAAAAA CEEEEIIIIDNOOOOO*OUUUUYT /;
>   tr/\xe0-\xff/AAAAAA CEEEEIIIIDNOOOOO:OUUUUYTY/;

Thanks.  I was hoping someone would take up the gauntlet.
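
Put together, the two pieces might look something like the following.
This is only a sketch, assuming the titles are Latin-1 byte strings; the
name sort_key and the sample titles are made up for illustration and are
not from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: build an uppercase ASCII sort key from a Latin-1 string.
sub sort_key {
    my ($srt) = @_;
    $srt =~ s/[\xc6\xe6]/AE/g;   # AE ligature, both cases
    $srt =~ s/\xdf/SS/g;         # sharp s
    # Fold the remaining accented Latin-1 letters to plain uppercase ASCII.
    $srt =~ tr/\xc0-\xdf/AAAAAA CEEEEIIIIDNOOOOO*OUUUUYT /;
    $srt =~ tr/\xe0-\xff/AAAAAA CEEEEIIIIDNOOOOO:OUUUUYTY/;
    # Uppercase a-z; digits become a-j, which compare after A-Z in ASCII.
    $srt =~ tr/0-9a-z/a-jA-Z/;
    return $srt;
}

my @titles = ("\xe9clair", "zebra", "Eagle", "42nd Street");

# Schwartzian transform: compute each key once, sort on it, keep the title.
my @sorted = map  { $_->[1] }
             sort { $a->[0] cmp $b->[0] }
             map  { [ sort_key($_), $_ ] } @titles;

print "$_\n" for @sorted;
```

The accent folding runs before the case/digit tr so that everything it
produces is already uppercase and is left alone by the last step.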

> But note that Finns (and presumably Swedes) would consider the order
> produced by my code broken, since it treats A, Ä and Å as equivalent.
> Around here, the alphabet ends "VWXYZÅÄÖ", and anyone wanting to look up,
> say, "Åland" in a multipart encyclopedia would reach for the last volume.

I'm happy with this as is.
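
For anyone who does need the Finnish order, where Å, Ä and Ö follow Z,
one possible approach is to map those three letters to the ASCII
characters that sit right after "Z" before uppercasing. A rough sketch,
assuming Latin-1 input; finnish_key and the %finnish table are
hypothetical names, and nothing here handles the other accented letters.

```perl
use strict;
use warnings;

# '[', '\' and ']' are ASCII 0x5B..0x5D, immediately after 'Z' (0x5A).
my %finnish = (
    "\xc5" => '[',  "\xe5" => '[',    # A-ring comes first after Z
    "\xc4" => '\\', "\xe4" => '\\',   # then A-umlaut
    "\xd6" => ']',  "\xf6" => ']',    # then O-umlaut
);

# Hypothetical helper: sort key that keeps the three letters distinct.
sub finnish_key {
    my ($srt) = @_;
    $srt =~ s/([\xc4\xc5\xd6\xe4\xe5\xf6])/$finnish{$1}/g;
    $srt =~ tr/a-z/A-Z/;
    return $srt;
}

my @names  = ("\xc5land", "Zimbabwe", "\xd6sterbotten", "Albania", "\xc4lvdalen");
my @sorted = sort { finnish_key($a) cmp finnish_key($b) } @names;
# Order: Albania, Zimbabwe, then the A-ring, A-umlaut and O-umlaut entries.
```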

> >       $srt =~ s/\b'(?=[ST]\b)//g; # remove apostrophes
> 
> I think that'll miss some apostrophes.  Perhaps you've already thought
> about this and decided they're better treated as spaces, but personally
> I'd broaden the regex a bit..

I think there was a FOO'N'BAR or something similar that I wanted as
three separate words.
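
To make the difference concrete, here is a small sketch of what that
substitution does and does not touch. The wrapper name
strip_apostrophes and the sample strings are mine, purely for
illustration; input is assumed to be uppercased already.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution quoted above.
sub strip_apostrophes {
    my ($srt) = @_;
    # Drop an apostrophe only when it follows a word character and is
    # followed by a lone S or T, as in possessives and "N'T" contractions.
    $srt =~ s/\b'(?=[ST]\b)//g;
    return $srt;
}

for my $title ("DON'T PANIC", "BOB'S BOOK", "WE'LL MEET", "FOO'N'BAR") {
    print "$title => ", strip_apostrophes($title), "\n";
}
# DON'T PANIC => DONT PANIC
# BOB'S BOOK => BOBS BOOK
# WE'LL MEET => WE'LL MEET
# FOO'N'BAR => FOO'N'BAR
```

The last two cases keep their apostrophes, so a later step that treats
punctuation as a word break would split them.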
