Re: [FWP] See the the message body



On Fri, Mar 17, 2000 at 05:09:08PM -0800, Yitzchak Scott-Thoennes wrote:
> Let's fix that.  Tom C, not that long ago, posted a script to find all
> the cases of repeated words in the core pods (including, if I remember
> right, "the L<link>" or "L<link> manpage"--or was that a separate
> script?).
> 
> Can anyone improve (in any sense of the word) on it?
> (The basic algorithm, not necessarily the same functionality).

I dunno, what more do you need than:

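# Match in scalar context so //g steps through one word per iteration;
# in list context it returns every match at once and the loop never ends.
for( $last_word = '';  $text =~ /(\w+)/g;  $last_word = $word ) {
        $word = $1;
        print "Repeated word $word\n" if $last_word eq $word;
}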

You could improve the basic test to something like "check whether the
word roots match" using Lingua::EN::Stem, and throw in a special case
for "L<...> manpage" and "the L<...>".
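
For what it's worth, here is a rough sketch of how the two special cases
could sit next to the repeated-word test. The sub name find_pod_nits and
the message strings are made up for illustration, the comparison is
case-insensitive (my assumption, so that "The the" counts), and the
stemming idea is left out entirely. The special cases presumably matter
because formatters expand L<name> into something like "the name manpage",
so the extra "the" or "manpage" ends up doubled in the output.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of findings for one chunk of POD text.
sub find_pod_nits {
    my ($text) = @_;
    my @found;

    # Repeated words: scalar-context //g walks the text one word at a time.
    my $last_word = '';
    while ($text =~ /(\w+)/g) {
        my $word = $1;
        push @found, "repeated word: $word" if lc $word eq lc $last_word;
        $last_word = $word;
    }

    # Special case: an article directly in front of a link, "the L<...>".
    push @found, "article before link: $1"
        while $text =~ /\b(the\s+L<[^>]*>)/gi;

    # Special case: "manpage" directly after a link, "L<...> manpage".
    push @found, "manpage after link: $1"
        while $text =~ /(L<[^>]*>\s+manpage)\b/gi;

    return @found;
}

print "$_\n" for find_pod_nits("See the the L<perlfunc> manpage.");
```

On that sample line it reports the doubled "the", then "the L<perlfunc>",
then "L<perlfunc> manpage". It only looks at whatever string it is handed,
so a repeat that straddles two separately read lines would be missed unless
the file is slurped first.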

-- 

Michael G Schwern      http://www.pobox.com/~schwern/	   schwern@pobox.com
BOFH excuse #19:
 
 floating point processor overflow
