
URI::Find (was Re: [FWP] R/E Question)



Hey, kids.  I cleaned up that URI-finding mess I posted before, turned
it into a module, and just sent it off to CPAN.  I don't know how many
times I've rewritten that thing, so I figured it was time to put it
where it could do some real damage.

NAME
         URI::Find - Find URIs in arbitrary text

SYNOPSIS
         use URI::Find;

         $how_many_found = find_uris($text, \&callback);

DESCRIPTION
       This module does one thing: it finds URIs and URLs in plain
       text.  It finds them quickly and it finds them all (or at
       least everything URI::URL considers a URI).
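
To make the callback idea in the synopsis concrete, here is a rough,
self-contained sketch of the pattern.  It is NOT URI::Find's code:
find_uris_sketch() is a hypothetical stand-in with a deliberately naive
regex, ftp.example.org is a made-up host, and the behavior shown (the
callback receives the matched text, its return value replaces the match
in place) is an assumption for illustration, not something the synopsis
above spells out.

```perl
use strict;
use warnings;

# Hypothetical stand-in for find_uris(); NOT URI::Find's implementation.
# Scans $_[0] in place, calls the callback with each match, substitutes
# the callback's return value, and returns the number of matches.
sub find_uris_sketch {
    my $callback = $_[1];
    my $count    = 0;

    # Naive pattern (an assumption): a few schemes, then non-space
    # characters, ending on something other than trailing punctuation.
    $_[0] =~ s{
        \b((?:https?|ftp)://[^\s<>"']*[^\s<>"'.,;:!?)])
    }{
        $count++;
        $callback->($1);
    }gex;

    return $count;
}

my $text = 'See http://www.pobox.com/~schwern/src/, or ftp://ftp.example.org/pub.';
my @seen;

# The callback records each URI and wraps it in angle brackets.
my $how_many_found = find_uris_sketch($text, sub {
    my ($uri) = @_;
    push @seen, $uri;
    return "<$uri>";
});
```

Keeping the matching in one routine and the "what to do with it" in a
callback means the same finder can count, collect, or rewrite URIs
without touching the regex.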


I'm rather happy with its ability to find URLs.  I scraped some nasty
ones off Dejanews and News.com.  Yes, it can actually pick up 'http://www.deja.com/%5BST_rn=ps%5D/qs.xp?ST=PS&svcclass=dnyr&QRY=lwall&defaultOp=AND&DBS=1&OP=dnquery.xp&LNG=ALL&subjects=&groups=&authors=&fromdate=&todate=&showsort=score&maxhits=25' as a URL.

However, I'm sure you guys can find plenty of things on which my
module will trip up.  I'll be throwing the Perl documentation at it
soon, always a good test.  If you find anything which URI::Find
doesn't properly find, please let me know!

PS  If your CPAN mirror hasn't gotten URI::Find yet, it can be picked up
from http://www.pobox.com/~schwern/src/



-- 

Michael G Schwern                                           schwern@pobox.com
                    http://www.pobox.com/~schwern
     /(?:(?:(1)[.-]?)?\(?(\d{3})\)?[.-]?)?(\d{3})[.-]?(\d{4})(x\d+)?/i
