URI::Find (was Re: [FWP] R/E Question)

To: fwp@technofile.org
Subject: URI::Find (was Re: [FWP] R/E Question)
From: Michael G Schwern <schwern@pobox.com>
Date: Mon, 31 Jan 2000 17:11:07 -0500
Cc: bill@fccj.org, Uri Guttman <uri@sysarch.com>
In-Reply-To: <200001300407.XAA00528@home.sysarch.com.>; from uri@sysarch.com on Sat, Jan 29, 2000 at 11:07:40PM -0500
References: <B4B3486F.A88A%bill@fccj.org> <B4B8C836.AD2D%bill@fccj.org> <20000129222303.A27607@athens.aocn.com> <200001300407.XAA00528@home.sysarch.com.>

Hey, kids.  I cleaned up that URI finding mess I posted before into a
module and just sent it off to CPAN.  I don't know how many times I've
rewritten that thing, so I figured it was time to put it where it could
do some real damage.

NAME
         URI::Find - Find URIs in arbitrary text

SYNOPSIS
         use URI::Find;

         $how_many_found = find_uris($text, \&callback);

DESCRIPTION
       This module does one thing: it finds URIs and URLs in plain
       text.  It finds them quickly and it finds them all (or at
       least everything URI::URL considers a URI to be).


I'm rather happy with its ability to find URLs.  I scraped some nasty
ones off Dejanews and News.com.  Yes, it can actually pick up the
following as a single URL:

  http://www.deja.com/%5BST_rn=ps%5D/qs.xp?ST=PS&svcclass=dnyr&QRY=lwall&defaultOp=AND&DBS=1&OP=dnquery.xp&LNG=ALL&subjects=&groups=&authors=&fromdate=&todate=&showsort=score&maxhits=25

However, I'm sure you guys can find plenty of things my module will
trip over.  I'll be throwing the Perl documentation at it soon, which
is always a good test.  If you come across a URI that URI::Find misses
or mangles, please let me know!

PS  If your CPAN mirror hasn't picked up URI::Find yet, you can get it
from http://www.pobox.com/~schwern/src/



-- 

Michael G Schwern                                           schwern@pobox.com
                    http://www.pobox.com/~schwern
     /(?:(?:(1)[.-]?)?\(?(\d{3})\)?[.-]?)?(\d{3})[.-]?(\d{4})(x\d+)?/i
