
Re: [MacPerl] Clarification of detecting "broken links"/invalid URLs



Charles Cave <charles@jolt.mpx.com.au> writes:
[snip]
}
}I received some useful code from several list members and it seems
}that the "Connect" call is needed to validate a domain name. What my
}program needs to do is to pretend to be a web browser and fetch
}a specified web document.

Then you want what Chris Nandor posted.  Use libwww-perl-5 to issue a HEAD
request for each link.  You can use HTML::LinkExtor, which is also part of
libwww-perl-5, to extract the links from any HTML page.  As you pointed out,
the HEAD request will not fail on pages that are simple redirection pages,
but some things require human intervention.
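
For illustration, here is a minimal sketch of that approach.  It is not
Chris Nandor's script, only an outline under a few assumptions: the helper
names extract_links and check_link are made up for this example, only
<a href> links are checked, and the page to check is given as a URL on the
command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTML::LinkExtor;

# Hypothetical helper: return the absolute URLs of all <a href> links
# found in an HTML string.  $base is used to resolve relative links.
sub extract_links {
    my ($html, $base) = @_;
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($html);
    $parser->eof;
    my @urls;
    for my $link ($parser->links) {
        # Each entry looks like [ tag, attr => url, ... ]
        my ($tag, %attr) = @$link;
        push @urls, "$attr{href}" if $tag eq 'a' && defined $attr{href};
    }
    return @urls;
}

# Hypothetical helper: send a HEAD request for one URL and return
# (success flag, status line).
sub check_link {
    my ($ua, $url) = @_;
    my $res = $ua->request(HTTP::Request->new(HEAD => $url));
    return ($res->is_success, $res->status_line);
}

# Only touch the network when a page URL is given on the command line.
if (@ARGV) {
    my $page = shift @ARGV;
    my $ua   = LWP::UserAgent->new;
    $ua->timeout(15);

    my $res = $ua->request(HTTP::Request->new(GET => $page));
    die "Cannot fetch $page: ", $res->status_line, "\n"
        unless $res->is_success;

    for my $url (extract_links($res->content, $page)) {
        next unless $url =~ /^https?:/i;    # skip mailto:, ftp:, etc.
        my ($ok, $status) = check_link($ua, $url);
        print(($ok ? "ok     " : "BROKEN "), "$url ($status)\n");
    }
}
```

Some servers answer HEAD differently from GET, so a link reported as
BROKEN may still deserve a look by hand before you remove it.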

}
}I hope this makes my "problem" clearer!   Coincidentally, I found
}a program called Big Brother which does one task: check external
}links in an HTML document!  I would prefer to use MacPerl and then
}I would learn something.

That would be foolish.  Big Brother does this job superbly well, and I've
abandoned using my MacPerl script that does this job in favor of Big
Brother.

}
}Charles
}
}
}Creativity Web: http://www.ozemail.com.au/~caveman/Creative
}
}
}------------------------------------------------------
}Charles Cave                         Sydney, Australia
}Email:                         charles@jolt.mpx.com.au
}------------------------------------------------------


--------
Paul J. Schinder
NASA Goddard Space Flight Center
Code 693
Greenbelt, MD 20770
schinder@pjstoaster.pg.md.us