On Fri, 11 Jun 1999, John Porter wrote:

> > you can use Hrvoje Niksic's utility "wget" and perl?
> > (wget is available as a debian GNU/Linux package)
>
> Debian-specific/only?  Pretty useless, in that case.

It's not Debian-specific, but there's a nice easy-to-install package for
Debian (like with most programs). He's just advocating, 's all.

> I confess I'm not familiar with the workings of wget;
> please enlighten as to how it differs from GET, which comes with LWP.

I don't think GET grabs sites recursively. wget has a lot more features,
in general. Here's the output of wget --help:

GNU Wget 1.5.3, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
  -V,  --version           display the version of Wget and exit.
  -h,  --help              print this help.
  -b,  --background        go to background after startup.
  -e,  --execute=COMMAND   execute a `.wgetrc' command.

Logging and input file:
  -o,  --output-file=FILE     log messages to FILE.
  -a,  --append-output=FILE   append messages to FILE.
  -d,  --debug                print debug output.
  -q,  --quiet                quiet (no output).
  -v,  --verbose              be verbose (this is the default).
  -nv, --non-verbose          turn off verboseness, without being quiet.
  -i,  --input-file=FILE      read URL-s from file.
  -F,  --force-html           treat input file as HTML.

Download:
  -t,  --tries=NUMBER           set number of retries to NUMBER (0 unlimits).
  -O   --output-document=FILE   write documents to FILE.
  -nc, --no-clobber             don't clobber existing files.
  -c,  --continue               restart getting an existing file.
       --dot-style=STYLE        set retrieval display style.
  -N,  --timestamping           don't retrieve files if older than local.
  -S,  --server-response        print server response.
       --spider                 don't download anything.
  -T,  --timeout=SECONDS        set the read timeout to SECONDS.
  -w,  --wait=SECONDS           wait SECONDS between retrievals.
  -Y,  --proxy=on/off           turn proxy on or off.
  -Q,  --quota=NUMBER           set retrieval quota to NUMBER.

Directories:
  -nd  --no-directories            don't create directories.
  -x,  --force-directories         force creation of directories.
  -nH, --no-host-directories       don't create host directories.
  -P,  --directory-prefix=PREFIX   save files to PREFIX/...
       --cut-dirs=NUMBER           ignore NUMBER remote directory components.

HTTP options:
       --http-user=USER      set http user to USER.
       --http-passwd=PASS    set http password to PASS.
  -C,  --cache=on/off        (dis)allow server-cached data (normally allowed).
       --ignore-length       ignore `Content-Length' header field.
       --header=STRING       insert STRING among the headers.
       --proxy-user=USER     set USER as proxy username.
       --proxy-passwd=PASS   set PASS as proxy password.
  -s,  --save-headers        save the HTTP headers to file.
  -U,  --user-agent=AGENT    identify as AGENT instead of Wget/VERSION.

FTP options:
       --retr-symlinks   retrieve FTP symbolic links.
  -g,  --glob=on/off     turn file name globbing on or off.
       --passive-ftp     use the "passive" transfer mode.

Recursive retrieval:
  -r,  --recursive             recursive web-suck -- use with care!.
  -l,  --level=NUMBER          maximum recursion depth (0 to unlimit).
       --delete-after          delete downloaded files.
  -k,  --convert-links         convert non-relative links to relative.
  -m,  --mirror                turn on options suitable for mirroring.
  -nr, --dont-remove-listing   don't remove `.listing' files.

Recursive accept/reject:
  -A,  --accept=LIST                list of accepted extensions.
  -R,  --reject=LIST                list of rejected extensions.
  -D,  --domains=LIST               list of accepted domains.
       --exclude-domains=LIST       comma-separated list of rejected domains.
  -L,  --relative                   follow relative links only.
       --follow-ftp                 follow FTP links from HTML documents.
  -H,  --span-hosts                 go to foreign hosts when recursive.
  -I,  --include-directories=LIST   list of allowed directories.
  -X,  --exclude-directories=LIST   list of excluded directories.
  -nh, --no-host-lookup             don't DNS-lookup hosts.
  -np, --no-parent                  don't ascend to the parent directory.

Mail bug reports and suggestions to <bug-wget@gnu.org>.

==== Want to unsubscribe from this list? (Don't you love us anymore?)
==== Well, if you insist... Send mail with body "unsubscribe" to
==== fwp-request@technofile.org
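
To make the recursive-retrieval point concrete, here is a rough sketch (in Python rather than Perl, purely for illustration) of driving wget from a script. It only uses long options that appear in the help output above. The function names, the default depth and wait values, the "mirror" directory, and the example.com URL are all assumptions of mine, not anything from the thread.

```python
import shutil
import subprocess


def build_wget_command(url, depth=2, wait_seconds=1, prefix="mirror"):
    """Return an argv list for a polite, bounded recursive fetch of `url`.

    Every option used here is listed in the `wget --help` output above;
    the default values are illustrative assumptions.
    """
    return [
        "wget",
        "--recursive",                    # -r: follow links recursively
        f"--level={depth}",               # -l: maximum recursion depth
        "--no-parent",                    # -np: don't ascend to the parent directory
        "--convert-links",                # -k: convert non-relative links to relative
        f"--wait={wait_seconds}",         # -w: wait between retrievals
        f"--directory-prefix={prefix}",   # -P: save files to PREFIX/...
        url,
    ]


def fetch_site(url, **options):
    """Run wget if it is installed.

    Returns wget's exit status, or None when no wget binary is on PATH.
    """
    if shutil.which("wget") is None:
        return None
    return subprocess.run(build_wget_command(url, **options)).returncode


if __name__ == "__main__":
    # Placeholder URL (an assumption); this prints the command without running it.
    print(" ".join(build_wget_command("http://www.example.com/docs/")))
```

Building the argument list separately from running it makes it easy to inspect the command first, which seems wise given the "use with care" warning on --recursive.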