D&C GLug - Home Page


[LUG] Trying to download multiple pdf files using wget and wildcards

 

I have been trying to download a directory of PDF files using wget, so far 
without success.

I have tried the following commands:

wget --spider directory
wget -r directory
wget directory/*
wget -A pdf directory/*
wget --no-glob
wget --retr-symlinks directory/

And, I think, every combination of the above.

In the end I downloaded the files one by one (wget was still quicker than using 
Firefox). They are linked from this page:
    http://www.hutchison-whampoa.com/eng/investor/annual/annual.htm

and the files themselves were in the following directory:
    http://202.66.146.82/listco/hk/hutchison/annual/2007

I think the problem is that I could not get a file listing for the directory, so 
wget had nothing to match the wildcard against (see
http://www.webmasterworld.com/forum40/1694.htm
http://ubuntuforums.org/showthread.php?t=638362).
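
One possible workaround, given as a rough sketch rather than a tested recipe: 
since HTTP has no wildcards, the PDF links could be pulled out of a saved copy of 
the index page and handed to wget as an explicit list. The function name 
extract_pdf_links and the file names annual.htm and pdf-urls.txt are made up for 
illustration, and the sketch assumes the links in the page are absolute URLs in 
double quotes (relative links would need the site prefix added first).

```shell
#!/bin/sh
# Hypothetical helper: print every double-quoted href ending in .pdf
# found in the HTML file given as $1, one URL per line, duplicates removed.
extract_pdf_links() {
    grep -o -i 'href="[^"]*\.pdf"' "$1" \
        | sed -e 's/^[Hh][Rr][Ee][Ff]="//' -e 's/"$//' \
        | sort -u
}

# Intended usage (needs network access, so it is left commented out):
#   wget -O annual.htm http://www.hutchison-whampoa.com/eng/investor/annual/annual.htm
#   extract_pdf_links annual.htm > pdf-urls.txt
#   wget -i pdf-urls.txt
```

The -i option makes wget read its URLs from a file, so no directory listing is 
needed on the server side. If the index page links directly to the PDFs, 
something like "wget -r -l1 -H -nd -A pdf" on the page URL may also work, with 
-H included because the files sit on a different host from the page.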

As I suspect that I will be trying to download multiple PDFs again, can anyone 
point me in the right direction on how to do this?

Many thanks



-- 
Henry
Photocopies or faxes of my signature are not binding. Electronic documents 
(including email) are binding if digitally signed and appropriately verified
Digital Key Signature: GPG RSA 0xFB447AA1 
Mon Oct 13 11:55:55 BST 2008


-- 
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html