Wget
wget to console
Use the -O - option to write the downloaded contents to the console (stdout).
$ wget -O - http://google.ca
Since the contents go to stdout, you can pipe them to a shell to execute a remote script:
$ wget -O - http://scripthost.local/fix_graphics.sh | sh
Note: This is roughly equivalent to running curl with no options, e.g. curl http://google.ca, except that curl does not follow redirects unless given -L.
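As a variant of the pipe above, the script can be saved to a temporary file first so it can be read before it runs. This is only a sketch: fetch_script is a hypothetical helper name, and -q is added to silence the progress output wget normally writes to stderr.

```shell
#!/bin/sh
# Hypothetical helper: download a script to a temp file and print that file's path.
fetch_script() {
    url=$1
    tmp=$(mktemp) || return 1
    # -q suppresses wget's progress output; -O writes the body to the temp file
    if wget -q -O "$tmp" "$url"; then
        printf '%s\n' "$tmp"
    else
        # Download failed: remove the empty temp file and report the error
        rm -f "$tmp"
        return 1
    fi
}
```

A possible use, with the URL from the example above: script=$(fetch_script http://scripthost.local/fix_graphics.sh) && less "$script" && sh "$script"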
Spoofing Googlebot
Some paywalled sites can still be viewed by spoofing Googlebot. To do so, pass a Googlebot user agent and, optionally, an X-Forwarded-For header set to a Googlebot IP address.
$ wget --header="X-Forwarded-For: 66.249.66.1" --user-agent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" http://example.com/
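For repeated use, the same options could be wrapped in a small shell function. This is a minimal sketch: the names fetch_as_googlebot, GOOGLEBOT_UA and GOOGLEBOT_IP are hypothetical, and the header and user agent values are the ones from the command above.

```shell
#!/bin/sh
# Values copied from the command above
GOOGLEBOT_UA='Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
GOOGLEBOT_IP='66.249.66.1'

# Hypothetical helper: fetch a URL while presenting Googlebot's user agent.
# $1 = URL to fetch, $2 = output file (defaults to "-", i.e. stdout)
fetch_as_googlebot() {
    wget -q \
        --header="X-Forwarded-For: $GOOGLEBOT_IP" \
        --user-agent="$GOOGLEBOT_UA" \
        -O "${2:--}" \
        "$1"
}
```

Calling fetch_as_googlebot with a URL alone prints the page to the console; a second argument names a file to save it to instead.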
Save to Directory
Use the -P option to specify the directory prefix, i.e. the directory under which downloaded files are saved.
$ wget -P /tmp http://google.ca/
Saving an Entire Open Directory
To download an entire open directory from a remote server:
$ wget --no-check-certificate -np -nH --cut-dirs=1 -r -c --reject "index.html*" -e robots=off http://server/remote/dir
Where:
Flag | Description |
---|---|
--no-check-certificate | Don't check SSL certificates |
-np | No parent directory (don't crawl up) |
-nH | No host directory (don't create a top-level 'server/' directory) |
--cut-dirs=1 | Remove the 'remote/' directory from the download destination, 1 level |
-r | Recursive |
-c | Continue (resume partially downloaded files) |
--reject "index.html*" | Don't save index.html* files from the directory listing |
-e robots=off | Don't check robots.txt |
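How -nH and --cut-dirs shape the download destination can be sketched without touching the network. The helper below, predict_local_path, is hypothetical and not part of wget; it only mimics the way the two flags trim the local path, assuming all other settings are left at their defaults.

```shell
#!/bin/sh
# Hypothetical helper: predict where a recursive wget download stores a file.
# $1 = host, $2 = remote path without leading slash,
# $3 = "yes" if -nH is given, $4 = value of --cut-dirs (0 if unused)
predict_local_path() {
    host=$1
    path=$2
    no_host=$3
    cut=$4

    # --cut-dirs=N drops the first N directory components of the remote path
    i=0
    while [ "$i" -lt "$cut" ]; do
        case $path in
            */*) path=${path#*/} ;;
            *)   break ;;   # only the file name is left, nothing more to cut
        esac
        i=$((i + 1))
    done

    # Without -nH, the host name becomes the top-level directory
    if [ "$no_host" = "yes" ]; then
        printf '%s\n' "$path"
    else
        printf '%s/%s\n' "$host" "$path"
    fi
}
```

For example, predict_local_path server remote/dir/file.iso yes 1 prints dir/file.iso, while predict_local_path server remote/dir/file.iso no 0 prints server/remote/dir/file.iso.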
wget on FreeBSD
wget is available in ports under ftp/wget. If you don't want to compile it, you can use fetch(1) from the base system to download files instead.
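A script meant to run on both Linux and FreeBSD could pick whichever tool is installed. The function below is a sketch under that assumption; download is a hypothetical name, and only the quiet (-q) and output file (-O for wget, -o for fetch) options are used.

```shell
#!/bin/sh
# Hypothetical helper: download a URL with wget if present, otherwise with fetch(1).
# $1 = URL, $2 = output file
download() {
    if command -v wget >/dev/null 2>&1; then
        wget -q -O "$2" "$1"
    elif command -v fetch >/dev/null 2>&1; then
        fetch -q -o "$2" "$1"
    else
        echo "neither wget nor fetch found" >&2
        return 127
    fi
}
```

For example, download http://google.ca/ /tmp/index.html saves the page to /tmp/index.html with either tool.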