How to Use `wget` Command to Download Files from CLI

The wget command is a non-interactive network downloader, a fundamental utility for any Linux user or administrator. It retrieves files from web and FTP servers directly from the command line, and can keep working in the background even after you log out. Mastering wget is crucial for scripting, automated downloads, and managing remote servers without a graphical interface. This guide walks through its core functionality with precise, actionable steps for common download scenarios.

Prerequisites

To follow this guide, you will need:

  • A Linux-based operating system (e.g., Ubuntu, CentOS, Fedora).
  • Basic familiarity with the command line interface (CLI).
  • An active internet connection to download files.

1. Install wget (if necessary)

Before proceeding, ensure wget is installed on your system. While often pre-installed, some minimal distributions may require manual installation. Verify its presence and install if missing.

Verify Installation

Open your terminal and execute:

wget --version

If wget is installed, this command will display its version information. If not, you will receive an error indicating the command is not found.
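In scripts, it is safer to test for wget before relying on it. A minimal sketch (the install command is Debian/Ubuntu-specific and shown only as a comment):

```shell
# Minimal sketch: detect whether wget is available before a script relies on it.
if command -v wget >/dev/null 2>&1; then
    echo "wget is available"
else
    echo "wget is missing"
    # e.g. on Debian/Ubuntu: sudo apt update && sudo apt install -y wget
fi
```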

Install on Debian/Ubuntu

sudo apt update
sudo apt install wget

Install on CentOS/RHEL/Fedora

sudo dnf install wget

On older releases (CentOS/RHEL 7 and earlier), use yum instead: sudo yum install wget

Pro-tip: Always update your package lists before installing new software to ensure you get the latest stable version and avoid dependency issues.

2. Download a Single File

The most basic function of wget is to download a single file from a specified URL. This operation is straightforward and serves as the foundation for more complex tasks.

Execute a Simple Download

To download a file, simply provide the URL as an argument:

wget http://example.com/path/to/yourfile.zip

The file yourfile.zip will be saved in your current working directory. wget will display progress, including download speed and estimated time remaining.

Warning: Ensure the URL is correct and points directly to the file. Incorrect URLs will result in HTTP errors (e.g., 404 Not Found).

Practical Tip: Silent and Background Downloads

For scripting or long downloads, you might prefer less output or background execution:

  • Silent download: Use the -q (quiet) option to suppress most output.
  • Background download: Use the -b option to send wget to the background immediately after startup. Output will be redirected to wget-log.

wget -q http://example.com/largefile.tar.gz
wget -b http://example.com/another_largefile.iso
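Since -q suppresses wget's own diagnostics, scripts should rely on the exit status instead. A sketch using wget's documented exit codes (0 for success, 8 for a server error response); the function is only defined here, and the URL in the usage note is a placeholder:

```shell
# Sketch: branch on wget's exit status after a quiet download.
fetch() {
    wget -q "$1"
    local status=$?
    case $status in
        0) echo "download OK" ;;
        8) echo "server returned an error response (e.g. 404)" ;;
        *) echo "download failed (wget exit code $status)" ;;
    esac
}

# Usage: fetch http://example.com/largefile.tar.gz
```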

3. Download to a Specific Directory or Rename File

Controlling where a file is saved and what it’s named upon download is often critical for organization and avoiding conflicts.

Save to a Different Directory

Use the -P (--directory-prefix) option to specify an output directory:

wget -P /var/www/html/downloads http://example.com/new_app.zip

This command will save new_app.zip inside /var/www/html/downloads/.

Rename the Downloaded File

Use the -O (--output-document) option to specify a different filename for the downloaded content:

wget -O latest_release.zip http://example.com/downloads/app_v2.3.zip

Here, app_v2.3.zip will be saved as latest_release.zip in the current directory.

Pro-tip: When using -O, wget will overwrite an existing file with the same name without warning. Exercise caution to prevent accidental data loss.
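One way to sidestep the overwrite risk is to embed a timestamp in the output name. A sketch (the URL and naming scheme are illustrative; the wget command is only echoed, not run):

```shell
# Sketch: date-stamped output name so repeated runs of -O never clobber each other.
stamp=$(date +%Y%m%d_%H%M%S)
outfile="release_${stamp}.zip"
echo "would run: wget -O ${outfile} http://example.com/downloads/app_v2.3.zip"
```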

4. Resume Interrupted Downloads

For large files or unreliable network connections, downloads can be interrupted. wget provides a robust mechanism to resume these partial downloads, saving time and bandwidth.

Continue a Partial Download

Use the -c (--continue) option to resume a download:

wget -c http://example.com/very_large_archive.iso

If very_large_archive.iso already exists partially in the current directory, wget will attempt to continue downloading from where it left off. This relies on the server supporting HTTP range requests.

Warning: If the local file is corrupt or a different version, -c might result in a corrupted final file. In such cases, delete the partial file and restart the download.
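For flaky connections, you can layer wget's own retry options on top of -c. A sketch combining -c with --tries and --waitretry (the function is only defined here; the URL in the usage note is a placeholder):

```shell
# Sketch: resumable download with up to 5 retries and up to 10 s between attempts.
download_with_resume() {
    wget -c --tries=5 --waitretry=10 "$1"
}

# Usage: download_with_resume http://example.com/very_large_archive.iso
```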

5. Download Recursively (Website Mirroring)

wget can traverse links and download entire directory structures or even mirror entire websites, making it a powerful tool for offline browsing or backup.

Mirror a Directory Structure

To download a directory and its subdirectories, use the -r (--recursive) option:

wget -r http://example.com/docs/

This will download all files and subdirectories found under http://example.com/docs/.

Refine Recursive Downloads

Combine -r with other options for more control:

  • -np (--no-parent): Do not ascend to the parent directory. Crucial for staying within a specific path.
  • --level=N: Specify the maximum recursion depth.
  • -R (--reject): Exclude specific file types (e.g., -R gif,jpg).

wget -r -np --level=1 -R pdf http://example.com/resources/

This command downloads only the immediate contents of /resources/, excluding PDF files, and (thanks to -np) never ascends above that path.

Critical Warning: Recursive downloads, especially without proper depth limits or exclusion rules, can place significant load on the target server. Always use them responsibly and check the website’s robots.txt file or terms of service.
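For a complete offline copy of a site section, wget's dedicated mirroring options can be combined: --mirror implies -r -N -l inf, --convert-links rewrites links for local browsing, and --page-requisites pulls in CSS and images. Shown here as a dry sketch with a placeholder URL, since actually running it would hit the network:

```shell
# Sketch of a complete offline-mirror invocation (placeholder URL; only echoed).
mirror_cmd="wget --mirror --convert-links --page-requisites --no-parent http://example.com/docs/"
echo "$mirror_cmd"
```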

6. Download Multiple Files from a List

When you have numerous files to download, listing them in a file and feeding that file to wget is far more efficient than individual commands.

Create an Input File

Create a plain text file (e.g., filelist.txt) with one URL per line:

http://example.com/file1.zip
http://example.com/image.jpg
ftp://ftp.example.org/data.tar.gz

Download Using the List

Use the -i (--input-file) option:

wget -i filelist.txt

wget will process each URL in the file, downloading them sequentially.

Pro-tip: Combine with -b for background processing of large lists.
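When the URLs follow a pattern, the list itself can be generated in the shell. A sketch for sequentially numbered files (the URL pattern is hypothetical; the wget call is left commented out):

```shell
# Sketch: generate filelist.txt for numbered parts, then hand it to wget -i.
for i in 1 2 3; do
    echo "http://example.com/part${i}.zip"
done > filelist.txt

# wget -i filelist.txt   # uncomment to start the downloads
```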

7. Authenticate for Downloads

Many resources require authentication. wget supports HTTP and FTP authentication to access protected files.

HTTP Basic Authentication

Use --http-user and --http-password:

wget --http-user=myuser --http-password=mypassword http://example.com/private/document.pdf

FTP Authentication

Use --ftp-user and --ftp-password (or simply --user and --password for both protocols):

wget --user=ftpuser --password=ftppass ftp://ftp.example.com/protected/archive.tgz

Security Warning: Including credentials directly on the command line can expose them in your shell history and in the system's process list. For sensitive operations, use --ask-password to have wget prompt for the password interactively, or store credentials in a restricted file such as ~/.netrc or ~/.wgetrc with permissions set to 600.
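A safer alternative is a ~/.netrc file, which wget consults automatically when no credentials are supplied on the command line. The file must be readable only by you (chmod 600 ~/.netrc); the host and credentials below are placeholders:

```
machine ftp.example.com
login ftpuser
password ftppass
```

With this in place, wget ftp://ftp.example.com/protected/archive.tgz needs no --user or --password flags.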

8. Limit Download Speed

To avoid saturating your network connection or overwhelming a remote server, wget can limit its download speed.

Set a Download Rate Limit

Use the --limit-rate option, specifying the rate in bytes per second; the suffixes k and m denote kilobytes and megabytes per second (e.g., 100k for 100 KB/s, 1m for 1 MB/s):

wget --limit-rate=500k http://example.com/huge_update.bin

This command will cap the download speed for huge_update.bin at 500 kilobytes per second.

Practical Tip: This is invaluable when running downloads on shared networks or servers where you must conserve bandwidth for other critical operations.
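Rate limiting combines naturally with the background and resume options covered earlier. A sketch of a helper for long-running, bandwidth-friendly transfers (the function is only defined here; the URL in the usage note is a placeholder):

```shell
# Sketch: resumable background download capped at 200 KB/s; output goes to wget-log.
download_politely() {
    wget -c -b --limit-rate=200k "$1"
}

# Usage: download_politely http://example.com/huge_update.bin
```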

Next, explore man wget to uncover its myriad other options, including proxy settings, cookie handling, and advanced logging, to further refine your command-line download capabilities.
