This article provides an overview of cURL (commonly written as curl) and guidelines for installing it on your system. The rest of the article focuses on using cURL with proxies, so as a prerequisite, it is essential to know how proxies function before starting this tutorial.
cURL is a convenient command-line tool and library that allows you to send data to and receive data from websites, and it is a vital tool for your web scraping needs.
Without further ado, let’s get started with the fundamentals of cURL.
cURL is an abbreviation for "client URL" and is a command-line tool for sending data to and receiving data from a server. It ships with modern operating systems, including Windows 10 and most Linux distributions. Before looking at a simple example, let's go over what you need to know to install it.
Installation guide - If you're using a Windows version older than Windows 10, please follow the guidelines on cURL's official installation page. If you're using a Linux distribution, for instance Ubuntu, open the terminal and run the command below:
sudo apt install curl
Running a simple request - We hope you have installed cURL on your system and are ready to give it a test. Let's start with a simple example:
On Windows, open your terminal or command prompt and type a curl command with the URL of any page, for example:
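curl https://www.example.com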
cURL will print the HTML of the page to the console.
cURL transfers data to and from web pages with the help of Internet protocols. Although cURL was initially developed to work with HTTP, it now supports many other network protocols, such as FTP, IMAP, IMAPS, SMTP, POP3, and POP3S.
It also supports the POST, GET, PUT, and other HTTP request methods. Let's look at an example of sending data with the POST method.
curl -d "name=yourname&value=somevalue" https://examplewebsite.com/post"
In the command above, the -d (data) flag passes your name and a value to the /post page of examplewebsite.com; when -d is used, cURL sends the request with the POST method by default.
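You can also specify the request method explicitly with the -X flag. For instance, to send the same data as a PUT request instead (the /put path here is just an illustrative placeholder):
curl -X PUT -d "name=yourname&value=somevalue" https://examplewebsite.com/put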
Now that you know what cURL is, let's move on to using it with proxies.
You can use proxies to connect to a website with cURL. Proxies are essential, for instance, when you use cURL to scrape data, since they keep you anonymous to the target website you're scraping.
To connect through a proxy, you need the proxy server's address, port number, and protocol type, plus a username and password if authentication is required. For the examples below, we assume the proxy address is 127.0.0.1 and the port number is 8920. These examples cover the fundamentals of connecting to proxies with cURL and will work with any proxy service.
The syntax for connecting to a proxy is:
curl --proxy proxyaddress:port https://examplewebsite.com
With our example values, this becomes:
curl --proxy 127.0.0.1:8920 https://examplewebsite.com
The above command will route your connection via a proxy to examplewebsite.com.
Now let's look at an example that requires authentication, where the username is username and the password is password:
curl --proxy 127.0.0.1:8920 -U "username:password" https://examplewebsite.com
Now, you can find out which options to use when connecting cURL to a proxy by checking cURL's built-in help:
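curl --help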
Undoubtedly, it returns a huge list, so we will focus on the most fundamental option, shown below:
-x, --proxy [protocol://]host[:port]
In this option, -x and --proxy both denote the proxy details; you can use either, as they are equivalent. However, be mindful that -x is case-sensitive: the uppercase -X sets the request method instead.
Also, to be sure that you're actually using the proxy, you can request your IP address from a service such as httpbin:
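curl "http://httpbin.org/ip"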
This command returns the IP address of the request's origin, so if you're connecting through a proxy server, it returns the proxy server's IP address instead of yours.
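For example, a request sent through our example proxy would produce a response of this form (with the proxy's public IP in place of 127.0.0.1):
{
  "origin": "127.0.0.1"
}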
So now, putting it all together, you could send the request as follows:
curl --proxy "http://username:email@example.com:8920" "http://httpbin.org/ip"
Also, the command below is equivalent to the one above:
curl --x "http://username:firstname.lastname@example.org:8920" "http://httpbin.org/ip"
An important point to keep in mind: as a best practice, quote both the proxy URL and the target URL, since either may contain special characters.
Also, if you get any SSL certificate errors, add the lowercase -k flag to the end of the command, as shown below:
curl --proxy "http://username:email@example.com:8920" "http://httpbin.org/ip" -k.
This flag (short for --insecure) allows the connection to proceed even when the SSL certificate cannot be verified.
When connecting to proxies, the default protocol is HTTP unless another protocol is explicitly specified. Therefore, both of the commands below are equivalent:
curl --proxy "http://username:firstname.lastname@example.org:8920" "http://httpbin.org/ip". curl --proxy "username:email@example.com:8920" "http://httpbin.org/ip".
If you wish to set a default proxy for cURL, you can create a configuration file (.curlrc) in the following manner.
If you're on macOS or Linux, first open the terminal and go to your home directory. If a .curlrc file already exists there, open it; otherwise, create a new empty file. You can use the commands below to navigate to and edit the file:
cd ~
nano .curlrc
Then you need to add a proxy line to the file; using our example proxy address and credentials, it would look like this:
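proxy = "http://username:password@127.0.0.1:8920"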
Save the file, and you can now use cURL with the proxy. Simply run cURL as normal, and it will read the proxy settings from the file, for example:
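curl "http://httpbin.org/ip"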
On Windows, this file is named _curlrc, and it is placed in the %APPDATA% directory. To find the exact path of %APPDATA%, type the following command in the command prompt:
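echo %APPDATA%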
The command above returns the path; navigate to it, create the _curlrc file, and set the proxy in the same way as on macOS or Linux.
Now you have a comprehensive understanding of using cURL with proxies. cURL is a powerful tool, and for a project such as web scraping, it is ideal to use it with proxies. We hope you will put what you have learned in this article into practice. Furthermore, Proxyscrape would be delighted to help you with your proxy requirements.
You can read more good articles on our blog.