LinkedIn provides the perfect social networking platform for professionals, with 660 million users, 303 million of whom are active every month. So if you haven’t already created a LinkedIn account, do it today. You can rub elbows with industry giants.
In this article, we’ll talk about how to scrape emails from LinkedIn accounts. You may need email addresses for your company’s recruitment processes or for non-intrusive ad campaigns.
However, most users hide their email addresses because of privacy concerns. LinkedIn also provides mechanisms to mask such email addresses from public view. In such circumstances, the only way to access email addresses is through scraping.
So without further ado, let’s find out how you could carry out email scraping with LinkedIn. But first, let’s look at why you would need to scrape from LinkedIn.
As mentioned in the introductory paragraph, you may need it for recruitment or marketing purposes. To elaborate a bit further, a user profile on LinkedIn has names, email addresses, skills, professional experience, qualifications, etc. Company profiles, on the other hand, have job postings, the number of employees, current employees, and various other vital data.
So LinkedIn has a wealth of information that will be immensely beneficial to people.
Some users may use bots and crawlers to scrape emails and build an email list. Then they will sell these email lists to marketers and other groups who are keenly interested in these data.
Having said all that, you have to consider the ethical aspects of email scraping as well. It is usually regarded as unethical even if the intentions are non-malicious. However, the effectiveness of LinkedIn email scraping for building professional relationships cannot be overlooked.
So the next section will focus on the legality of email scraping from LinkedIn’s point of view with an example.
Does LinkedIn allow scraping? To get the message straight, the answer is a big no. LinkedIn’s documentation on “Prohibited Software and Extensions” strictly prohibits using crawlers, bots, robots, scripts, and any other add-ons or plugins to scrape the LinkedIn website. You can read further about scraper usage with LinkedIn from the above link to get a glimpse of it.
LinkedIn enforces most of these rules to protect its members’ privacy. However, there are grey areas in some of these anti-scraping rules. I say so because, some time back, LinkedIn sued 100 anonymous scrapers for scraping data from LinkedIn, but no verdict has been given in the case yet. This is partly because LinkedIn has failed to distinguish good scraping from destructive scraping.
The above case has brought up critical issues in scraping. However, they’re beyond the scope of this article. What I am trying to say here is that if you intend to scrape data from LinkedIn, you need to be aware that LinkedIn dislikes it. Therefore you have to do it right, which you will discover in the forthcoming sections.
In order to scrape emails the right way, you need to consider several factors. Some of these critical factors are:
So in the next section, we’ll look into the safest and most legitimate method of scraping email addresses from LinkedIn.
Manual export is the safest and most legitimate way to scrape emails from LinkedIn.
Before walking through the steps for the manual export, a word of caution on this method: there is a new privacy setting in LinkedIn that allows only privileged access to a user’s email IDs. By default, LinkedIn has set this to “strong privacy.” However, you could change this setting to the “weaker” option. By doing so, you are at the mercy of hackers who may use your email address for malicious acts.
So with this method, you would be able to download the email addresses of your direct contacts only. Even then, you’re limited to the emails of contacts who changed their default privacy setting to “weaker”.
Anyhow, the following are the steps that you need to follow to download the emails manually:
Then you will receive an email with a link from which you can download the data you requested.
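Once you have the exported file, pulling the addresses out of it takes only a few lines. The following is a minimal sketch that assumes the export is a CSV file with a column named `Email Address`; that column name and the sample rows are assumptions for illustration, so check the header of your own file before relying on it.

```python
import csv
import io


def extract_emails(csv_text, email_column="Email Address"):
    """Return the non-empty email addresses found in an exported contacts CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    emails = []
    for row in reader:
        # Contacts who keep the default privacy setting have an empty email cell.
        value = (row.get(email_column) or "").strip()
        if value:
            emails.append(value)
    return emails


# Hypothetical sample data standing in for a real export file.
sample = (
    "First Name,Last Name,Email Address\n"
    "Ada,Lovelace,ada@example.com\n"
    "Alan,Turing,\n"
)
print(extract_emails(sample))
```

With a real file, you would read its contents with `open(path, newline="", encoding="utf-8")` and pass the text to `extract_emails`.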
As you just saw above, the manual method gives you only a handful of results. Therefore you need an alternative in the form of automation tools. These automation tools are crawling applications that work with LinkedIn. Let’s look at a few of them.
PhantomBuster - Also called the LinkedIn profile scraper, this is a cloud-based application. HR managers and recruiters often use it to extract valuable data such as email addresses from prospective candidates or employees. If the target profiles are “direct connections,” you can extract the details easily. However, for “indirect connections,” extracting emails becomes tricky. You can find further information about this tool here.
Octoparse - This is a very clever web scraping tool, as it collects data in just three steps: it finds the data, selects the data, and then exports it. It provides multiple options for saving data, either as CSV or XLSX, or to different platforms using an API key. With it, you can either use proxy servers to mask your IP address or use automated IP rotation to avoid a LinkedIn ban. Read more to get to know Octoparse better.
SalesQL - This tool is free and is an extension for the Google Chrome browser, so you can easily add it to Chrome. Sales reps and recruiters often use this tool to scrape emails from LinkedIn. You can extract emails regardless of whether the contact is a first-degree connection or not. You can export the contacts to CSV/Excel files, an ATS (HR software), or a CRM. Please feel free to get more information on SalesQL here.
In order to scrape data from some of the above scraper tools, it would help if you knew how to set these parameters:
Threads are the number of currently open connections you’re using to scrape data from LinkedIn or any other website. The more threads you run, the faster the scraping process. However, LinkedIn will also flag you more quickly.
So although many scrapers run 10 threads per proxy, the best option is to use one thread per proxy. With anything more, LinkedIn would become suspicious, and you may ultimately end up in court. One thread per proxy may slow the scraping process, but it is the safer choice.
Timeouts are the amount of time a proxy waits for the server to respond before it starts a new request. Many scrapers set the timeouts to 1 or 2 seconds, which overwhelms the server with requests. We do not recommend that. Instead, consistently set your timeouts to a higher level, say 20-30 seconds. This gives the server a solid pause before it accepts a new request.
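The two settings can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: `split_work`, `worker`, `run`, and `fake_fetch` are hypothetical names, the proxy labels and URLs are placeholders, and the fetch function is a stand-in that makes no network request. The tools listed earlier expose these settings through their own interfaces.

```python
from concurrent.futures import ThreadPoolExecutor


def split_work(urls, proxies):
    """Give each proxy its own share of URLs so a single thread serves a single proxy."""
    shares = {proxy: [] for proxy in proxies}
    for index, url in enumerate(urls):
        shares[proxies[index % len(proxies)]].append(url)
    return shares


def worker(proxy, urls, fetch, timeout_seconds):
    """Process one proxy's share sequentially, so the proxy never has two open connections."""
    return [fetch(url, proxy, timeout_seconds) for url in urls]


def run(urls, proxies, fetch, timeout_seconds=25.0):
    """Start exactly one thread per proxy and pass the same timeout to every request."""
    shares = split_work(urls, proxies)
    with ThreadPoolExecutor(max_workers=len(proxies)) as pool:
        futures = [
            pool.submit(worker, proxy, share, fetch, timeout_seconds)
            for proxy, share in shares.items()
        ]
        results = []
        for future in futures:
            results.extend(future.result())
    return results


def fake_fetch(url, proxy, timeout_seconds):
    """Stand-in for a real HTTP call; it only records the settings it was given."""
    return (url, proxy, timeout_seconds)


urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
proxies = ["proxy-1", "proxy-2"]
results = run(urls, proxies, fake_fetch)
```

Splitting the URLs per proxy up front, rather than handing tasks to whichever thread is free, is what keeps the ratio at one thread per proxy. The default of 25 seconds is simply a value inside the 20-30 second range suggested above.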
In this last section, you will discover how to scrape a private profile. However, scraping private profiles for emails and other details will raise legal issues with LinkedIn. This is because when you create a LinkedIn account, LinkedIn promises to protect your data and not disclose it to third parties.
You may scrape someone’s data, including email addresses, for non-destructive purposes. For example, you may be recruiting and looking for freelance technical writers in a particular city, or you might be scraping for research. With that in mind, let’s find out how to scrape private profiles.
The ideal way to scrape from private profiles is by creating a user account. Then you would be able to connect with as many contacts as possible. However, keep in mind that this account is not for connecting with people. Instead, you use it as an entry point to LinkedIn for scraping purposes.
I recommend using Octoparse for this purpose, because it allows you to log in to LinkedIn with your account and apply appropriate searches. Then you can scrape with the drag-and-drop feature while viewing the LinkedIn page you’re on.
After creating the account, and when you’re ready to search with Octoparse, you need to figure out what to search for. Octoparse will then look for precisely what you require. However, you can harvest only the information that is available to a non-connection, including the contacts’ email addresses.
LinkedIn will most likely block you with the above method if you do not follow the rules on timeouts and threads.
Also, when you create the account, make sure you use a proxy server, and use the same IP address when scraping through Octoparse. This makes you appear human to LinkedIn, since most humans do not access LinkedIn from different IP addresses within split seconds. So if you use a proxy to create an account, use the same proxy when scraping LinkedIn.
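The "same account, same proxy" rule amounts to a fixed mapping that is set once and never changes. Below is a minimal sketch of that bookkeeping; `AccountProxyRegistry` and the account and proxy labels are hypothetical names used only for illustration, and the class is not part of Octoparse or any other tool.

```python
class AccountProxyRegistry:
    """Remember which proxy each account was created with and always return that one."""

    def __init__(self, proxies):
        self._free = list(proxies)  # proxies not yet tied to an account
        self._assigned = {}         # account name -> proxy

    def proxy_for(self, account):
        # An account seen for the first time gets the next free proxy;
        # afterwards it always receives the same proxy.
        if account not in self._assigned:
            if not self._free:
                raise RuntimeError("no free proxy left for a new account")
            self._assigned[account] = self._free.pop(0)
        return self._assigned[account]


registry = AccountProxyRegistry(["proxy-1", "proxy-2"])
first = registry.proxy_for("account-a")
again = registry.proxy_for("account-a")
other = registry.proxy_for("account-b")
```

Here `first` and `again` are both `proxy-1`, while `account-b` receives `proxy-2`, so no two accounts ever share an IP address.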
Now it’s a matter of which proxies to use and how many proxies to use.
What proxies to use for LinkedIn scraping?
The ideal proxies for scraping LinkedIn are elite proxies. This is because elite proxies provide higher anonymity and more secure header settings than the other proxy types.
Before scraping, you need to be well aware that LinkedIn doesn’t like being scraped and deals with the culprits seriously. This means you have to use dedicated, private elite proxies. Shared or free proxies are simply out of the equation for this purpose.
Number of Proxies
The number of proxies largely depends on how much data you’ll be scraping. If you use a single proxy per account, as suggested above, it is recommended to use 50 accounts and 50 proxies.
If you want more proxies per account, which we don’t recommend, use somewhere in the range of 100-150. Then rotate them often so that LinkedIn doesn’t catch and ban them.
On the other hand, if you have too few proxies, LinkedIn will likely ban them. So to determine the best figure, you must experiment with LinkedIn as much as possible. This will also help you identify whether any of your proxies are blacklisted by LinkedIn.
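Rotation and blacklist tracking can be sketched with a simple round-robin pool. This is an illustrative sketch only: `ProxyPool`, `next_proxy`, and `mark_banned` are hypothetical names, and how you detect that a proxy has been blacklisted is left to your own experiments.

```python
from collections import deque


class ProxyPool:
    """Hand out proxies in round-robin order and drop the ones found to be banned."""

    def __init__(self, proxies):
        self._queue = deque(proxies)

    def next_proxy(self):
        if not self._queue:
            raise RuntimeError("no usable proxies left")
        proxy = self._queue[0]
        self._queue.rotate(-1)  # move the proxy just used to the back of the line
        return proxy

    def mark_banned(self, proxy):
        # Remove a blacklisted proxy so it is never handed out again.
        if proxy in self._queue:
            self._queue.remove(proxy)

    def __len__(self):
        return len(self._queue)


pool = ProxyPool(["proxy-1", "proxy-2", "proxy-3"])
used = [pool.next_proxy(), pool.next_proxy()]
pool.mark_banned("proxy-3")
after_ban = pool.next_proxy()
```

Watching how quickly `len(pool)` shrinks during a trial run gives a rough signal of whether you started with too few proxies.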
After reading this article, we hope you have a comprehensive idea of how you can scrape emails through LinkedIn. The easiest and most obvious way is to use the manual method. However, that will return only a handful of results. So the better choice would be to use an automated tool together with elite proxies.
We wish you the best of luck in extracting emails from LinkedIn using the methods mentioned here.