LinkedIn’s 1.2B Data-Scrape Victims Already Being Targeted by Attackers

A refined database of 88K U.S. business owners on LinkedIn has been posted in a hacker forum.

Just days after yet another data-scraping operation aimed at LinkedIn was discovered, evidence has popped up in a popular hacker forum that the vast amount of lifted data is being collated and refined to identify specific targets.

This might signal the start of a series of LinkedIn-fueled attacks.

The latest data scrape was discovered this week when threat actors posted the personal data contained in 700 million LinkedIn user profiles on the RaidForums underground market. Later, the operators boosted the listing to a purported 1 billion records, according to researchers at Privacy Sharks who discovered it. And this latest data scrape follows an April operation that exposed 500 million LinkedIn users.

That’s a total of at least 1.2 billion records and maybe more — personal and professional — out there just waiting to be turned against users in future phishing, ransomware, display-name spoofing or other attacks (of course, some of the records are likely duplicates). But in any event, it’s already happening.

Yesterday, a database filled with the personal information of 88,000 U.S. business owners gleaned from the latest LinkedIn data scrape was shared on RaidForums. The poster said the database specifically isolates U.S. business owners who have changed jobs over the past 90 days, CyberNews reported. The notably targeted database includes full names, email addresses, work details and any other information publicly listed on LinkedIn.

It’s not hard to see how this particular group of people, fresh on a new job, flooded with onboarding paperwork and dealing with new co-workers, might be easily tricked into clicking on a malicious link.

LinkedIn: ‘It’s Not a Breach if It’s Public Info’

LinkedIn’s response acknowledges the abuse of LinkedIn data, but points out that it’s not technically a breach since the information was public.

“Our teams have investigated a set of alleged LinkedIn data that has been posted for sale,” the company’s statement to Threatpost said. “We want to be clear that this is not a data breach and no private LinkedIn member data was exposed.”

A LinkedIn spokesperson didn’t comment directly when Threatpost asked about data-scraping protections the company has or is planning to put in place, but did stress that the company is actively working to protect its members’ data.

“I want to be clear that scraping data from LinkedIn is a violation of our Terms of Service and we are constantly working to ensure our members’ privacy is protected,” she said. “When anyone tries to take member data and use it for purposes LinkedIn and our members haven’t agreed to, we work to stop them and hold them accountable.”

Accountability is important for future prevention, but once the data is scraped, organized and released out into the wild, there’s little that can be done to rein it back in.

Victoria Kivilevich, a threat intelligence analyst at KELA, offered Threatpost some insights on the origin of the scraping.

“We have obtained the sample records shared by the actor,” she said. “According to our review of the headers of the fields available in the data set, we assess with high confidence that it was scraped using this API of a company named Growth Genius, a Canadian sales automation platform.”

Below are the headers that Kivilevich provided, as seen in a search result from one of the sample files.

“Our assessment regarding the identity of the scraping API is based on the mention of ‘version’ in the last section of the data in combination with the word LinkedIn,” she explained.

What’s the Harm in Data Scraping?

“Data scraping is the process of extracting data from websites without the explicit permission of the individual whose data is being scraped,” Tom Kelly, president and CEO of IDX, explained to Threatpost. “It is often dangerous, because it leaves users’ personally identifiable information (PII) vulnerable and can lead to compromise of the individual’s privacy. Data scraping can open doors for cybercriminals and hackers to use this data to spearhead further cyberattacks and can give hackers the ability to perpetrate very effective spear-phishing attacks.”

But the solution isn’t as simple as blocking all data-scraping activity. There are plenty of legitimate uses for data scrapers, Andrew Useckas, CTO and co-founder at ThreatX, told Threatpost.

“Data scraping or web scraping is semi-malicious activity,” Useckas said. “Whether it is considered good or bad, or happening a lot or little, is subjective. For example, big companies scrape their competitors to get latest pricing and info, etc. Strictly speaking, it’s not bad unless it causes issues for the customer. Some customers like scraping as it increases their marketing footprint.”

LinkedIn isn’t alone. In April, it was revealed that the data of more than 533 million Facebook users was scraped in Sept. 2019. But LinkedIn’s public data is more valuable to threat actors, according to Hoala Greevy, founder and CEO of Paubox. The difference is the business intelligence that can be gleaned from LinkedIn.

“In today’s society, people keep their LinkedIn profiles studiously current,” Greevy explained to Threatpost. “Job title and current employer are especially manicured on LinkedIn. If this information can be scraped at scale, you can determine where everyone works and where everyone sits in the org chart.”

That information can help attackers build a detailed picture of how a business operates, which they can then go on to exploit.

“If a bad actor can map out an organization’s org chart, they can use that to launch display-name spoofing attacks, which are targeted phishing attacks that use the display name field of an email to impersonate a person of authority (i.e., CEO, CFO),” Greevy added.

That attack works because, as Greevy pointed out, 70 percent of employee email is read on a smartphone, where mail apps typically show only the sender’s display name and hide the underlying address.
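
As a rough illustration of how a mail filter might flag display-name spoofing, the Python sketch below compares the display name in a From header against a directory of senior staff and checks whether the underlying address matches. The directory contents, names, addresses and function names are all hypothetical and used only for illustration; a real filter would weigh many more signals.

```python
from email.utils import parseaddr

# Hypothetical directory: display names of senior staff mapped to their
# legitimate addresses. In practice this might come from an HR system.
KNOWN_EXECUTIVES = {
    "jane doe": "jane.doe@example.com",
    "john smith": "john.smith@example.com",
}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so 'Jane  DOE' matches 'jane doe'."""
    return " ".join(name.lower().split())


def is_display_name_spoof(from_header: str) -> bool:
    """Return True when the display name matches a known executive
    but the actual address is not the one on file for that person."""
    display_name, address = parseaddr(from_header)
    expected = KNOWN_EXECUTIVES.get(normalize(display_name))
    if expected is None:
        return False  # display name is not someone in the directory
    return address.lower() != expected
```

A message whose From header reads `Jane Doe <ceo-office@mail.example.net>` would be flagged under these assumptions, while `Jane Doe <jane.doe@example.com>` would not. An exact-match check like this would miss lookalike names, so it is best read as a starting point rather than a complete defense.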

When is Abuse as Bad as a Breach?

The worst-case scenario is that these massive troves of data are being aggregated and used by threat actors to make their attacks more personalized and potent, which is what appears to be happening to the lifted LinkedIn profile data of those 88,000 business owners whose data was just released into the cybercrime ecosystem.

“Although data scraping is less risky than hacking directly, in totality, it can leave unsuspecting folks vulnerable to being scammed,” Daniel Markuson, digital privacy expert at NordVPN, told Threatpost.

“As we saw in the Facebook scraped data-leak incident, the database did contain personal data like phone numbers and emails,” he added. “Essentially, if cybercriminals get ahold of this type of personal data, it can be used for better phishing attempts and various other forms of scams.”

Ultimately, it’s up to the users to protect themselves from attacks fueled by data scraping by keeping PII off social media, enabling two-factor authentication (2FA) on accounts and remaining vigilant about identifying potentially malicious communications, including texts, email, voice messages and in-platform messaging services.

“If privacy is a priority, social media is not your friend,” Markuson said. “The rise of biometric data scraping (some corporations build their facial-recognition databases using images scraped from Facebook and Instagram) demonstrates that social media is a huge threat to personal privacy.”

Markuson also recommended that users opt out of any data-collection asks.

“To take matters into their own hands, people can manually opt out of the data brokers, by contacting companies like BeenVerified, Acxiom and PeopleFinder directly, and opt out of their data-collection practices,” he said.
