How to Use OSINT to Detect Data Leaks and Breaches

Written by Liferaft | August 14, 2023

At LifeRaft, we know from talking to hundreds of security professionals that data leaks and breaches represent a growing concern.

A single incident can cost victim companies millions of dollars in litigation and customer compensation – not to mention the damage to the organization’s reputation.

These events can present a physical security risk, too.

That’s because data leaks often include a swath of sensitive personal information, all of which attackers could weaponize to harm a VIP or principal.

And given disclosed breaches exposed over 21 billion records last year, we can say with a high degree of confidence that someone in your organization has (or will soon have) sensitive data leaked online.

So how can security teams spot these incidents quickly to mitigate this threat?

Thankfully, open-source intelligence, or OSINT, can go a long way towards addressing this problem.

Let’s take a look at how to use OSINT to detect data leaks and breaches, as well as common issues that pop up when tackling these incidents.

What Is a Data Breach and How Does It Jeopardize Security?

A data breach is any security incident in which an unauthorized person views, copies, transmits, or uses protected information.

In many of these incidents, these unauthorized individuals will publish breached data on the web. Such information could be released publicly for free or sold to other criminal outfits.

In most cases, breached data sets contain a list of usernames, passwords, and email addresses. Large leaks, however, could include other types of personally identifiable information (PII) such as:

Real names
Phone numbers
Credit card information
Health records
Passport details
Social insurance numbers

These events can come with a big price tag for victim organizations.

According to IBM’s 2021 Cost of a Data Breach report, these events cost companies $4.2 million on average per incident.

The study also showed that slower response times resulted in substantially higher damages – with costs rising nearly 30% if security teams detected a breach after 200 days.

The average cost figure includes the explicit expenses of addressing such events, such as litigation costs, compensating impacted customers, and patching existing infrastructure. The study authors also attempted to measure the damage to the organization’s reputation and corresponding lost sales.

But as mentioned, breaches also present a risk to any individuals who have had their data compromised.

For example, leaked information can serve as a potential attack surface for scammers, hackers, online stalkers, and cyber activists.

Threat actors might exploit such information for harassing family members, spear-phishing campaigns or hacking other accounts owned by the individual.

Leaked personal information (redacted) of a senior executive on the darknet, including wi-fi passwords, home address, and family member details, discovered by Navigator.

Security teams should pay particular attention to exposed phone numbers.

Potential attackers can perform a reverse search to identify other accounts connected to that phone number – even if victims created these accounts under a pseudonym.

Even worse, attackers could attempt to take control of any leaked phone number via a SIM swap. If you use this number to receive two-factor authentication codes for various services, this technique could allow outsiders to access these accounts.

In other words, data leaks and breaches don’t just represent a problem for cybersecurity professionals.

They also present challenges for individuals operating on the physical side of security operations and any close protection detail.

Unfortunately, data breaches now represent an almost inevitable crisis.

In the same IBM report mentioned above, the study’s authors estimated any individual company now faces a 30% chance of suffering a data breach within the next 24 months.

Now multiply that probability across all of the firms you or your client do business with daily.

The question isn’t whether such an event could occur, but when it will happen next.
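
As a rough illustration of that multiplication, suppose every firm you deal with faces the same 30% two-year breach probability and that breaches at different firms are independent. Both are simplifying assumptions made for this sketch, not claims from the IBM report. The chance that at least one of n firms is breached is then 1 - (1 - 0.3)^n:

```python
def prob_at_least_one_breach(num_firms: int, p_single: float = 0.30) -> float:
    """Chance that at least one of num_firms suffers a breach.

    Assumes every firm has the same breach probability and that
    breaches are independent, which real vendors rarely are.
    """
    if num_firms < 0:
        raise ValueError("num_firms must be non-negative")
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    # Complement rule: 1 minus the chance that no firm is breached.
    return 1.0 - (1.0 - p_single) ** num_firms
```

Under these assumptions, ten firms give a probability of roughly 97% that at least one of them is breached within the period.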

OSINT Tools to Detect Data Leaks and Breaches

So how should security teams and protection details address this problem?

Early detection through open-source intelligence represents one of the best ways to mitigate this threat. Auditing what information is publicly available is also an essential step in any threat assessment.

After all, if you know what information is out there, it’s much easier to take the appropriate actions to safeguard your client organization or principal.

To help you get started in this process, we’ve highlighted a handful of (mostly) free OSINT tools to detect data leaks and breaches online.

1. DataBreaches.net

Mainstream news outlets only report large breaches, such as the Facebook or LinkedIn leaks. That makes it easy for security teams to overlook smaller but more frequent incidents.

Enter DataBreaches.net.

The site represents an outstanding resource for keeping tabs on breaches that don’t receive widespread attention. It gives you a bird’s eye view of the type of organization attackers have targeted and what kinds of PII they want.

Note, DataBreaches.net does not provide any direct links to breached data itself. You will have to hunt those down on your own.

Still, it can serve as an effective ‘first alert’ service for security teams.
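
One way to turn a breach-news source into a "first alert" is to screen incoming headlines against a watchlist of organizations you care about. The sketch below is illustrative only: the function name, headlines, and watchlist entries are hypothetical, and how you collect the headlines in the first place is left open.

```python
import re
from typing import Iterable


def match_watchlist(headlines: Iterable[str], watchlist: Iterable[str]) -> dict[str, list[str]]:
    """Map each watchlist term to the headlines that mention it.

    Matching is case-insensitive and on whole words, so "Acme"
    does not match "Acmestone".
    """
    hits: dict[str, list[str]] = {}
    headlines = list(headlines)
    for term in watchlist:
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        matched = [h for h in headlines if pattern.search(h)]
        if matched:
            hits[term] = matched
    return hits
```

Terms with no matching headline are left out of the result, so an empty dictionary means nothing on the watchlist was mentioned.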

2. Have I Been Pwned

Have I Been Pwned is a free resource to quickly determine if a particular email address was compromised in a data breach.

On the site’s home page, type an email address into the search box and press Enter. A display will appear at the bottom of the screen showing whether or not this account has been exposed in a leak.

Note, HIBP does not provide a link to any original breach files. That said, it does provide the name of any leaks where the email address was found. You can also find a short description of each breach and the type of data compromised.

Additionally, HIBP provides a free monitoring service. If your email address appears in a future breach, the site will send you a notification.
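
For checking more than a handful of addresses, HIBP also offers an API. The sketch below assumes version 3 of that API, where the breached-account endpoint expects an API key in a `hibp-api-key` header plus a user-agent string and answers with HTTP 404 when an address appears in no known breach. Check the current API documentation before relying on these details. The helper names and the app name are hypothetical.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

API_ROOT = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_request(email: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a breached-account lookup for one address."""
    account = urllib.parse.quote(email.strip(), safe="")
    url = API_ROOT + account + "?truncateResponse=false"
    headers = {
        "hibp-api-key": api_key,               # key issued by HIBP
        "user-agent": "example-leak-monitor",  # hypothetical app name
    }
    return urllib.request.Request(url, headers=headers)


def summarize_breaches(payload: str) -> list[dict]:
    """Reduce the JSON array of breaches to name, date, and data types."""
    return [
        {
            "name": b.get("Name"),
            "date": b.get("BreachDate"),
            "data": b.get("DataClasses", []),
        }
        for b in json.loads(payload)
    ]


def lookup(email: str, api_key: str) -> list[dict]:
    """Send the request; an HTTP 404 is treated as 'no known breaches'."""
    try:
        with urllib.request.urlopen(build_request(email, api_key), timeout=10) as resp:
            return summarize_breaches(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return []
        raise
```

Keeping request building and response parsing separate from the network call makes both easy to check offline before spending any API quota.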

3. DeHashed

DeHashed is a search engine for uncovering hacked databases and records.

Created for journalists and cybersecurity professionals, the site owners launched the service to help people keep their accounts secure and provide information on the latest breaches.

In many ways, DeHashed resembles HIBP. But the service allows users to go beyond simply looking up compromised email addresses. You can also search for other information like names, phone numbers, URLs, IP addresses, etc.

Users can run basic searches for free after creating an account, but detailed results require a paid subscription.

4. Intelligence X

Intelligence X represents a powerful search engine and data archive for OSINT investigators.

Unlike traditional search engines such as Google or Bing, the site lets you search for specific selectors, such as domains, email addresses, IPs, URLs, and Bitcoin addresses.

Intelligence X also pulls in results from a wider variety of sources. In addition to the surface web, users can retrieve data from the darknet, paste sites, file-sharing platforms, WHOIS records, and public data leaks.
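
To make the idea of a selector concrete, the following sketch guesses which selector type a search term belongs to. It is not how Intelligence X itself classifies input. The patterns are simplified assumptions: the Bitcoin check, for example, only looks at the general shape of legacy and bech32 addresses and does not verify checksums.

```python
import ipaddress
import re
from urllib.parse import urlparse

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")
DOMAIN_RE = re.compile(r"^(?=.{1,253}$)([A-Za-z0-9-]{1,63}\.)+[A-Za-z]{2,}$")
# Shape only: base58 legacy addresses or bech32 "bc1" addresses.
BITCOIN_RE = re.compile(r"^([13][1-9A-HJ-NP-Za-km-z]{25,34}|bc1[02-9ac-hj-np-z]{11,71})$")


def classify_selector(term: str) -> str:
    """Return 'ip', 'url', 'email', 'bitcoin', 'domain', or 'unknown'."""
    term = term.strip()
    try:
        ipaddress.ip_address(term)  # accepts IPv4 and IPv6 literals
        return "ip"
    except ValueError:
        pass
    try:
        parsed = urlparse(term)
    except ValueError:  # malformed input such as an unclosed IPv6 bracket
        parsed = None
    if parsed and parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if EMAIL_RE.match(term):
        return "email"
    if BITCOIN_RE.match(term):
        return "bitcoin"
    if DOMAIN_RE.match(term):
        return "domain"
    return "unknown"
```

Classifying terms up front helps keep a watchlist consistent, since the same person or organization is often tracked through several selector types at once.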

5. Nuclear Leaks

Nuclear Leaks is no longer regularly updated. That said, it still represents a great list of historical breaches.

The site provides the name of the targeted organization, the date the breach occurred, the hashing algorithm, and the number of records compromised.

But because the group’s founders created the site for education and awareness, they do not provide links to any breached data.

How to Access Breached Data Directly

We’re reluctant to post any links to breached data directly. That said, leaked information is often published and archived across the web.

Here are the most common places where you can find such data sets:

Telegram. Telegram is a popular instant messaging service with hundreds of millions of active users. While most of the activity there is benign, you can find some communities dedicated to selling and sharing breached data. Typically, hackers will publish dumps after they fail to find a buyer on the dark web. From time to time, though, leaks appear first on a Telegram channel.
Paste sites. Paste sites serve as online content-hosting services where users can store and share plain text files. Ordinary developers use these websites frequently to share code snippets. But criminals will exploit more nefarious platforms, such as DeepPaste, to advertise breached data sets for sale or dump stolen information.
The dark web. Anonymity on the dark web makes it the perfect place to publish stolen information. Dark web marketplaces serve as the go-to spot for criminal outfits to buy and sell breached data. In their advertisement listings, sellers will typically reveal where the data came from and a preview of the type of information available. Dark web forums, in contrast, act more like paste sites where users dump breached data sets.
Non-profits. Many non-profit groups have set up online clearinghouses to publish data leaks and classified material. WikiLeaks, for example, represents one of the most renowned of these publications. In recent years, though, dozens of similar services have cropped up. These sites serve as repositories for many publicly available data breaches.

What You Should Know About Using Breached Data

Data leaks and breaches represent valuable sources during investigations. Analysts can mine these data sets to uncover new leads or confirm existing information about a person of interest.

And because everyone can access this information online, it technically falls under the umbrella of OSINT.

That said, breached data comes with risks analysts and security teams should consider.

Understand the laws and regulations in your area. For instance, if your company or client operates in the European Union, GDPR rules come into play. In the United States, analysts could have to report certain types of illegal content they uncover in their investigation. In other jurisdictions, courts may throw out evidence obtained through breached data.
Check your organization’s policy on collecting and analyzing breached data. Many people are uncomfortable with businesses collecting personal data, especially stolen data, even if this information is out in the open online. For this reason, many companies forbid analysts from accessing breached data as it could damage public relations.
Consider operational security risks. Before downloading any breached data, use protective measures to safeguard your identity and workstation. Standard safeguards include using a virtual machine, VPN service, and privacy-conscious browser. Download files directly into cloud storage or a sandbox environment. After all, you don’t always know the source of this information. And typically, breached data is only sourced from the dodgier corners of the web.
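
If your policy does permit working with a breached data set, one way to limit exposure is to extract only what you need, such as whether addresses at your own domain appear, and to keep masked values rather than raw records. The sketch below assumes a plain-text dump processed line by line inside your sandbox. The function names, the domain, and the masking rule are all illustrative assumptions.

```python
import re
from typing import Iterable, Iterator

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_email(address: str) -> str:
    """Keep only the first character of the local part, plus the domain."""
    local, _, domain = address.partition("@")
    return f"{local[:1]}***@{domain}"


def find_org_addresses(lines: Iterable[str], org_domain: str) -> Iterator[str]:
    """Yield masked, de-duplicated addresses at org_domain found in lines."""
    suffix = "@" + org_domain.lower()
    seen: set[str] = set()
    for line in lines:
        for match in EMAIL_RE.findall(line):
            address = match.lower()
            if address.endswith(suffix) and address not in seen:
                seen.add(address)
                yield mask_email(address)
```

Because the function accepts any iterable of lines, it can be fed an open file handle, so a large dump never has to be loaded into memory or copied elsewhere.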

The Bottom Line on Data Leaks

Data leaks are not only a problem for cybersecurity professionals. Given the many ways attackers can exploit breached data, they also present a risk to physical safety.

For that reason, it makes sense to keep tabs on what information is floating around the web from previous breaches.

And using OSINT to detect data leaks represents one of the best ways to mitigate this threat.
