Data loss prevention

What is data loss prevention and why should you care? Imagine that you are the information security manager at a large hospital. One of your primary responsibilities is to prevent data loss and protect the sensitive personal and medical information of the hospital’s patients.

One day, you receive a report that an employee has accidentally left a laptop containing patient data on a bus. The laptop was not encrypted and could potentially be accessed by anyone who finds it.

You immediately go into crisis mode, trying to assess the extent of the data loss and determine the best course of action to take. You know that this data could be used for identity theft or other malicious purposes and that it could also result in significant fines for the hospital if it is not properly secured.

You work with your team to determine the specifics of the data that was lost, including which patients’ information was on the laptop and what types of data were involved. You also work with the relevant authorities to try to locate the laptop and recover it, if possible.

In the meantime, you must also inform the affected patients about the data loss and offer them resources to help protect themselves from identity theft. You also work with your legal team to determine the necessary steps to take to report the data loss and minimize the hospital’s liability.

Overall, this scenario demonstrates the importance of data loss prevention measures in organizations. This post looks at data loss prevention in detail and at how Google Workspace can help you put it in place and avoid a scenario like the one above.

What’s Data Loss Prevention (DLP)?

Data Loss Prevention is a system established to detect and monitor data breaches. It also works to stop unauthorized transfers and sharing of data from your company or organization’s Google accounts.

Use data loss prevention (DLP) policies to detect sensitive information, such as credit card numbers, in email and Google Drive files. You can set up policy-based actions and block users from sharing email and Drive files when sensitive content is detected.

Supported editions/plans for this feature: 

  • Enterprise
  • Education Fundamentals
  • Education Standard
  • Teaching and Learning Upgrade
  • Education Plus

Data loss prevention benefits organizations by helping them detect and prevent sensitive data loss, leaks, exfiltration, and breaches.

Why Data Loss Prevention?

Using data loss prevention (DLP), you can create and apply rules to control the content that users can share in files outside the organization. DLP gives you control over what users can share and prevents unintended exposure of sensitive information such as credit card numbers or identity numbers.

You can use DLP to block users from sharing sensitive content and information, such as confidential data and customer Social Security numbers, or to warn them before they do.

As an admin, you can also use the system to receive alerts regarding policy violations and DLP incidents. You can also use it to investigate information on policy violations.

DLP rules can help prevent the following types of data security incidents:

  • Data breach – a security incident involving unauthorized third-party access to an organization’s critical data. Cybercriminals typically breach an organization’s security to sell sensitive data. Data breaches can be carried out using various methods such as social engineering, malware, and hacking.
  • Data exfiltration – the unauthorized and intentional transfer of critical data out of an organization’s perimeter. Also referred to as data theft, exfiltration is mostly done by malicious insiders acting as authorized employees.
  • Data leak – an unauthorized but unintentional transmission of sensitive data from an organization’s perimeter. Leaked sensitive data may be exposed publicly or used maliciously by third parties to commit a cybercrime such as identity theft.
  • Data loss – any event or process that leads to data being corrupted, damaged, or completely destroyed, and therefore inaccessible or unusable.

How to Apply Data Loss Prevention within Drive

You can create and apply rules to control the content that users can share in files outside the organization. 

Google Drive DLP works well in conjunction with Drive’s new Label feature, which helps you improve the security of your data.


DLP rules trigger scans of files for sensitive content and prevent users from sharing that content. Rules determine the nature of DLP incidents, and incidents trigger actions, such as the blocking of specified content.

You can allow controlled sharing for members of a domain, organizational unit, or group.

The flow of Data Loss Prevention is as follows: 

  • You define DLP rules. These rules define which content is sensitive and should be protected. DLP rules apply to both My Drive and Shared drives.
  • DLP scans content for DLP rule violations that trigger DLP incidents.
  • DLP enforces the rules you defined and violations trigger actions, such as alerts.
  • You are alerted of DLP rule violations.
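
To make the flow above more concrete, here is a minimal sketch in Python. It is not how Google Workspace implements DLP and it does not call any Google API. The `Rule` and `Incident` classes, the `block_external_sharing` action name, and the card number pattern are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    pattern: re.Pattern  # what counts as sensitive content (illustrative)
    action: str          # hypothetical action name


@dataclass
class Incident:
    rule_name: str
    file_name: str
    action: str


def scan(files, rules):
    """Step 2: scan each file's text and record an incident per rule violation."""
    incidents = []
    for file_name, text in files.items():
        for rule in rules:
            if rule.pattern.search(text):
                incidents.append(Incident(rule.name, file_name, rule.action))
    return incidents


def build_alerts(incidents):
    """Steps 3 and 4: turn each incident into an alert line for the admin."""
    return [
        f"ALERT: {i.file_name} violated '{i.rule_name}' -> {i.action}"
        for i in incidents
    ]


# Step 1: define a rule. The pattern is a simplified 16-digit card number shape.
rules = [
    Rule(
        name="Card numbers",
        pattern=re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
        action="block_external_sharing",
    )
]

files = {
    "invoice.txt": "Card: 4111 1111 1111 1111",
    "notes.txt": "Lunch at noon",
}

incidents = scan(files, rules)
alerts = build_alerts(incidents)
```

In this sketch only `invoice.txt` produces an incident, because it is the only file whose text matches the rule.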

Implementation of Data Loss Prevention

  1. Getting to know the data you have – It is hard to protect information if you don’t know what type of information you have, let alone where it is stored. That’s why it’s important to start by turning your data into information. This is not an easy task, as it requires some business analysis to understand which sets of data (e.g. documents, slides, and sheets) contain which type of information. For example, payslip documents are HR information, a customer sheet is sales information, and project kickoff slides are information related to a specific project.
  2. Classifying information into categories – Once you know what type of information your business possesses, you can move on to organizing this information. Classifying is basically categorizing your information based on how confidential it is. Then, per category, you would need to define a location with the necessary security settings relevant to the information stored there.
  • You can apply a label to an OU or a Google group, so that every document created by a user in that OU or group gets the label. For example, the Sales team is in the Sales OU and the Drive label “sales information” is applied to that OU. Eddie is in the Sales OU, so when he creates a document it is labeled “sales information”.
  • You can apply a label based on content, so that a document containing certain content is labeled as a specific type of information. For example, a DLP rule labels every document containing “Confidential”, “Project”, or “NDA” as “Confidential”. Hank opens a spreadsheet to summarize his sales. Because the word NDA appears in the spreadsheet, it is labeled “Confidential”.
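
The two labeling approaches above can be sketched in a few lines of Python. This is only an illustration of the logic, not the Drive labels feature itself. The `OU_LABELS` mapping, the `labels_for` function, and the keyword pattern are hypothetical names chosen to mirror the Eddie and Hank examples.

```python
import re

# Hypothetical mapping of organizational unit -> default label.
OU_LABELS = {"Sales": "sales information"}

# Whole-word match, so that "Monday" does not trigger on "nda".
CONFIDENTIAL_KEYWORDS = re.compile(
    r"\b(confidential|project|nda)\b", re.IGNORECASE
)


def labels_for(owner_ou, text):
    """Return the labels a new document would receive in this sketch."""
    labels = []
    # Label by OU: every document created in the OU gets the OU's label.
    if owner_ou in OU_LABELS:
        labels.append(OU_LABELS[owner_ou])
    # Label by content: any matching keyword marks the document Confidential.
    if CONFIDENTIAL_KEYWORDS.search(text):
        labels.append("Confidential")
    return labels


eddie_doc = labels_for("Sales", "Q3 pipeline review")
hank_sheet = labels_for("Engineering", "Summary of sales made under NDA")
```

Matching on whole words is a design choice in this sketch. A plain substring search would also flag unrelated words that happen to contain a keyword.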

Applications and file types scanned by Data Loss Prevention

What are the different file types that will be scanned by the DLP rules?

Applications scanned include:

  • Sheets
  • Docs
  • Slides
  • Forms File Upload – files submitted to Forms file upload questions are scanned by DLP. Responders may be warned or blocked from submitting their responses if they attempt to upload sensitive content.

Comments in Docs, Sheets, Slides, and Drawings and comment email notifications are not scanned by DLP. Also, Sites and Forms (other than File Upload) are not supported with DLP.

File types scanned for content include:

  • Document file types: .doc, .docx, .html, .pdf, .ppt, .wpd, .xls, .xlsx, .xml
  • Image file types: .bmp, .eps, .fif, .gif, .img_for_ocr, .jpeg, .png, .ps, .tif
  • Compressed file types: .7z, .bzip, .gzip, .rar, .tar, .zip
  • Custom file types: .hwp, .kml, .kmz, .sdc, .sdd, .sdw, .sxc, .sxi, .sxw, .ttf, .wml, .xps

Video and audio file types are not scanned.
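
If you want to check a list of uploaded file names against the extensions above, a small helper like the following could do it. It is a sketch that simply encodes the lists from this post. The `scanned_category` function is a hypothetical name, and native Docs, Sheets, and Slides files are not covered because they are not identified by extension.

```python
from pathlib import PurePath
from typing import Optional

# Extensions copied from the lists in this post, grouped by category.
SCANNED_EXTENSIONS = {
    "document": {".doc", ".docx", ".html", ".pdf", ".ppt", ".wpd",
                 ".xls", ".xlsx", ".xml"},
    "image": {".bmp", ".eps", ".fif", ".gif", ".img_for_ocr", ".jpeg",
              ".png", ".ps", ".tif"},
    "compressed": {".7z", ".bzip", ".gzip", ".rar", ".tar", ".zip"},
    "custom": {".hwp", ".kml", ".kmz", ".sdc", ".sdd", ".sdw", ".sxc",
               ".sxi", ".sxw", ".ttf", ".wml", ".xps"},
}


def scanned_category(filename: str) -> Optional[str]:
    """Return the category of a scanned file type, or None if it is not listed."""
    suffix = PurePath(filename).suffix.lower()
    for category, extensions in SCANNED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None


report_category = scanned_category("report.PDF")
video_category = scanned_category("demo.mp4")
```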

Creating DLP for Drive rules and custom content detectors

Using DLP for Drive, you can create complex rules that combine triggers and conditions. You can also specify an action that sends a message to users telling them that their content has been blocked.

The steps include:

  1. Plan your rules – DLP allows you to create rules to protect sensitive content. Before creating these rules, decide on the conditions you will add to the rules.

  2. Create a custom detector – Do this only if you need to use a custom detector in your rule conditions.

  3. Create rules 

  4. Tell users about the new rule – Set user expectations as to behavior and consequences of the new rule.

Applying Data Loss Prevention in Gmail

Gmail data loss prevention (DLP) lets you use predefined content detectors when scanning inbound or outbound emails. Google specifically designed these predefined detectors to locate sensitive data, such as credit card, Social Security, or passport numbers.

Like a standard Gmail content compliance setting, you can use DLP detectors to trigger automatic responses. These include quarantining, rejecting, or modifying a message. You can also combine predefined detectors with keywords or regular expressions to create more sophisticated content compliance policies.

What messages does Google Data Loss Prevention scan?

Which messages DLP for Gmail scans depends on company policy and the desired level of prevention. The Google Workspace administrator sets this, choosing a DLP policy that covers one or several types of communication:

  • Inbound emails from outside the list of domains tied to the enterprise;
  • Outbound emails outside the enterprise’s network;
  • Internal emails received from within the enterprise’s domain; and
  • Internal emails sent within the enterprise’s network.
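
The difference between these types of communication comes down to whether the sender and the recipient belong to your organization’s domains. The sketch below shows that logic for a single sender and recipient pair. The `ORG_DOMAINS` set and the `classify` function are hypothetical, and the single "internal" result stands in for both internal bullets above.

```python
# Hypothetical list of domains tied to the enterprise.
ORG_DOMAINS = {"example.com"}


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def classify(sender: str, recipient: str) -> str:
    """Classify one sender/recipient pair by direction."""
    sender_internal = domain_of(sender) in ORG_DOMAINS
    recipient_internal = domain_of(recipient) in ORG_DOMAINS
    if sender_internal and recipient_internal:
        return "internal"
    if sender_internal:
        return "outbound"
    if recipient_internal:
        return "inbound"
    return "external"  # neither party is in the organization


internal = classify("alice@example.com", "bob@example.com")
outbound = classify("alice@example.com", "carol@partner.org")
inbound = classify("carol@partner.org", "alice@example.com")
```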

What content is detected?

With DLP for Gmail, the Workspace admin sets what content is to be detected by the trigger system. In all, there are three types of content that can serve as triggers: exact, context, or message metadata. These include the following.

  • Specific expression triggers: any words, specific phrases, or combinations of words;
  • Metadata attribute triggers: message size, source IP, message authentication, and whether the communication uses TLS encryption;
  • Predefined content match triggers: country-specific and international detector patterns, including credit card numbers, passport numbers, Social Security numbers, and more.

For each trigger, the system runs an analysis of the content of the data (for example, scanning for 9 digits of a Social Security number). Then it analyzes the context (looking for specific words such as SSN, social, social security, etc). To add content detectors that are not currently supported, admins have to contact support and request the detector’s inclusion.
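
The two-stage analysis described above (content first, then context) can be illustrated with a short Python sketch. This is not Google’s detector. The pattern, the context word list, the 40-character window, and the "high" and "low" confidence values are assumptions chosen for the example.

```python
import re

# Content check: nine digits, optionally grouped as 3-2-4 with hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b")

# Context check: words that suggest the number really is an SSN.
CONTEXT_WORDS = re.compile(r"\b(ssn|social security|social)\b", re.IGNORECASE)

# Hypothetical number of characters to inspect on each side of a match.
CONTEXT_WINDOW = 40


def find_ssn_matches(text):
    """Return (matched_text, confidence) pairs for SSN-like numbers."""
    results = []
    for match in SSN_PATTERN.finditer(text):
        start = max(0, match.start() - CONTEXT_WINDOW)
        window = text[start:match.end() + CONTEXT_WINDOW]
        confidence = "high" if CONTEXT_WORDS.search(window) else "low"
        results.append((match.group(), confidence))
    return results


with_context = find_ssn_matches("Employee SSN: 123-45-6789")
without_context = find_ssn_matches("Tracking number 123456789 shipped")
```

Here the nine digits alone produce a low-confidence match, while the same shape next to the word "SSN" produces a high-confidence one. That is the point of combining content with context: it cuts down on false positives from order numbers and similar values.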


What happens when Google Data Loss Prevention flags content?

When DLP for Gmail detects sensitive information, it executes one of the following actions.

  • Modify message: this might be bypassing filters, deleting attachments, including additional recipients, or requiring secure (encrypted) transport;
  • Reject sending or receipt; and
  • Quarantine: send the message to admins who review, allow, and deny communications containing sensitive data.
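
As a rough illustration of how these three actions differ, the sketch below models them in Python. The `Message` class, the `apply_action` function, and the choice of "delete attachments and require TLS" for the modify action are hypothetical. In practice these actions are configured in the Admin console rather than written as code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    subject: str
    attachments: tuple = ()
    require_tls: bool = False


def apply_action(message, action, quarantine_queue):
    """Return the message to deliver, or None if it must not be delivered."""
    if action == "modify":
        # One possible modification: drop attachments and require secure transport.
        return replace(message, attachments=(), require_tls=True)
    if action == "reject":
        return None
    if action == "quarantine":
        # Held for an admin to review, then allow or deny.
        quarantine_queue.append(message)
        return None
    raise ValueError(f"Unknown action: {action}")


queue = []
original = Message("Payroll export", attachments=("ssn.xlsx",))

modified = apply_action(original, "modify", queue)
rejected = apply_action(original, "reject", queue)
held = apply_action(original, "quarantine", queue)
```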

What if we don’t have a Data Loss Prevention rule in place?

The loss of sensitive data and other forms of enterprise information can lead to significant financial losses and reputational damage.

Without rules in place, your organization faces the following risks:

  • You won’t know where your company’s confidential data is stored, where it is being sent, or who is accessing it.
  • Your organization’s data won’t be protected against theft or accidental disclosure of sensitive information by employees and partners.
  • You won’t be able to monitor your organization for inappropriate employee conduct or keep forensic data on security events as evidence.

What if our organization is on a Google Workspace Plan that does not support DLP?

If your organization is on a Google Workspace plan that does not support Data Loss Prevention, do not worry. You may be able to achieve data leak prevention within Gmail through advanced configuration of your Google Workspace tenancy. This can be achieved by implementing a content compliance rule in your Admin console for words or content you don’t want shared outside your organization. You can reach out to our team for advice on how to implement this for your organization.


You can never be too safe when it comes to data security and data loss prevention. Set up Data Loss Prevention properly for your Google Workspace apps, define your rules carefully, and you can rest easier knowing that Google is helping to protect your sensitive data.