SensitiveInfoCheck
Available in Liquibase 5.1 and later, the SensitiveInfoCheck Policy check helps you identify and prevent sensitive personal identifiable information (PII) from being deployed through your database changelogs. This check employs the Phileas library of Identifiers to scan INSERT and UPDATE SQL statements in your changesets for common PII identifiers, such as social security numbers, credit card numbers, email addresses, and phone numbers. When the check detects potential PII, it flags the changelog and can block deployment, protecting your organization from accidentally committing sensitive data to version control or deploying it to your databases. You can customize which PII identifiers to detect and configure whether violations should trigger warnings or hard stops in your deployment pipeline. This automated detection also supports compliance efforts by providing audit-ready evidence that your team is actively monitoring and preventing PII exposure in database changes.
Scope | Database |
| Any SQL database |
Known limitations
The SensitiveInfoCheck has the following detection limitations:
Phone numbers:
The
PHONE_NUMBERidentifier does not detect vanity phone numbers containing letters instead of digits (example: 1-800-FLOWERS).Email addresses:
The
EMAIL_ADDRESSidentifier does not detect email addresses missing a top-level domain (TLD), even when you setvalidateTLD=falseExample:
user@domainwithout.com,.org, etc..MAC addresses:
The
MAC_ADDRESSidentifier detects standard colon-separated (00:1B:44:11:3A:B7) and hyphen-separated (00-1B-44-11-3A-B7) formats but does not recognize Cisco dot notation (001B.4411.3AB7).Address suffixes: The underlying Phileas library requires a street suffix for address detection. Addresses without suffixes, such as Street, Way, Blvd, etc, will not be identified as PII, even though they are valid USA addresses.
Before you begin
Create a Check Settings file
Update any existing Check Settings file to 5.1+ by running liquibase checks show --auto-update=ON to ensure the SensitiveInfoCheck appears on the list of checks.
(Maven users only) Add the
liquibase-checksdependency to yourpom.xmlfile. See Add extensions with Maven for more information.Ensure you have Java 17+ installed. If you used the Liquibase Installer, Java is included automatically. Otherwise, you must install Java manually.
Procedure
Enable the SensitiveInfoCheck
This check is disabled by default. To enable it, run the checks enable command:
liquibase checks enable --check-name=SensitiveInfoCheck
Create a Checks settings file if it isn't created already.
If you don't already have a Checks settings file created, this message will appear:
WARNING: No default checks-settings file detected. Would you like to create and automatically use 'liquibase.checks-settings.conf' ?
[1] Yes, create and use default check-settings file in the current working directory
[2] Yes, create and use default check-settings file at a path I will specify
[3] Just create default check-settings file in the CWD, do not use, and exit command
[n] No, do not create default check-settings file, and exit command
Choose option one or two, depending on where you want your Checks settings file to be located.
If you already have an existing Checks settings file, this message will appear:
ALERT: A valid Liquibase Checks Settings file already exists at '[path/to/liquibase.checks-settings.conf]'.
Do you want to:
[O] Overwrite the existing file
[C] Continue using the existing file
Choose either option to add the new check to the Checks settings file you want to use.
If you get an error message that states: One or more specified checks not found: 'sensitiveInfoCheck'.
Update the file to 5.1+ by running the command below to ensure the
SensitiveInfoCheckappears on the list of checks.liquibase checks show --auto-update=ONRun the command in step 1 again.
Choose the PII identifier you want to search for.
In this step, we will choose a Phileas Identifier to flag which personal information the check triggers. We recommend that you name your check based on the type of identifier that you want to flag, so that you know what sensitive information is being committed. If you are flagging multiple identifiers, ensure the title reflects all of them. The table below can help you group your identifiers into bundles to facilitate the search for specific items in your database.
Examples:
If you only want an individual identifier to trigger, you could name it:
SSN_SensitiveInfoCheckto flag that Social Security Numbers exist in your changelog.If you want to flag a bundle of identifiers, you could name it:
Financial_Info_SensitiveInfoCheckto flag that any financial information such as credit card numbers or routing numbers exist in your changelog.If you want to flag all identifiers, you could name it:
AllSensitiveInfoCheckto flag that any identifiers exist in your changelog.
The example message appears in the terminal.
Example message
The check 'SensitiveInfoCheck' can not be enabled directly because one or more fields does not have a default value.
Create a copy of this check and initiate the customization workflow.
Give your check a short name for easier identification (up to 64 alpha-numeric characters only) [SensitiveInfoCheck1]:
Once you add the desired name of the copied check in the CLI, Liquibase will make a copy of the SensitiveInfoCheck, and you will see the example output appear in the terminal.
Example output
New check 'SSNSensitiveInfoCheck' created from 'SensitiveInfoCheck'
Personal Information
Identifier | What It Detects | Example |
|---|---|---|
SSN | US Social Security Numbers in XXX-XX-XXXX format | 123-45-6789 |
FIRST_NAME | Common first names from dictionary | John |
SURNAME | Surnames from dictionary | Berezenskyi |
AGE | Age references in text | "Patient is 67 years old" |
DATE | Dates in various formats (MM/DD/YYYY, Month Day Year) | 02/29/2023 |
Contact Information
Identifier | What It Detects | Example |
|---|---|---|
EMAIL_ADDRESS | Email addresses in standard format | |
PHONE_NUMBER | Phone numbers in various formats | 555-123-4567 |
PHONE_NUMBER_EXTENSION | Phone extensions (ext, x) | "Call ext 1234" |
Location Data
Identifier | What It Detects | Example |
|---|---|---|
ZIP_CODE | US ZIP codes in 5-digit or ZIP+4 format | 90210 |
CITY | US city names from dictionary | New York |
STATE | US state full names | California |
STATE_ABBREVIATION | US state two-letter codes | CA |
COUNTY | US county names | Los Angeles County |
STREET_ADDRESS | Street addresses with numbers and suffixes | 123 Main Street |
Financial Information
Identifier | What It Detects | Example |
|---|---|---|
CREDIT_CARD | Credit card numbers (Visa, MasterCard, Amex, Discover) | 4532015112830366 |
BANK_ROUTING_NUMBER | 9-digit US bank routing numbers | 021000021 |
IBAN_CODE | International Bank Account Numbers | GB33BUKB20201555555555 |
BITCOIN_ADDRESS | Bitcoin addresses (P2PKH, P2SH, Bech32) | 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa |
CURRENCY | Currency amounts with symbols ($, €, £) | $1,234.56 |
Government IDs
Identifier | What It Detects | Example |
|---|---|---|
DRIVERS_LICENSE | Driver's license numbers | DL123456789 |
PASSPORT_NUMBER | Passport numbers (9 digits or alphanumeric) | 123456789 |
VIN | Vehicle Identification Numbers (17 characters) | 1HGBH41JXMN109186 |
Medical Information
Identifier | What It Detects | Example |
|---|---|---|
HOSPITAL | Hospital names from dictionary | Mayo Clinic |
HOSPITAL_ABBREVIATION | Hospital abbreviations (MGH, CHOP, etc.) | MGH |
PHYSICIAN_NAME | Physician names with Dr. title | Dr. Anderson |
Network Information
Identifier | What It Detects | Example |
|---|---|---|
IP_ADDRESS | IPv4 and IPv6 addresses | 192.168.1.100 |
MAC_ADDRESS | MAC addresses in colon-separated format | 00:1B:44:11:3A:B7 |
Other Identifiers
Identifier | What It Detects | Example |
|---|---|---|
TRACKING_NUMBER | Shipping tracking numbers (UPS, FedEx, USPS) | 1z1234567890123456 |
URL | URLs with various protocols (http, https, ftp) |
Set the Severity level.
This message will appear in the terminal:
Set the Severity to return a code of 0-4 when triggered. (options: 'INFO'|0, 'MINOR'|1, 'MAJOR'|2, 'CRITICAL'|3, 'BLOCKER'|4)? [INFO]:
In this example, we will set the severity to 0.
Set the Identifiers you want to search the Changelog for.
This message will appear in the terminal:
Set 'IDENTIFIERS' as a comma-separated list of identifier types (options: SSN, ZIP_CODE, CREDIT_CARD, EMAIL_ADDRESS, PHONE_NUMBER, PHONE_NUMBER_EXTENSION, IP_ADDRESS, MAC_ADDRESS, FIRST_NAME, SURNAME, AGE, DATE, CITY, STATE, STATE_ABBREVIATION, COUNTY, STREET_ADDRESS, BANK_ROUTING_NUMBER, BITCOIN_ADDRESS, CURRENCY, DRIVERS_LICENSE, HOSPITAL, HOSPITAL_ABBREVIATION, IBAN_CODE, PASSPORT_NUMBER, PHYSICIAN_NAME, TRACKING_NUMBER, URL, VIN, IDENTIFIER):
In this example, we will check for SSN in our changelog by typing SSN into the CLI and pressing enter. If you want to set the identifier to any of the checks that are in the Optional additional identifier configurations table below, be aware there are optional configurations you will need to set to use these identifiers successfully. Note: This policy check must include one or multiple identifiers. Use a comma-separated list to specify more than one. If you choose to flag multiple identifiers, we recommend grouping them by category, such as Financial Information or Location Data. It is easier to manage and analyze them this way instead of flagging all of them at once.
Optional additional identifier configurations
If you choose one of the identifiers below, please note that you will have configuration options to select from.
PII Type | Configuration Option | Default | Description |
|---|---|---|---|
ZIP_CODE |
|
| Enables census database validation |
CREDIT_CARD |
|
| Enables Luhn algorithm validation |
|
| Excludes timestamp patterns | |
EMAIL_ADDRESS |
|
| Uses strict RFC validation |
|
| Validates top-level domains | |
DATE |
|
| Validates calendar dates |
IBAN_CODE |
|
| Uses spaced IBAN format (limits to 22-char IBANs) |
IDENTIFIER |
| N/A | Accepts any regex pattern |
Set the message you want to display when the check triggers.
You will see the default message in the terminal. The default message pinpoints the location of the detected SSN, which is usually all you need. You can customize it later to show more or less detail if you prefer.
Set 'MESSAGE' [Policy violation: raw {filter_type} detected in {stmt_type} at statement #{statement_number}, line {line_number}, positions {start}-{end}.{file_line_info}]:
Press Enter to save the message formatting.
Results
The command is enabled successfully.
Customization complete. Review the table below to confirm your changes.
+-----------------------+-----------+--------+----------+----------+--------------------------------+--------------------------------+
| Short Name | Scope | Type | Status | Severity | Customization | Description |
+-----------------------+-----------+--------+----------+----------+--------------------------------+--------------------------------+
| SensitiveInfoCheck | changelog | sql, | disabled | 0 | IDENTIFIERS = null | This check triggers when PII |
| | | xml, | | | MESSAGE = Policy violation: | (Personally Identifiable |
| | | yaml, | | | raw {filter_type} detected in | Information) or PHI (Protected |
| | | json | | | {stmt_type} at statement | Health Information) are |
| | | | | | #{statement_number}, line | detected in changelogs or in |
| | | | | | {line_number}, positions | your database. Copy this check |
| | | | | | {start}-{end}.{file_line_info} | to configure the subset of |
| | | | | | | 'identifiers' you wish to |
| | | | | | | detect. |
+-----------------------+-----------+--------+----------+----------+--------------------------------+--------------------------------+
| SSNSensitiveInfoCheck | changelog | sql, | enabled | 0 | IDENTIFIERS = SSN | This check triggers when PII |
| | | xml, | | | MESSAGE = Policy violation: | (Personally Identifiable |
| | | yaml, | | | raw {filter_type} detected in | Information) or PHI (Protected |
| | | json | | | {stmt_type} at statement | Health Information) are |
| | | | | | #{statement_number}, line | detected in changelogs or in |
| | | | | | {line_number}, positions | your database. Copy this check |
| | | | | | {start}-{end}.{file_line_info} | to configure the subset of |
| | | | | | | 'identifiers' you wish to |
| | | | | | | detect. |
+-----------------------+-----------+--------+----------+----------+--------------------------------+--------------------------------+
Liquibase command 'checks enable' was executed successfully.
Run the checks run command
After enabling and customizing the SensitiveInfoCheck, run it against your changelog to detect the identifiers you've configured. Your customized check is stored in your checks settings file.
To run checks with a custom settings file, use the --checks-settings-file parameter. If you don't specify this parameter, Liquibase uses the default liquibase.checks-settings.conf file. You can also use the --changelog-file flag to target a specific changelog.
Checks settings file example:
liquibase checks run --checks-settings-file=path_to_my_file_name.yaml
Checks settings file and specific changelog file example:
liquibase checks run --checks-settings-file=<path_to_my_file_name>.yaml --changelogfile --<your_changelog_here>.sql
Note: Be sure to replace path_to_my_file_name.yaml with the path to the settings file where you included the new policy check. Also be sure to replace your_changelog_here with your real changelog name.
Example output - No violation
Example output - Check violation found
Notice that the customized SSNSensitiveInfoCheck flagged two violations, and their messages tell you where the violation occurred.CHANGELOG CHECKS
----------------
Checks completed validation of the changelog and found the following issues:Check Name: Sensitive Info Check| PII and PHI Check (SSNSensitiveInfoCheck)
Changeset ID: raw
Changeset Filepath: changelogIncludeAllIdentifiers.sql
Check Severity: INFO (Return code: 0)
Message: Policy violation: raw SSN detected in INSERT at statement #1, line 1, positions 63-74.Check Name: Sensitive Info Check| PII and PHI Check (SSNSensitiveInfoCheck)
Changeset ID: raw
Changeset Filepath: changelogIncludeAllIdentifiers.sql
Check Severity: INFO (Return code: 0)
Message: Policy violation: raw SSN detected in INSERT at statement #25, line 2, positions 10-19.Changesets Validated: in changelogIncludeAllIdentifiers.sql
ID: raw; Author: includeAllChecks run against each changeset:
Sensitive Info Check| PII and PHI Check (Short names: SSNSensitiveInfoCheck)
liquibase checks run --checkname < yourCustomizedCheckName > --changelogfile --< yourChangelogHere >.sql
liquibase checks run --checks-settings-file=my_file_name.yamlAnalyze the results in the terminal output.
When the check finds an item flagged by the customized SensitiveInfoCheck in the changelog, you will see an output similar to this from the terminal:
CHANGELOG CHECKS
----------------
Checks completed validation of the changelog and found the following issues:
Check Name: Sensitive Info Check| PII and PHI Check (SSNSensitiveInfoCheck)
Changeset ID: raw
Changeset Filepath: changelogIncludeAllIdentifiers.sql
Check Severity: INFO (Return code: 0)
Message: Policy violation: raw SSN detected in INSERT at statement #1, line 1, positions 63-74.
Check Name: Sensitive Info Check| PII and PHI Check (SSNSensitiveInfoCheck)
Changeset ID: raw
Changeset Filepath: changelogIncludeAllIdentifiers.sql
Check Severity: INFO (Return code: 0)
Message: Policy violation: raw SSN detected in INSERT at statement #25, line 2, positions 10-19.
Changesets Validated: in changelogIncludeAllIdentifiers.sql
ID: raw; Author: includeAll
Checks run against each changeset:
Sensitive Info Check| PII and PHI Check (Short names: SSNSensitiveInfoCheck)
Notice that the customized SSNSensitiveInfoCheck flagged two violations and their messages tell you where the violation occurred.
Results
Now you can see how the check runs and what the results look like when it flags an issue within the changelog.
You can then remove, mask, or handle the sensitive information appropriately before committing changes to your database.