varcharDataIntegrity

varcharDataIntegrity is a custom policy check that flags numeric characters in data inserted into or updated in VARCHAR columns.

Notes:

Only basic INSERT or UPDATE statements are supported.
Inserting multiple rows within the same INSERT is not supported.
LoadData change types are not supported by checks.

Learn how to create and customize the varcharDataIntegrity Liquibase Custom Policy Check using a Python script.

This example works for relational databases. You can use this check as it is or customize it further to fit your needs in your SQL database.

For a conceptual overview of this feature, see Liquibase Pro Custom Policy Checks.

Scope	Database
database	Relational

Before you begin

Liquibase 4.29.0+
Python 3.10.14+
Configure a valid Liquibase Pro license key
Create a Check Settings file
Ensure the Liquibase Checks extension is installed. In Liquibase 4.31.0+, it is already installed in the /liquibase/internal/lib directory, so no action is needed.
If the checks JAR is not installed, download liquibase-checks-<version>.jar and put it in the liquibase/lib directory.
- Maven users only:
  Add this dependency to your pom.xml file:
  <dependency>
    <groupId>org.liquibase.ext</groupId>
    <artifactId>liquibase-checks</artifactId>
    <version>2.0.0</version>
  </dependency>
Java Development Kit 17+ (available for OpenJDK and Oracle JDK)
Linux, macOS, or Windows operating system

Procedure

These steps describe how to create the Custom Policy Check. It does not exist by default in Liquibase Pro.

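Save this Python script to a file. This example stores it at scripts/varchar-data-integrity.py, the path entered for SCRIPT_PATH during customization: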

varcharDataIntegrity Python script

###
### This script checks for numeric characters in VARCHAR columns
###
### Notes:
### 1. Only basic INSERT or UPDATE statements are supported
### 2. Inserting multiple rows within same INSERT is not supported
### 3. LoadData change types are not supported by checks
###

###
### Helpers come from Liquibase
###
import liquibase_utilities
import shlex
import sys

###
### Functions
###
def check_data(string_data):
    """Returns True if data is valid."""
    return not any(char.isdigit() for char in string_data)

def find_snapshot_object(object_list, type, key, value):
    """Returns a snapshot object given a key (e.g., name) and attribute."""
    for object in object_list:
        if object[type][key].lower() == value.lower():
            return object
    return None

def parse_parameters(string_data, whitespace=","):
    """Returns a list containing the string separated by whitespace characters."""
    lex = shlex.shlex(string_data, posix=True)
    lex.whitespace += whitespace
    return [data for data in list(lex)]

###
### main
###

###
### Retrieve log handler
### Ex. liquibase_logger.info(message)
###
liquibase_logger = liquibase_utilities.get_logger()

###
### Retrieve status handler
###
liquibase_status = liquibase_utilities.get_status()

###
### Retrieve JSON snapshot
###
liquibase_snapshot = liquibase_utilities.get_snapshot()

###
### Exit if column or table data is missing
###
if not all(key in liquibase_snapshot["snapshot"]["objects"]
           for key in ("liquibase.structure.core.Column", "liquibase.structure.core.Table")):
    liquibase_status.fired = False
    liquibase_logger.warning("Column or Table data missing from snapshot. Check skipped.")
    sys.exit(1)

###
### Retrieve columns and tables from snapshot
###
all_columns = liquibase_snapshot["snapshot"]["objects"]["liquibase.structure.core.Column"]
all_tables = liquibase_snapshot["snapshot"]["objects"]["liquibase.structure.core.Table"]

###
### Retrieve all changes in changeset
###
changes = liquibase_utilities.get_changeset().getChanges()

###
### Loop through all changes
###
for change in changes:
    ###
    ### LoadData change types are not currently supported
    ###
    if "loaddatachange" in change.getClass().getSimpleName().lower():
        liquibase_logger.info("LoadData change type not supported. Statement skipped.")
        continue
    ###
    ### Retrieve sql as string, remove extra whitespace
    ###
    raw_sql = liquibase_utilities.strip_comments(liquibase_utilities.generate_sql(change)).casefold()
    raw_sql = " ".join(raw_sql.split())
    ###
    ### Split sql into statements
    ###
    raw_statements = liquibase_utilities.split_statements(raw_sql)
    for raw_statement in raw_statements:
        column_dict = {}
        data_list = []
        ###
        ### Split raw_statement into list
        ###
        sql_list = raw_statement.split()
        try:
            command_name = sql_list[0]
            if command_name == "insert":
                table_name = sql_list[2]
            elif command_name == "update":
                table_name = sql_list[1]
            else:
                raise UserWarning
        except IndexError:
            liquibase_logger.warning(f"Unsupported Insert/Update statement skipped: {raw_statement}")
            continue
        except UserWarning:
            liquibase_logger.info(f"Non Insert/Update statement skipped: {raw_statement}")
            continue
        ###
        ### Remove schema if provided, locate table
        ###
        table_name = table_name.split(".")[-1]
        table_object = find_snapshot_object(all_tables, "table", "name", table_name)
        if table_object is None:
            liquibase_logger.warning(f"Table \"{table_name}\" not found in snapshot. Statement skipped.")
            continue
        ###
        ### INSERT
        ###
        if command_name == "insert":
            search_string = f"{table_name} ("
            start = raw_statement.find(search_string)
            ###
            ### INSERT INTO TABLE VALUES(value1, value2, ...)
            ###
            if start == -1:
                column_list_ids = [column_id.replace("liquibase.structure.core.Column#", "") for column_id in table_object["table"]["columns"]]
                for column_id in column_list_ids:
                    column_object = find_snapshot_object(all_columns, "column", "snapshotId", column_id)
                    if column_object is not None:
                        column_dict[column_object["column"]["name"]] = column_object["column"]["type"]["typeName"].lower()
            ###
            ### INSERT INTO TABLE(column1, column2, ...) VALUES(value1, value2, ...)
            ###
            else:
                start += len(search_string)
                end = raw_statement.find(")", start)
                if end != -1:
                    column_list_names = parse_parameters(raw_statement[start:end])
                    for column_name in column_list_names:
                        column_object = find_snapshot_object(all_columns, "column", "name", column_name)
                        if column_object is not None:
                            column_dict[column_object["column"]["name"]] = column_object["column"]["type"]["typeName"].lower()
            ###
            ### Process data
            ###
            search_string = "values ("
            start = raw_statement.rfind(search_string)
            if start != -1:
                start += len(search_string)
                end = raw_statement.rfind(")")
                if end != -1:
                    data_list = parse_parameters(raw_statement[start:end])
        ###
        ### UPDATE
        ###
        else:
            search_string = "set "
            start = raw_statement.find(search_string)
            ###
            ### UPDATE TABLE SET column1 = value1, column2 = value2, ...
            ###
            if start != -1:
                start += len(search_string)
                end = raw_statement.rfind("where")
                if end == -1:
                    end = None
                combined_data = parse_parameters(raw_statement[start:end], ",=")
                for index in range(len(combined_data)):
                    if index % 2 == 0:
                        column_object = find_snapshot_object(all_columns, "column", "name", combined_data[index])
                        if column_object is not None:
                            column_dict[column_object["column"]["name"]] = column_object["column"]["type"]["typeName"].lower()
                    else:
                        data_list.append(combined_data[index])
        ###
        ### Continue to next statement if columns are empty or column/data counts don't match
        ###
        if len(column_dict) == 0 or len(column_dict) != len(data_list):
            liquibase_logger.warning("Column/data count mismatch. Statement skipped.")
            continue
        ###
        ### Merge columns and data
        ###
        merged_data = {}
        for (key, value), data in zip(column_dict.items(), data_list):
            merged_data[key] = {
                "data": data,
                "type": value
            }
        ###
        ### Check for numeric characters in varchar columns
        ###
        for key in merged_data:
            if "varchar" in merged_data[key]["type"]:
                if not check_data(merged_data[key]["data"]):
                    liquibase_status.fired = True
                    status_message = str(liquibase_utilities.get_script_message()).replace("__COLUMN_NAME__", f"\"{key}\"")
                    liquibase_status.message = status_message
                    sys.exit(1)

###
### Default return code
###
False
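
To see how the script pairs columns with values, the parsing steps can be tried outside Liquibase. The sketch below copies check_data and parse_parameters from the script and runs them on a hypothetical INSERT statement. The table name, column names, and data are invented for illustration, and every column is assumed to be a VARCHAR because no snapshot is available here.

```python
import shlex

def check_data(string_data):
    """Returns True if the data contains no digits."""
    return not any(char.isdigit() for char in string_data)

def parse_parameters(string_data, whitespace=","):
    """Splits a string on whitespace plus the extra separator characters."""
    lex = shlex.shlex(string_data, posix=True)
    lex.whitespace += whitespace
    return list(lex)

# Hypothetical statement, already lowercased as the script does with casefold()
statement = "insert into person (first_name, city) values ('john', 'area 51')"

# Column names sit between "<table> (" and the next ")"
search_string = "person ("
start = statement.find(search_string) + len(search_string)
columns = parse_parameters(statement[start:statement.find(")", start)])

# Values sit between the last "values (" and the last ")"
search_string = "values ("
start = statement.rfind(search_string) + len(search_string)
values = parse_parameters(statement[start:statement.rfind(")")])

# Assumption: all columns are VARCHAR, so every value is checked
violations = [column for column, value in zip(columns, values) if not check_data(value)]

print(columns)     # ['first_name', 'city']
print(values)      # ['john', 'area 51']
print(violations)  # ['city']
```

Because shlex runs in POSIX mode, quoted values keep their inner spaces and lose their quotes, which is why 'area 51' arrives as a single value.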

Initiate the customization process

In the CLI, run this command:

liquibase checks customize --check-name=CustomCheckTemplate

The CLI prompts you to finish configuring your file. A message displays:

This check cannot be customized directly because one or more fields does not have a default value.

Liquibase will then create a copy of CustomCheckTemplate and initiate the customization workflow.

Give your check a short name so you can easily identify which Python script it is associated with.

Use up to 64 alpha-numeric characters only.

In this example, we will name the check:

varcharDataIntegrity
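
If you generate check names in tooling, you could pre-validate them before running the CLI. This is a minimal sketch of the naming rule stated above. It assumes "alpha-numeric" means ASCII letters and digits, which this page does not spell out, and is_valid_check_name is a hypothetical helper, not part of Liquibase.

```python
import re

# Assumption: ASCII letters and digits only, 1 to 64 characters
NAME_PATTERN = re.compile(r"[A-Za-z0-9]{1,64}")

def is_valid_check_name(name):
    """Hypothetical helper: True if the name fits the stated length and character rule."""
    return NAME_PATTERN.fullmatch(name) is not None

print(is_valid_check_name("varcharDataIntegrity"))    # True
print(is_valid_check_name("varchar-data-integrity"))  # False
```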

Set the Severity to return a code of 0-4 when triggered.

These severity codes allow you to determine whether the job moves forward or stops when this check triggers. Learn more here: Use Policy Checks in Automation: Severity and Exit Code.

Options: 'INFO'=0, 'MINOR'=1, 'MAJOR'=2, 'CRITICAL'=3, 'BLOCKER'=4
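
As an illustration of how an automation job might act on these codes, the sketch below maps the severity names to the exit codes listed above and applies a threshold. The threshold policy and the should_stop_pipeline helper are assumptions about your own pipeline, not Liquibase behavior.

```python
# Severity names and exit codes as listed in the customization prompt
SEVERITY_EXIT_CODES = {"INFO": 0, "MINOR": 1, "MAJOR": 2, "CRITICAL": 3, "BLOCKER": 4}

def should_stop_pipeline(exit_code, threshold="MAJOR"):
    """Hypothetical helper: stop when the exit code meets or exceeds the threshold severity."""
    return exit_code >= SEVERITY_EXIT_CODES[threshold]

print(should_stop_pipeline(SEVERITY_EXIT_CODES["MINOR"]))    # False
print(should_stop_pipeline(SEVERITY_EXIT_CODES["BLOCKER"]))  # True
```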

Set the SCRIPT_DESCRIPTION

In this example, we will set the description to:

This script checks for numeric characters in VARCHAR columns.

Set the SCRIPT_SCOPE

In this example, we will set the scope to:

database: Use this scope if your check looks for the presence of keys, indexes, or table name patterns in your database schema, including Liquibase Tracking Tables. With this value, the check runs once for each database object.

Set the SCRIPT_MESSAGE

This message will display when the check is triggered. In this example we will use:

Inserting numeric data into column __COLUMN_NAME__ is not allowed. Resolve this data before proceeding.
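
The script fills in the __COLUMN_NAME__ placeholder with the offending column name wrapped in double quotes. The sketch below reproduces that substitution in isolation. render_message is a hypothetical wrapper around the same replace() call the script uses, and the column name is invented.

```python
SCRIPT_MESSAGE = (
    "Inserting numeric data into column __COLUMN_NAME__ is not allowed. "
    "Resolve this data before proceeding."
)

def render_message(template, column_name):
    """Substitutes the double-quoted column name for the placeholder, as the script does."""
    return template.replace("__COLUMN_NAME__", f"\"{column_name}\"")

print(render_message(SCRIPT_MESSAGE, "city"))
# Inserting numeric data into column "city" is not allowed. Resolve this data before proceeding.
```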

Set the SCRIPT_PATH

This is the path to your script, relative to the changelog specified in --changelog-file, whether the script is stored locally or in a repository.

In this example, we will set the path to:

scripts/varchar-data-integrity.py

This check does not require a SCRIPT_ARGUMENT, so leave this blank.

Set the REQUIRES_SNAPSHOT

If your script scope is changelog, set whether the check requires a database snapshot. Specify true if your check needs to inspect database objects.

If your script scope is database, Liquibase always takes a snapshot, so this prompt does not appear.

Note: The larger your database, the more performance impact a snapshot causes. If you cannot run a snapshot due to memory limitations, see Memory Limits of Inspecting Large Schemas.