Data Validation

Overview

The TableValidator tool compares data between source and target databases after a migration to verify accuracy. It performs row-by-row comparison using primary key ordering and reports missing rows, extra rows, and value differences.

TableValidator handles database-specific differences automatically, including timestamp normalization, float precision tolerance, trailing space trimming, and NULL/empty string semantics.
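The row-by-row comparison over primary-key-ordered data can be pictured as a merge of two sorted streams. The following is a minimal sketch of that idea, not TableValidator's actual implementation; the function name `compare_sorted` and the `(pk, values)` row shape are assumptions made for illustration.

```python
def compare_sorted(source_rows, target_rows):
    """Merge-compare two lists of (pk, values) tuples, each sorted by pk.

    Returns three lists of primary keys: missing in target,
    extra in target, and present in both but with different values.
    """
    missing, extra, different = [], [], []
    i = j = 0
    while i < len(source_rows) and j < len(target_rows):
        s_pk, s_val = source_rows[i]
        t_pk, t_val = target_rows[j]
        if s_pk == t_pk:
            if s_val != t_val:
                different.append(s_pk)
            i += 1
            j += 1
        elif s_pk < t_pk:
            # Source key has no counterpart in the target.
            missing.append(s_pk)
            i += 1
        else:
            # Target key has no counterpart in the source.
            extra.append(t_pk)
            j += 1
    # Whatever remains on either side is unmatched.
    missing.extend(pk for pk, _ in source_rows[i:])
    extra.extend(pk for pk, _ in target_rows[j:])
    return missing, extra, different


source = [(1, "a"), (2, "b"), (3, "c")]
target = [(1, "a"), (3, "x"), (4, "d")]
missing, extra, different = compare_sorted(source, target)
```

Because both sides are read in the same key order, each row is visited once and neither table has to be held in memory in full.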

Usage

TableValidator takes a configuration file as its only argument:

/opt/wirekite/tablevalidator/tablevalidator /path/to/validation.cfg

The configuration file uses the same key=value format as the orchestrator.
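For scripts that need to read or generate such a file, a key=value reader can be sketched as follows. The handling of blank lines, `#` comments, and surrounding whitespace is an assumption based on the example configurations on this page; `parse_config` is a hypothetical helper, not part of Wirekite.

```python
def parse_config(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {line!r}")
        settings[key.strip()] = value.strip()
    return settings


sample = """
# Source
sourceType=postgres
sourceDsnFile=/opt/wirekite/config/postgres-source.dsn
maxThreads=8
"""
config = parse_config(sample)
```

All values are kept as strings here; converting `maxThreads` or boolean options to typed values is left to the caller.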

Required Parameters

sourceType

string

required

Source database type. Supported values: mysql, postgres, oracle, sqlserver, singlestore.

sourceDsnFile

string

required

Path to a file containing the source database connection string.

targetType

string

required

Target database type. Supported values: mysql, postgres, oracle, sqlserver, snowflake, firebolt, bigquery, databricks, spanner, singlestore.

targetDsnFile

string

required

Path to a file containing the target database connection string.

schemaFile

string

required

Path to the Wirekite schema file (.skt) generated by the Schema Extractor. Used to determine table structures, column types, and primary keys.

tablesFile

string

required

Path to a file listing the tables to validate, one per line in schema.table format.

Optional Parameters

logFile

string

Path to the log output file. If not specified, output is written to stdout.

schemaRename

string

Schema mapping for source-to-target name differences. Format: sourceSchema:targetSchema. Use commas for multiple mappings: schema1:mapped1,schema2:mapped2.
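To illustrate the mapping format, here is a small sketch that parses a schemaRename value and applies it to a schema.table name. Both helper functions are hypothetical, and the behavior for unmapped schemas (passing them through unchanged) is an assumption.

```python
def parse_schema_rename(value):
    """Turn 'schema1:mapped1,schema2:mapped2' into a dict."""
    mapping = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        source, sep, target = pair.partition(":")
        if not sep or not source or not target:
            raise ValueError(f"expected sourceSchema:targetSchema, got: {pair!r}")
        mapping[source.strip()] = target.strip()
    return mapping


def target_table_name(source_table, mapping):
    """Apply the mapping to a 'schema.table' name.

    Assumption: schemas without a mapping keep their original name.
    """
    schema, _, table = source_table.partition(".")
    return f"{mapping.get(schema, schema)}.{table}"


mapping = parse_schema_rename("schema1:mapped1,schema2:mapped2")
renamed = target_table_name("schema1.orders", mapping)
```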

maxThreads

integer

Number of parallel validation threads. Defaults to twice the number of CPU cores.

windowSize

integer

default:"100000"

Number of rows per pagination window. The validator uses keyset pagination to compare large tables in chunks of this size.
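Keyset pagination reads each window by filtering on the last primary key seen, rather than using an OFFSET that grows with every page. The sketch below shows the general technique against an in-memory SQLite table; the table `items`, its columns, and the function `iter_windows` are made up for the example and say nothing about the queries TableValidator itself issues.

```python
import sqlite3


def iter_windows(conn, window_size):
    """Yield lists of rows ordered by primary key, window_size rows at a time."""
    last_id = None
    while True:
        if last_id is None:
            rows = conn.execute(
                "SELECT id, name FROM items ORDER BY id LIMIT ?",
                (window_size,),
            ).fetchall()
        else:
            # Resume strictly after the last key of the previous window.
            rows = conn.execute(
                "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, window_size),
            ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i, f"row{i}") for i in range(1, 8)],
)
windows = list(iter_windows(conn, window_size=3))
```

With seven rows and a window size of 3, the generator yields windows of 3, 3, and 1 rows. Each query can use the primary key index, so the cost per window stays roughly constant regardless of how deep into the table it is.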

stopOnError

boolean

default:"false"

When true, stops validation immediately when any table has differences. When false, continues validating all tables and reports all differences at the end.

emptyEqualsNull

boolean

default:"false"

When true, treats empty strings and NULL values as equivalent. Useful when migrating from Oracle (which treats empty strings as NULL) to other databases.
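The effect of this option on a single value comparison can be sketched as below. This is an illustration of the semantics described above, with `values_equal` as a hypothetical helper rather than a TableValidator function.

```python
def values_equal(source_value, target_value, empty_equals_null=False):
    """Compare two column values, optionally treating '' and None as the same."""
    if empty_equals_null:
        if source_value == "":
            source_value = None
        if target_value == "":
            target_value = None
    return source_value == target_value


strict = values_equal(None, "")
relaxed = values_equal(None, "", empty_equals_null=True)
```

With the default setting, a NULL in an Oracle source and an empty string in the target would be reported as a value difference; with the option enabled, they compare as equal.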

Partial Validation

For large databases, you can validate a sample of rows instead of every row. By default the sample is chosen at random:

samplePercent

integer

default:"0"

Percentage of rows to validate (1-100). When set to 0 (default), all rows are validated.

sampleRows

integer

default:"0"

Maximum number of rows to validate. When set to 0 (default), all rows are validated. If both samplePercent and sampleRows are set, the smaller sample is used.

randomizeWindows

boolean

default:"true"

When true, randomly selects which windows to validate during partial validation. When false, validates windows sequentially from the beginning.
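The interaction of these three settings can be sketched as follows. The rule that the smaller sample wins comes from the description above; the rounding, the way rows are translated into whole windows, and the names `sample_target_rows` and `choose_windows` are assumptions made for illustration.

```python
import math
import random


def sample_target_rows(total_rows, sample_percent=0, sample_rows=0):
    """Number of rows to validate; 0 for both settings means all rows."""
    candidates = []
    if sample_percent > 0:
        candidates.append(math.ceil(total_rows * sample_percent / 100))
    if sample_rows > 0:
        candidates.append(min(sample_rows, total_rows))
    # When both are set, the smaller sample is used.
    return min(candidates) if candidates else total_rows


def choose_windows(total_rows, window_size, target_rows, randomize=True, seed=None):
    """Pick window indexes that together cover roughly target_rows rows."""
    window_count = math.ceil(total_rows / window_size)
    needed = min(window_count, math.ceil(target_rows / window_size))
    if randomize:
        # Random subset of windows, returned in table order.
        return sorted(random.Random(seed).sample(range(window_count), needed))
    # Sequential: the first windows from the beginning of the table.
    return list(range(needed))


rows = sample_target_rows(1_000_000, sample_percent=25, sample_rows=100_000)
sequential = choose_windows(1_000_000, 100_000, 250_000, randomize=False)
```

In this sketch, a table of 1,000,000 rows with samplePercent=25 and sampleRows=100000 is limited to 100,000 rows, because the row cap is smaller than 25% of the table.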

Example Configuration

# Validation configuration

# Source
sourceType=postgres
sourceDsnFile=/opt/wirekite/config/postgres-source.dsn

# Target
targetType=snowflake
targetDsnFile=/opt/wirekite/config/snowflake-target.dsn

# Schema and tables
schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
tablesFile=/opt/wirekite/config/tables.txt

# Options
logFile=/var/log/wirekite/validation.log
maxThreads=8
stopOnError=false
emptyEqualsNull=false

Partial Validation Example

# Validate 25% of rows randomly
sourceType=oracle
sourceDsnFile=/opt/wirekite/config/oracle-source.dsn
targetType=bigquery
targetDsnFile=/opt/wirekite/config/bigquery-target.dsn
schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
tablesFile=/opt/wirekite/config/tables.txt
logFile=/var/log/wirekite/validation.log
samplePercent=25
randomizeWindows=true

Output

TableValidator reports results per table and provides a summary at the end.

Per-Table Results

For each table, the validator reports:

Source and target row counts
Number of rows missing in target
Number of extra rows in target
Number of rows with value differences
Sample rows for each category (up to 10 examples)

Summary Report

The summary shows totals across all tables:

Total tables validated (passed, failed, errors)
Total rows compared
Total matches, missing, extra, and differences

Exit Codes

Code	Meaning
0	All tables passed validation
1	One or more tables have differences or errors
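These exit codes make it straightforward to gate a pipeline step on the validation result. A minimal wrapper sketch, assuming the binary path shown in the Usage section; `validation_passed` and `main` are hypothetical names.

```python
import subprocess
import sys

VALIDATOR = "/opt/wirekite/tablevalidator/tablevalidator"


def validation_passed(command):
    """Run a command and report whether it exited with code 0."""
    return subprocess.run(command).returncode == 0


def main(config_path):
    """Run TableValidator on a config file and mirror its pass/fail status."""
    if validation_passed([VALIDATOR, config_path]):
        print("validation passed")
        return 0
    print("validation reported differences or errors", file=sys.stderr)
    return 1
```

A caller would invoke `sys.exit(main("/path/to/validation.cfg"))`. Since exit code 1 covers both differences and errors, the log output is needed to tell the two apart.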

Comparison Behavior

TableValidator automatically handles common cross-database differences:

Category	Handling
Timestamps	Normalized to UTC for comparison
Floats	Compared with relative tolerance (1e-5)
Decimals	Arbitrary precision comparison for large numbers
CHAR padding	Trailing spaces trimmed
NULL vs empty	Configurable via `emptyEqualsNull`
JSON	Semantic equality (ignores formatting)
UUID	Case-insensitive comparison
Money	Currency symbol and separator normalization
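A few of the rules in the table above can be illustrated with standard-library Python. These are sketches of the general techniques, not TableValidator's code: the helper names are invented, and whether the tool defines relative tolerance exactly as `math.isclose` does is an assumption.

```python
import json
import math


def floats_match(a, b, rel_tol=1e-5):
    """Relative-tolerance comparison, as defined by math.isclose."""
    return math.isclose(a, b, rel_tol=rel_tol)


def chars_match(a, b):
    """Compare CHAR values after trimming trailing spaces."""
    return a.rstrip(" ") == b.rstrip(" ")


def json_match(a, b):
    """Compare JSON documents by parsed value, ignoring formatting and key order."""
    return json.loads(a) == json.loads(b)


def uuids_match(a, b):
    """Case-insensitive UUID comparison."""
    return a.lower() == b.lower()


close = floats_match(1.000001, 1.000002)
padded = chars_match("abc  ", "abc")
same_json = json_match('{"a": 1, "b": [1, 2]}', '{"b":[1,2],"a":1}')
same_uuid = uuids_match(
    "A1B2C3D4-0000-4000-8000-000000000001",
    "a1b2c3d4-0000-4000-8000-000000000001",
)
```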

You can also run validation through the Web Interface under the Validate tab of a migration.
