# Data Validation

> Validate migration accuracy by comparing source and target data row-by-row.

## Overview

The TableValidator tool compares data between source and target databases after a migration to verify accuracy. It performs row-by-row comparison using primary key ordering and reports missing rows, extra rows, and value differences.

<Note>
  TableValidator handles database-specific differences automatically, including timestamp normalization, float precision tolerance, and trailing space trimming. Treating NULL and empty strings as equivalent is opt-in via `emptyEqualsNull`.
</Note>

## Usage

TableValidator takes a configuration file as its only argument:

```bash
/opt/wirekite/tablevalidator/tablevalidator /path/to/validation.cfg
```

The configuration file uses the same `key=value` format as the orchestrator.

## Required Parameters

<ResponseField name="sourceType" type="string" required>
  Source database type. Supported values: `mysql`, `mariadb`, `postgres`, `oracle`, `sqlserver`, `singlestore`. (`mariadb` is accepted as an alias and handled identically to `mysql`.)
</ResponseField>

<ResponseField name="sourceDsnFile" type="string" required>
  Path to a file containing the source database connection string.
</ResponseField>

<ResponseField name="targetType" type="string" required>
  Target database type. Supported values: `mysql`, `mariadb`, `postgres`, `oracle`, `sqlserver`, `snowflake`, `firebolt`, `bigquery`, `databricks`, `spanner`, `singlestore`, `mongodb`. (`mariadb` is accepted as an alias and handled identically to `mysql`.)
</ResponseField>

<ResponseField name="targetDsnFile" type="string" required>
  Path to a file containing the target database connection string.
</ResponseField>

<ResponseField name="schemaFile" type="string" required>
  Path to the Wirekite schema file (`.skt`) generated by the Schema Extractor. Used to determine table structures, column types, and primary keys.
</ResponseField>

<ResponseField name="tablesFile" type="string" required>
  Path to a file listing the tables to validate, one per line in `schema.table` format.
</ResponseField>

## Optional Parameters

<ResponseField name="logFile" type="string">
  Path to the log output file. If not specified, output is written to stdout.
</ResponseField>

<ResponseField name="schemaRename" type="string">
  Schema mapping for source-to-target name differences. Format: `sourceSchema:targetSchema`. Use commas for multiple mappings: `schema1:mapped1,schema2:mapped2`.
</ResponseField>
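
To make the `schemaRename` format concrete, the following Python sketch parses a mapping string into a dictionary. The function name `parse_schema_rename` is hypothetical and the schema names are the placeholders used above; this illustrates the format only and is not TableValidator's own parser.

```python
def parse_schema_rename(value: str) -> dict[str, str]:
    """Parse 'src1:dst1,src2:dst2' into {'src1': 'dst1', 'src2': 'dst2'}."""
    mapping = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        source, sep, target = pair.partition(":")
        if not sep or not source.strip() or not target.strip():
            raise ValueError(f"expected sourceSchema:targetSchema, got {pair!r}")
        mapping[source.strip()] = target.strip()
    return mapping


print(parse_schema_rename("schema1:mapped1,schema2:mapped2"))
# {'schema1': 'mapped1', 'schema2': 'mapped2'}
```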

<ResponseField name="maxThreads" type="integer">
  Number of parallel validation threads. Defaults to 2x the number of CPU cores.
</ResponseField>

<ResponseField name="gcpCredfile" type="string">
  Path to a GCP service account JSON credentials file. Required when validating against Spanner or BigQuery targets.
</ResponseField>

<ResponseField name="windowSize" type="integer" default="100000">
  Number of rows per pagination window. The validator uses keyset pagination to compare large tables in chunks of this size.
</ResponseField>
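
Keyset pagination resumes each window from the last primary key seen instead of using `OFFSET`, so the cost of fetching a window does not grow with its position in the table. The sketch below shows the general technique against an in-memory SQLite table. The table `items`, its columns, and the function `iter_windows` are hypothetical, and the queries TableValidator actually issues may differ.

```python
import sqlite3


def iter_windows(conn, window_size):
    """Yield lists of (id, val) rows in primary key order, at most window_size per list."""
    last_id = None
    while True:
        if last_id is None:
            # First window: start from the lowest key.
            rows = conn.execute(
                "SELECT id, val FROM items ORDER BY id LIMIT ?", (window_size,)
            ).fetchall()
        else:
            # Later windows: resume strictly after the last key already seen.
            rows = conn.execute(
                "SELECT id, val FROM items WHERE id > ? ORDER BY id LIMIT ?",
                (last_id, window_size),
            ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(i, f"v{i}") for i in range(1, 11)])

window_sizes = [len(window) for window in iter_windows(conn, 4)]
print(window_sizes)  # [4, 4, 2]
```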

<ResponseField name="stopOnError" type="boolean" default="false">
  When `true`, stops validation immediately when any table has differences. When `false`, continues validating all tables and reports all differences at the end.
</ResponseField>

<ResponseField name="emptyEqualsNull" type="boolean" default="false">
  When `true`, treats empty strings and NULL values as equivalent. Useful when migrating from Oracle (which treats empty strings as NULL) to other databases.
</ResponseField>

### Partial Validation

For large databases, you can validate a sample of rows instead of every row:

<ResponseField name="samplePercent" type="integer" default="0">
  Percentage of rows to validate (1-100). When set to 0 (default), all rows are validated.
</ResponseField>

<ResponseField name="sampleRows" type="integer" default="0">
  Maximum number of rows to validate. When set to 0 (default), all rows are validated. If both `samplePercent` and `sampleRows` are set, the smaller sample is used.
</ResponseField>

<ResponseField name="randomizeWindows" type="boolean" default="true">
  When `true`, randomly selects which windows to validate during partial validation. When `false`, validates windows sequentially from the beginning.
</ResponseField>
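
The "smaller sample is used" rule for `samplePercent` and `sampleRows` can be sketched as a small calculation. The function `effective_sample_size` is hypothetical, and both the rounding and the idea that the percentage applies to each table's row count are assumptions made for illustration.

```python
def effective_sample_size(total_rows, sample_percent=0, sample_rows=0):
    """Return how many rows would be validated under the 'smaller sample wins' rule."""
    candidates = []
    if sample_percent > 0:
        # Assumption: the percentage applies to the table's row count, rounded down.
        candidates.append(total_rows * sample_percent // 100)
    if sample_rows > 0:
        candidates.append(min(sample_rows, total_rows))
    # With neither option set (both 0), every row is validated.
    return min(candidates) if candidates else total_rows


print(effective_sample_size(1_000_000, sample_percent=25))                       # 250000
print(effective_sample_size(1_000_000, sample_percent=25, sample_rows=100_000))  # 100000
print(effective_sample_size(1_000_000))                                          # 1000000
```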

## Example Configuration

```shellscript
# Validation configuration

# Source
sourceType=postgres
sourceDsnFile=/opt/wirekite/config/postgres-source.dsn

# Target
targetType=snowflake
targetDsnFile=/opt/wirekite/config/snowflake-target.dsn

# Schema and tables
schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
tablesFile=/opt/wirekite/config/tables.txt

# Options
logFile=/var/log/wirekite/validation.log
maxThreads=8
stopOnError=false
emptyEqualsNull=false
```
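
Before starting a long validation run, it can be useful to confirm that a configuration file sets all six required parameters. The Python sketch below is a hypothetical pre-flight check and not part of Wirekite. The helpers `parse_config` and `missing_required` are illustrative names, and the handling of `#` comments and surrounding whitespace is an assumption based on the example above.

```python
REQUIRED_KEYS = (
    "sourceType",
    "sourceDsnFile",
    "targetType",
    "targetDsnFile",
    "schemaFile",
    "tablesFile",
)


def parse_config(text):
    """Parse key=value lines, skipping blank lines and '#' comment lines."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


def missing_required(config):
    """Return the required keys that are absent, in documentation order."""
    return [key for key in REQUIRED_KEYS if key not in config]


sample = """\
# Source
sourceType=postgres
sourceDsnFile=/opt/wirekite/config/postgres-source.dsn
targetType=snowflake
"""
print(missing_required(parse_config(sample)))
# ['targetDsnFile', 'schemaFile', 'tablesFile']
```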

### Partial Validation Example

```shellscript
# Validate 25% of rows randomly
sourceType=oracle
sourceDsnFile=/opt/wirekite/config/oracle-source.dsn
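targetType=bigquery
targetDsnFile=/opt/wirekite/config/bigquery-target.dsn
# Required for BigQuery targets; replace with the path to your credentials file
gcpCredfile=/opt/wirekite/config/gcp-credentials.json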
schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
tablesFile=/opt/wirekite/config/tables.txt
logFile=/var/log/wirekite/validation.log
samplePercent=25
randomizeWindows=true
```

## Output

TableValidator reports results per table and provides a summary at the end.

### Per-Table Results

For each table, the validator reports:

* Source and target row counts
* Number of rows missing in target
* Number of extra rows in target
* Number of rows with value differences
* Sample rows for each category (up to 10 examples)

### Summary Report

The summary shows totals across all tables:

* Total tables validated (passed, failed, errors)
* Total rows compared
* Total matches, missing, extra, and differences

### Exit Codes

| Code | Meaning                                       |
| ---- | --------------------------------------------- |
| 0    | All tables passed validation                  |
| 1    | One or more tables have differences or errors |

## Comparison Behavior

TableValidator automatically handles common cross-database differences:

| Category      | Handling                                         |
| ------------- | ------------------------------------------------ |
| Timestamps    | Normalized to UTC for comparison                 |
| Floats        | Compared with relative tolerance (1e-5)          |
| Decimals      | Arbitrary precision comparison for large numbers |
| CHAR padding  | Trailing spaces trimmed                          |
| NULL vs empty | Configurable via `emptyEqualsNull`               |
| JSON          | Semantic equality (ignores formatting)           |
| UUID          | Case-insensitive comparison                      |
| Money         | Currency symbol and separator normalization      |
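
To give a feel for what several of these rules mean in practice, the sketch below implements rough Python equivalents for floats, CHAR padding, JSON, UUIDs, timestamps, and the `emptyEqualsNull` option. All function names are hypothetical and the logic is an illustration of each rule, not TableValidator's implementation. Decimal and money handling are not covered.

```python
import json
import math
from datetime import datetime, timedelta, timezone


def floats_match(a, b, rel_tol=1e-5):
    # Relative tolerance: the allowed gap scales with the magnitude of the values.
    return math.isclose(a, b, rel_tol=rel_tol)


def char_match(a, b):
    # Ignore trailing spaces added by fixed-width CHAR columns.
    return a.rstrip(" ") == b.rstrip(" ")


def json_match(a, b):
    # Compare parsed values so key order and whitespace do not matter.
    return json.loads(a) == json.loads(b)


def uuid_match(a, b):
    return a.lower() == b.lower()


def timestamps_match(a, b):
    # Assumes both values are timezone-aware datetimes.
    return a.astimezone(timezone.utc) == b.astimezone(timezone.utc)


def null_match(a, b, empty_equals_null=False):
    if empty_equals_null:
        a = None if a == "" else a
        b = None if b == "" else b
    return a == b


print(floats_match(1.0, 1.000001))                     # True
print(floats_match(1.0, 1.001))                        # False
print(char_match("abc  ", "abc"))                      # True
print(json_match('{"a": 1, "b": [1, 2]}', '{"b":[1,2],"a":1}'))  # True
print(null_match(None, ""))                            # False
print(null_match(None, "", empty_equals_null=True))    # True
```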

<Tip>
  You can also run validation through the [Web Interface](/run/execution/ux) under the Validate tab of a migration.
</Tip>
