Orchestrator

The orchestrator is Wirekite's main binary; the executable itself is named wirekite. It is the top-level binary that calls all the other Wirekite binaries to carry out a data movement task. The other binaries can also be run independently, for example for testing.

Orchestrator Configuration File

The orchestrator reads a configuration file that tells it what needs to be done. The configuration file follows these rules:
  1. Configuration variables are written in the format variable=value.
  2. Empty lines are allowed.
  3. Comments are allowed. To write a comment, start the line with #.
  4. Multilevel variables are allowed using the format parent.child=value.
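
The four rules above can be illustrated with a small reader. This is only a sketch of the file format, not the orchestrator's actual parser: the function names parse_config and children are hypothetical, and trimming whitespace, splitting on the first = only, and letting a repeated key overwrite the earlier value are assumptions the rules do not spell out.

```python
def parse_config(text: str) -> dict[str, str]:
    """Parse variable=value lines into a flat dict keyed by the full dotted name."""
    config: dict[str, str] = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Rules 2 and 3: skip empty lines and lines that start with '#'.
        if not line or line.startswith("#"):
            continue
        # Rule 1: split on the first '=' only (assumption: values may contain '=').
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {number}: expected variable=value, got {raw!r}")
        # Assumption: a repeated key overwrites the earlier value.
        config[key.strip()] = value.strip()
    return config


def children(config: dict[str, str], parent: str) -> dict[str, str]:
    """Rule 4: return the variables under a dotted parent, with the prefix removed."""
    prefix = parent + "."
    return {k[len(prefix):]: v for k, v in config.items() if k.startswith(prefix)}


sample = """
# main
home=/opt/wirekite
source=mysql

source.data.dsnFile=/opt/wirekite/config/mysql-source.dsn
source.data.maxThreads=8
"""

config = parse_config(sample)
print(config["source"])
print(children(config, "source.data"))
```

All values are kept as strings here; converting maxThreads to an integer, for instance, would be a separate step.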

Variables

The configuration file has the following main variables and allowed values.
home (string)
home is the Wirekite installation directory.

source (string)
source is the database type from which we are extracting. The possible values are:
  1. mysql
  2. oracle
  3. postgres
  4. sqlserver

target (string)
target is the database type to which we are loading. The possible values are:
  1. snowflake
  2. firebolt
  3. databricks
  4. bigquery
  5. spanner
  6. singlestore
  7. mysql
  8. postgres
  9. oracle
  10. sqlserver

log (string)
log is the path to the orchestrator log file.
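
As an illustration of the allowed values listed above, the following sketch checks the main variables of an already-parsed configuration. The function name check_main_variables is hypothetical, and treating all four main variables as required is an assumption; the text does not say which of them are mandatory.

```python
ALLOWED_SOURCES = {"mysql", "oracle", "postgres", "sqlserver"}
ALLOWED_TARGETS = {
    "snowflake", "firebolt", "databricks", "bigquery", "spanner",
    "singlestore", "mysql", "postgres", "oracle", "sqlserver",
}


def check_main_variables(config: dict[str, str]) -> list[str]:
    """Return a list of problems found in the main variables (empty if none)."""
    problems = []
    # Assumption: home, source, target and log must all be present.
    for name in ("home", "source", "target", "log"):
        if not config.get(name):
            problems.append(f"missing main variable: {name}")
    source = config.get("source")
    if source and source not in ALLOWED_SOURCES:
        problems.append(f"unsupported source: {source}")
    target = config.get("target")
    if target and target not in ALLOWED_TARGETS:
        problems.append(f"unsupported target: {target}")
    return problems


good = {
    "home": "/opt/wirekite",
    "source": "mysql",
    "target": "snowflake",
    "log": "/var/log/wirekite/schema-migration.log",
}
bad = {"home": "/opt/wirekite", "source": "snowflake", "target": "postgres"}

print(check_main_variables(good))
print(check_main_variables(bad))
```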

Modes

The mode of operation is not declared explicitly. Instead, the orchestrator detects the mode from the configuration keys that are present:
  1. If source.schema.* or target.schema.* keys exist, schema mode is enabled.
  2. If source.data.* or target.data.* keys exist, data mode is enabled.
  3. If source.change.* or target.change.* keys exist, change mode is enabled.

Schema mode extracts the schema of the source database. The schema loader then generates SQL files for the target database. These SQL files are not executed automatically on the target database; running them is left as a manual step because you will probably want to make granular changes to the schema properties on the target first. Schema mode must run independently and cannot be combined with data or change modes.

Data mode extracts data from the source database and loads it into the target database. The orchestrator runs the extractor, mover, and loader in parallel.

Change mode extracts changes (Inserts, Updates, Deletes, Begins, Commits) from the source database and applies them to the target database. The orchestrator can run continuously, keeping the target in sync with the source.

Data and change modes can be combined in a single configuration file. When combined, the orchestrator first completes the bulk data load, captures the source position, and then starts continuous change replication from that position.

In data and change modes, Wirekite automatically creates two metadata tables in the target database: wirekite_progress for tracking migration state and crash recovery, and wirekite_action for pause/stop/resume control. See Operations for details on monitoring and controlling running migrations.
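
The key-based mode detection described in this section can be sketched as follows. This is an illustration of the rule rather than the orchestrator's own code; detect_modes is a hypothetical name, and raising an error when schema keys are mixed with data or change keys is an assumption about how that restriction might be enforced.

```python
MODES = ("schema", "data", "change")


def detect_modes(keys) -> set[str]:
    """Return the modes enabled by the presence of source.<mode>.* or target.<mode>.* keys."""
    enabled = set()
    for key in keys:
        parts = key.split(".")
        # Only keys shaped like source.<mode>.<variable> or target.<mode>.<variable> count.
        if len(parts) >= 3 and parts[0] in ("source", "target") and parts[1] in MODES:
            enabled.add(parts[1])
    # Schema mode must run independently of data and change modes.
    if "schema" in enabled and len(enabled) > 1:
        raise ValueError("schema mode cannot be combined with data or change modes")
    return enabled


print(detect_modes(["home", "source", "source.schema.dsnFile"]))
print(sorted(detect_modes(["source.data.dsnFile", "target.change.dsnFile"])))
```

Main variables such as home and source, and mover.* keys, do not influence the result because they carry no mode segment.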

Source and Target Child Variables

Both source and target have child variables that include the mode as part of the hierarchy. The format is source.<mode>.<variable>=value and target.<mode>.<variable>=value. For example, source.data.dsnFile is the connection string file for the data extractor, and target.schema.schemaFile is the input schema file for the schema loader.

The specific variables available depend on the source and target database types. For example, if the source is mysql, the child variables can be any of those listed in mysql variables. If the target is snowflake, the child variables can be any of those listed in snowflake variables.

Mover Variables

The mover is an optional component that transfers files between the source and target. It is configured separately using the mover.<variable>=value format. The mover does not have a mode prefix since the same mover handles both data and change files.
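
To make the two key shapes concrete, the sketch below groups a parsed configuration by the component each key addresses: source.<mode>, target.<mode>, or mover. The function name group_by_component and the grouping itself are illustrative assumptions; how the orchestrator actually hands variables to each binary is not described here.

```python
def group_by_component(config: dict[str, str]) -> dict[str, dict[str, str]]:
    """Group dotted keys by component, stripping the component prefix from each variable."""
    groups: dict[str, dict[str, str]] = {}
    for key, value in config.items():
        parts = key.split(".")
        if parts[0] in ("source", "target") and len(parts) >= 3:
            # source.<mode>.<variable> and target.<mode>.<variable>
            component = f"{parts[0]}.{parts[1]}"
            variable = ".".join(parts[2:])
        elif parts[0] == "mover" and len(parts) >= 2:
            # mover.<variable> has no mode segment
            component = "mover"
            variable = ".".join(parts[1:])
        else:
            # Main variables such as home, source, target and log are skipped.
            continue
        groups.setdefault(component, {})[variable] = value
    return groups


config = {
    "home": "/opt/wirekite",
    "source": "oracle",
    "source.data.maxThreads": "8",
    "target.data.maxThreads": "4",
    "mover.awsRegion": "us-east-1",
    "mover.maxThreads": "4",
}

for component, variables in group_by_component(config).items():
    print(component, variables)
```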

Bringing It All Together

Below are example orchestrator configuration files for each mode.

Schema Mode Example

Schema mode extracts the database schema from the source and generates SQL files for the target.
# Schema migration configuration

# main
home=/opt/wirekite
source=mysql
target=snowflake
log=/var/log/wirekite/schema-migration.log

# schema source
source.schema.tablesFile=/opt/wirekite/config/tables.txt
source.schema.outputDirectory=/opt/wirekite/output/schema
source.schema.dsnFile=/opt/wirekite/config/mysql-source.dsn
source.schema.logFile=/var/log/wirekite/schema-extractor.log

# schema target
target.schema.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.schema.createTableFile=/opt/wirekite/output/sql/create-tables.sql
target.schema.createConstraintFile=/opt/wirekite/output/sql/create-constraints.sql
target.schema.createForeignKeyFile=/opt/wirekite/output/sql/create-foreign-keys.sql
target.schema.dropTableFile=/opt/wirekite/output/sql/drop-tables.sql
target.schema.logFile=/var/log/wirekite/schema-loader.log
target.schema.createMergeTables=true

Data Mode Example

Data mode performs a one-time bulk data transfer from source to target.
# Data migration configuration

# main
home=/opt/wirekite
source=oracle
target=snowflake
log=/var/log/wirekite/data-migration.log

# data source
source.data.dsnFile=/opt/wirekite/config/oracle-source.dsn
source.data.tablesFile=/opt/wirekite/config/tables.txt
source.data.outputDirectory=/opt/wirekite/data/extract
source.data.logFile=/var/log/wirekite/data-extractor.log
source.data.maxThreads=8
source.data.maxRowsPerDump=50000

# data target
target.data.dsnFile=/opt/wirekite/config/snowflake-target.dsn
target.data.dataDirectory=/opt/wirekite/data/stage
target.data.doneDirectory=/opt/wirekite/data/done
target.data.logFile=/var/log/wirekite/data-loader.log
target.data.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.data.dataFileExtension=dkt
target.data.maxThreads=4

# mover
mover.awsRegion=us-east-1
mover.awsBucket=wirekite-migration-bucket
mover.dataDirectory=/opt/wirekite/data/extract
mover.logFile=/var/log/wirekite/data-mover.log
mover.maxThreads=4
mover.gzipFiles=false
mover.removeFiles=false

Change Mode Example

Change mode captures ongoing changes (CDC) and applies them to the target. When running change mode on its own, you must specify the starting position in the form your source database type uses (binlogFile and binlogPosition in the mysql example below).
# Change replication configuration

# main
home=/opt/wirekite
source=mysql
target=postgres
log=/var/log/wirekite/change-replication.log

# change source
source.change.dsnFile=/opt/wirekite/config/mysql-source.dsn
source.change.tablesFile=/opt/wirekite/config/tables.txt
source.change.outputDirectory=/opt/wirekite/changes/extract
source.change.logFile=/var/log/wirekite/change-extractor.log
source.change.binlogFile=mysql-bin.000042
source.change.binlogPosition=154
source.change.exitWhenIdle=false

# change target
target.change.dsnFile=/opt/wirekite/config/postgres-target.dsn
target.change.dataDirectory=/opt/wirekite/changes/stage
target.change.workDirectory=/opt/wirekite/changes/work
target.change.logFile=/var/log/wirekite/change-loader.log
target.change.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.change.changeFileExtension=ckt
target.change.doneDirectory=/opt/wirekite/changes/done
target.change.removeFiles=true
target.change.maxFilesPerBatch=30

Data + Change Mode Example

Data and change modes can be combined in a single configuration file. The orchestrator first completes the bulk data load, captures the source position automatically, and then starts continuous change replication. Do not specify a starting position for the change source; the orchestrator captures it automatically during the data phase.
# Full migration configuration (data + change)

# main
home=/opt/wirekite
source=sqlserver
target=snowflake
log=/var/log/wirekite/full-migration.log

# data source
source.data.dsnFile=/opt/wirekite/config/sqlserver-source.dsn
source.data.tablesFile=/opt/wirekite/config/tables.txt
source.data.outputDirectory=/opt/wirekite/data/extract
source.data.logFile=/var/log/wirekite/data-extractor.log
source.data.maxThreads=8
source.data.maxRowsPerDump=100000

# data target
target.data.dsnFile=/opt/wirekite/config/snowflake-target.dsn
target.data.dataDirectory=/opt/wirekite/data/stage
target.data.doneDirectory=/opt/wirekite/data/done
target.data.logFile=/var/log/wirekite/data-loader.log
target.data.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.data.dataFileExtension=dkt
target.data.maxThreads=6

# change source
source.change.dsnFile=/opt/wirekite/config/sqlserver-source.dsn
source.change.tablesFile=/opt/wirekite/config/tables.txt
source.change.outputDirectory=/opt/wirekite/changes/extract
source.change.logFile=/var/log/wirekite/change-extractor.log
source.change.exitWhenIdle=false

# change target
target.change.dsnFile=/opt/wirekite/config/snowflake-target.dsn
target.change.dataDirectory=/opt/wirekite/changes/stage
target.change.workDirectory=/opt/wirekite/changes/work
target.change.logFile=/var/log/wirekite/change-loader.log
target.change.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.change.changeFileExtension=ckt
target.change.doneDirectory=/opt/wirekite/changes/done
target.change.removeFiles=true
target.change.maxFilesPerBatch=30

# mover
mover.awsRegion=us-west-2
mover.awsBucket=wirekite-migration-bucket
mover.dataDirectory=/opt/wirekite/data/extract
mover.logFile=/var/log/wirekite/mover.log
mover.maxThreads=4
mover.gzipFiles=false
mover.removeFiles=false
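
The two starting-position rules in the examples above (required when change mode runs on its own, omitted when it is combined with data mode) lend themselves to a pre-flight check. The sketch below covers only a mysql source, using the binlogFile and binlogPosition keys from the Change Mode Example; the function name check_start_position is hypothetical, and the equivalent keys for other source types are not listed in this document, so they are left out rather than guessed.

```python
# Assumption: for a mysql source these two keys define the starting position,
# as in the Change Mode Example. Other source types are not checked.
MYSQL_POSITION_KEYS = ("source.change.binlogFile", "source.change.binlogPosition")


def check_start_position(config: dict[str, str]) -> list[str]:
    """Return problems with the change-mode starting position (empty if none)."""
    has_data = any(k.startswith(("source.data.", "target.data.")) for k in config)
    has_change = any(k.startswith(("source.change.", "target.change.")) for k in config)
    if not has_change or config.get("source") != "mysql":
        return []
    if has_data:
        # Combined data + change: the position is captured automatically.
        return [f"remove {k}: position is captured during the data phase"
                for k in MYSQL_POSITION_KEYS if k in config]
    # Change mode on its own: the position must be given explicitly.
    return [f"missing {k}: required when change mode runs on its own"
            for k in MYSQL_POSITION_KEYS if k not in config]


change_only = {
    "source": "mysql",
    "source.change.dsnFile": "/opt/wirekite/config/mysql-source.dsn",
    "source.change.binlogFile": "mysql-bin.000042",
}
print(check_start_position(change_only))
```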