
Overview

Wirekite supports Snowflake as a target data warehouse for:
  • Schema Loading - Create target tables from Wirekite’s intermediate schema format
  • Data Loading - Bulk load extracted data via Snowflake internal stage
  • Change Loading (CDC) - Apply ongoing changes using MERGE operations
Snowflake loaders upload data files to Snowflake’s internal stage with PUT commands, then use COPY INTO for high-performance bulk loading.

Prerequisites

Before configuring Snowflake as a Wirekite target, ensure the following requirements are met:

Snowflake Configuration

  1. User Setup: Create a properly configured Snowflake user with appropriate privileges
  2. Warehouse: Ensure a virtual warehouse is available for loading operations
  3. Database & Schema: Create the target database and schema
  4. CLI Tools: Either install the SnowSQL command line tool or use the Wirekite cmdline tool
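The setup steps above can be sketched as a one-time SQL script. All object names (WIREKITE, WIREKITE_ROLE, COMPUTE_WH, MYDB) and the exact grants are illustrative; adapt them to your environment and security policies:

```sql
-- Illustrative one-time setup; object names are examples, not Wirekite requirements.
CREATE ROLE IF NOT EXISTS WIREKITE_ROLE;
CREATE USER IF NOT EXISTS WIREKITE
  PASSWORD = '<snowflake-pwd>'
  DEFAULT_ROLE = WIREKITE_ROLE
  DEFAULT_WAREHOUSE = COMPUTE_WH;
GRANT ROLE WIREKITE_ROLE TO USER WIREKITE;

-- Warehouse for loading operations.
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE WIREKITE_ROLE;

-- Target database and schema.
CREATE DATABASE IF NOT EXISTS MYDB;
CREATE SCHEMA IF NOT EXISTS MYDB.PUBLIC;
GRANT USAGE ON DATABASE MYDB TO ROLE WIREKITE_ROLE;
GRANT USAGE, CREATE TABLE ON SCHEMA MYDB.PUBLIC TO ROLE WIREKITE_ROLE;
```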

Internal Tables

Ensure the Wirekite target metadata tables (wirekite_progress and wirekite_action) exist in your Snowflake database and are read-write accessible. Use the Wirekite cmdline tool to verify connectivity.
For connection string format details, see the Snowflake connection documentation.
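A quick smoke test for the metadata tables is a simple read against each one (assuming they live in the target schema; this verifies existence and read access, not write access):

```sql
-- Both queries should succeed and return a row count if the tables
-- exist and the loader's role can read them.
SELECT COUNT(*) FROM wirekite_progress;
SELECT COUNT(*) FROM wirekite_action;
```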

Schema Loader

The Schema Loader reads Wirekite’s intermediate schema format (.skt file) and generates Snowflake-appropriate DDL statements for creating target tables.
The Schema Loader generates both base tables and merge tables (with $wkm suffix) for CDC operations.
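As an illustration, the generated DDL for a hypothetical ORDERS table pairs the base table with its merge table. The columns here are invented, and the actual merge table may carry additional change-tracking columns:

```sql
-- Base target table (illustrative columns).
CREATE TABLE ORDERS (
  ORDER_ID    NUMBER(38,0),
  CUSTOMER_ID NUMBER(38,0),
  ORDER_DATE  TIMESTAMP_NTZ
);

-- Merge (shadow) table with the $wkm suffix, used to stage CDC batches.
CREATE TABLE ORDERS$WKM (
  ORDER_ID    NUMBER(38,0),
  CUSTOMER_ID NUMBER(38,0),
  ORDER_DATE  TIMESTAMP_NTZ
);
```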

Required Parameters

schemaFile
string
required
Path to the Wirekite schema file (.skt) generated by the Schema Extractor. Must be an absolute path.
createTableFile
string
required
Output file for CREATE TABLE statements. Includes both base tables and merge tables for CDC operations.
createConstraintFile
string
required
Output file for CHECK constraints. Snowflake does not enforce constraints other than NOT NULL, so you may choose not to load this file.
createForeignKeyFile
string
required
Output file for FOREIGN KEY constraints. Snowflake stores foreign keys as metadata only and does not enforce them, so you may elect not to load this file.
logFile
string
required
Absolute path to the log file for Schema Loader operations.

Optional Parameters

dropTableFile
string
default:"none"
Output file for DROP TABLE IF EXISTS statements. Set to “none” to skip generation.
createRecoveryTablesFile
string
default:"none"
Output file for recovery table creation SQL. Set to “none” to skip.
createMergeTables
boolean
default:"true"
When true, generates merge tables ($wkm suffix) for CDC operations. Set to false if only doing data loads without change capture.

Data Mover

The Data Mover uploads extracted data files to Snowflake’s internal stage using PUT commands for subsequent loading.
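Under the hood this corresponds to Snowflake PUT statements, roughly as follows. The table stage, file name, and options are illustrative; Wirekite may use a different stage layout:

```sql
-- Upload one extracted file to the target table's internal stage.
-- AUTO_COMPRESS = FALSE assumes compression (if any) was handled locally
-- via the gzipFiles option.
PUT file:///opt/wirekite/output/data/ORDERS_0001.dkt @%ORDERS
  AUTO_COMPRESS = FALSE
  PARALLEL = 10;
```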

Required Parameters

dsnFile
string
required
Path to a file containing the Snowflake connection string.
Connection string format (Golang connector):
<snowflake-user>:<snowflake-pwd>@<snowflake-orgname>-<snowflake-acct>/<database>/<schema>?warehouse=<warehouse-name>
Example:
wirekite:secretpass@myorg-myaccount/MYDB/PUBLIC?warehouse=COMPUTE_WH
dataDirectory
string
required
Directory where the Data Extractor wrote its files. These will be copied to the Snowflake staging area.
logFile
string
required
Absolute path to the log file for Data Mover operations.

Optional Parameters

maxThreads
integer
default:"10"
Maximum number of parallel threads for uploading to Snowflake stage.
gzipFiles
boolean
default:"false"
When true, compresses files before uploading and changes the file extension to .dgz.
removeFiles
boolean
default:"false"
When true, deletes local files after successful PUT to stage. Should typically be true in production to save disk space.

Data Loader

The Data Loader reads data files from Snowflake’s internal stage and loads them into target tables using COPY INTO commands.
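Conceptually, each table load is a COPY INTO from that table’s stage, along these lines. The file format options shown are illustrative; the loader controls the real ones:

```sql
-- Bulk load all staged .dkt files for the table in one operation.
COPY INTO ORDERS
  FROM @%ORDERS
  PATTERN = '.*\.dkt'
  FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = ',')
  ON_ERROR = ABORT_STATEMENT;
```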

Required Parameters

dsnFile
string
required
Path to a file containing the Snowflake connection string.
Connection string format (Golang connector):
<snowflake-user>:<snowflake-pwd>@<snowflake-orgname>-<snowflake-acct>/<database>/<schema>?warehouse=<warehouse-name>
schemaFile
string
required
Path to the Wirekite schema file used by Schema Loader. Required for table structure information.
logFile
string
required
Absolute path to the log file for Data Loader operations.

Optional Parameters

maxThreads
integer
default:"5"
Maximum number of parallel COPY threads.
hexEncoding
boolean
default:"false"
Set to true if data was extracted using hex encoding instead of base64.
dataFileExtension
string
default:"dkt"
File extension for data files to process (e.g., “dkt”, “dgz”).
For best performance, set maxThreads equal to the number of CPU cores available on the loader host.

Change Loader

The Change Loader applies ongoing data changes (INSERT, UPDATE, DELETE) to Snowflake tables using MERGE operations with shadow tables.
The Change Loader uses a merging approach that stages intermediate data to Snowflake’s internal stage before executing MERGE statements.
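The shadow-table approach corresponds to a MERGE from the $wkm table into the base table, sketched here for a hypothetical ORDERS table keyed on ORDER_ID. Real statements are generated from the schema file and also handle deletes flagged by the Change Extractor:

```sql
-- Apply a staged batch of changes from the merge table to the base table.
MERGE INTO ORDERS AS t
USING ORDERS$WKM AS s
  ON t.ORDER_ID = s.ORDER_ID
WHEN MATCHED THEN UPDATE SET
  CUSTOMER_ID = s.CUSTOMER_ID,
  ORDER_DATE  = s.ORDER_DATE
WHEN NOT MATCHED THEN INSERT (ORDER_ID, CUSTOMER_ID, ORDER_DATE)
  VALUES (s.ORDER_ID, s.CUSTOMER_ID, s.ORDER_DATE);
```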

Required Parameters

dsnFile
string
required
Path to a file containing the Snowflake connection string.
Connection string format (Golang connector):
<snowflake-user>:<snowflake-pwd>@<snowflake-orgname>-<snowflake-acct>/<database>/<schema>?warehouse=<warehouse-name>
inputDirectory
string
required
Directory where the Change Extractor wrote its files. The Change Loader reads change files from this directory.
workDirectory
string
required
Working directory for intermediate files that are uploaded to Snowflake stage during merge operations.
schemaFile
string
required
Path to the Wirekite schema file for table structure information.
logFile
string
required
Absolute path to the log file for Change Loader operations.

Optional Parameters

maxFilesPerBatch
integer
default:"30"
Maximum number of change files to process in a single batch before executing MERGE operations.
hexEncoding
boolean
default:"false"
Set to true if change data was extracted using hex encoding.
removeFiles
boolean
default:"true"
When true, removes change files from inputDirectory after they are fully processed. Should typically be true in production to save disk space.
The Change Loader should not start until the Data Loader has successfully completed the initial full load.

Orchestrator Configuration

When using the Wirekite Orchestrator, prefix each parameter with its component prefix: mover. for the Data Mover, target.schema. for the Schema Loader, target.data. for the Data Loader, or target.change. for the Change Loader. Example orchestrator configuration for a Snowflake target:
# Main configuration
source=postgres
target=snowflake

# Data mover
mover.dsnFile=/opt/wirekite/config/snowflake.dsn
mover.dataDirectory=/opt/wirekite/output/data
mover.logFile=/var/log/wirekite/data-mover.log
mover.maxThreads=10
mover.removeFiles=true

# Schema loading
target.schema.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.schema.createTableFile=/opt/wirekite/output/schema/create_tables.sql
target.schema.createConstraintFile=/opt/wirekite/output/schema/constraints.sql
target.schema.createForeignKeyFile=/opt/wirekite/output/schema/foreign_keys.sql
target.schema.logFile=/var/log/wirekite/schema-loader.log

# Data loading
target.data.dsnFile=/opt/wirekite/config/snowflake.dsn
target.data.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.data.logFile=/var/log/wirekite/data-loader.log
target.data.maxThreads=8

# Change loading (CDC)
target.change.dsnFile=/opt/wirekite/config/snowflake.dsn
target.change.inputDirectory=/opt/wirekite/output/changes
target.change.workDirectory=/opt/wirekite/work
target.change.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.change.logFile=/var/log/wirekite/change-loader.log
target.change.maxFilesPerBatch=30
For complete Orchestrator documentation, see the Execution Guide.