MongoDB - Wirekite Docs

Overview

Wirekite supports MongoDB as a target database for:

- Schema Loading: Create collections with optional JSON Schema validators
- Data Loading: Bulk load extracted data using InsertMany operations
- Change Loading (CDC): Apply ongoing changes using BulkWrite operations

MongoDB loaders convert relational table data to BSON documents. Primary keys are mapped to MongoDB’s _id field. Composite primary keys are stored as nested BSON documents (e.g., {_id: {col1: val1, col2: val2}}).
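The key mapping described above can be sketched in a few lines of Python. This is only an illustration, not Wirekite's implementation: the function name `row_to_document` is hypothetical, and the choices to drop key columns from the document body and to leave `_id` unset when a table has no primary key are assumptions the source does not confirm.

```python
from typing import Any, Dict, List


def row_to_document(row: Dict[str, Any], primary_key: List[str]) -> Dict[str, Any]:
    """Convert one relational row into a MongoDB-style document (illustrative)."""
    if not primary_key:
        # Assumption: with no primary key, _id is left unset so that
        # MongoDB would generate an ObjectId on insert.
        return dict(row)
    if len(primary_key) == 1:
        # Single-column key: the value itself becomes _id.
        doc: Dict[str, Any] = {"_id": row[primary_key[0]]}
    else:
        # Composite key: _id is a nested document keyed by column name.
        doc = {"_id": {col: row[col] for col in primary_key}}
    # Assumption: key columns are not repeated as ordinary fields.
    doc.update({col: val for col, val in row.items() if col not in primary_key})
    return doc
```

For example, a row `{"a": 1, "b": 2, "c": 3}` with primary key `(a, b)` would become `{"_id": {"a": 1, "b": 2}, "c": 3}` under these assumptions.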

Prerequisites

Before configuring MongoDB as a Wirekite target, ensure the following requirements are met:

Database Configuration

Version: MongoDB 4.x or above
User Permissions: The connection user must have:
- readWrite role on the target database

Limitations

MongoDB does not support FOREIGN KEY or CHECK constraints. These are skipped during schema loading with a warning in the log.

Schema Loader

The Schema Loader reads Wirekite’s intermediate schema format (.skt file) and emits two mongo-shell script files — one with db.<collection>.drop() statements and one with db.createCollection("<schema.table>") statements. The orchestrator runs these scripts against the target during the schema-apply phase; the schema loader itself does not connect to MongoDB.

Collection names follow schema.tablename format (e.g., public.users). When the orchestrator applies the create script, createCollection is a no-op if the collection already exists.
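To make the two output files concrete, here is a minimal Python sketch that produces one drop line and one create line per table. The helper names are hypothetical and the exact text Wirekite emits may differ; in particular, this sketch uses `db.getCollection("<name>").drop()` so that dotted collection names stay unambiguous in the shell, which is an assumption rather than the documented output.

```python
import json
from typing import Iterable, List, Tuple


def collection_name(schema: str, table: str) -> str:
    """Build the schema.tablename collection name, e.g. public.users."""
    return f"{schema}.{table}"


def build_scripts(tables: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Return (drop_lines, create_lines), one shell statement per table."""
    drop_lines: List[str] = []
    create_lines: List[str] = []
    for schema, table in tables:
        # json.dumps yields a safely quoted JavaScript string literal.
        name = json.dumps(collection_name(schema, table))
        drop_lines.append(f"db.getCollection({name}).drop();")
        create_lines.append(f"db.createCollection({name});")
    return drop_lines, create_lines
```

Calling `build_scripts([("public", "users")])` yields the drop line `db.getCollection("public.users").drop();` and the create line `db.createCollection("public.users");`.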

Required Parameters

- schemaFile (string, required): Path to the Wirekite schema file (.skt) generated by the Schema Extractor.

- logFile (string, required): Absolute path to the log file for Schema Loader operations.

- dropTableFile (string, required): Output file for db.<collection>.drop() statements (one line per table). The orchestrator runs this script against the target before re-creating collections, only when re-applying schema.

- createTableFile (string, required): Output file for db.createCollection("<schema.table>") statements (one line per table). The orchestrator runs this script against the target during schema apply.

Data Loader

The Data Loader reads Wirekite’s intermediate data format (.dkt files) and loads documents into MongoDB collections using InsertMany with unordered batches for maximum throughput.

The Data Loader uses a 3-stage pipeline architecture (Scanner, Parsers, Writers) for high-performance parallel loading.
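The general shape of a scanner, parser, writer pipeline can be sketched with queues and threads. This is a simplified illustration, not Wirekite's code: the function `run_pipeline`, its parameters, and the batch size are all hypothetical, the `.dkt` parsing step is left to a caller-supplied function because the format is not described here, and error handling is omitted.

```python
import queue
import threading
from typing import Callable, Dict, Iterable, List

Document = Dict[str, object]
_DONE = object()  # sentinel marking the end of a stage's input


def run_pipeline(
    lines: Iterable[str],
    parse: Callable[[str], Document],
    write_batch: Callable[[List[Document]], None],
    parsers: int = 2,
    writers: int = 2,
    batch_size: int = 1000,
) -> None:
    """Scan lines, parse them in parallel, and write them in batches."""
    raw_q: queue.Queue = queue.Queue(maxsize=10_000)
    doc_q: queue.Queue = queue.Queue(maxsize=10_000)

    def scanner() -> None:
        for line in lines:
            raw_q.put(line)
        for _ in range(parsers):  # one stop signal per parser
            raw_q.put(_DONE)

    def parser() -> None:
        while True:
            item = raw_q.get()
            if item is _DONE:
                break
            doc_q.put(parse(item))

    def writer() -> None:
        batch: List[Document] = []
        while True:
            item = doc_q.get()
            if item is _DONE:
                break
            batch.append(item)
            if len(batch) >= batch_size:
                write_batch(batch)
                batch = []  # start a fresh list; the written one is untouched
        if batch:
            write_batch(batch)  # flush the final partial batch

    scan_t = threading.Thread(target=scanner)
    parse_ts = [threading.Thread(target=parser) for _ in range(parsers)]
    write_ts = [threading.Thread(target=writer) for _ in range(writers)]
    for t in [scan_t, *parse_ts, *write_ts]:
        t.start()
    scan_t.join()
    for t in parse_ts:
        t.join()
    for _ in range(writers):  # all parsers are done; stop the writers
        doc_q.put(_DONE)
    for t in write_ts:
        t.join()
```

With PyMongo, `write_batch` could be something like `lambda batch: collection.insert_many(batch, ordered=False)`, where `ordered=False` requests an unordered batch so that one failing document does not stop the rest.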

Required Parameters

- dsnFile (string, required): Path to a file containing the MongoDB connection string.

- inputDirectory (string, required): Directory containing data files (.dkt) to load.

- schemaFile (string, required): Path to the Wirekite schema file for table structure information.

- logFile (string, required): Absolute path to the log file for Data Loader operations.

Optional Parameters

- maxThreads (integer, default: 10): Maximum number of parallel threads for loading files. Each thread loads one file at a time.

- hexEncoding (boolean, default: false): Set to true if data was extracted using hex encoding instead of base64.
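The effect of this flag can be pictured as a switch between two decoders. The sketch below is an assumption about how an encoded value might be turned back into bytes; the function name `decode_binary` is hypothetical, and which fields Wirekite actually encodes is not stated here.

```python
import base64
import binascii


def decode_binary(value: str, hex_encoding: bool = False) -> bytes:
    """Decode an extracted value as hex if hex_encoding is set, else base64."""
    if hex_encoding:
        return binascii.unhexlify(value)
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(value, validate=True)
```

The loader setting has to match the extractor's: `"deadbeef"` read as hex and `"3q2+7w=="` read as base64 decode to the same four bytes, but each string is wrong or invalid under the other decoder.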

Change Loader

The Change Loader applies ongoing data changes (INSERT, UPDATE, DELETE) to MongoDB collections using BulkWrite operations.

Updates use sparse $set operations, only modifying changed fields. Inserts and replaces use full document upserts.
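The difference between a sparse update and a full-document upsert is easiest to see in the operation documents themselves. The following Python sketch builds operations in the shape accepted by the shell's `bulkWrite` (`updateOne`, `replaceOne`, `deleteOne`). The function `to_bulk_op` and the change-kind labels are hypothetical, and the sketch does not claim to match Wirekite's internal representation.

```python
from typing import Any, Dict, Optional


def to_bulk_op(
    kind: str,
    key: Any,
    full_doc: Optional[Dict[str, Any]] = None,
    changed: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Translate one change record into a bulkWrite-style operation."""
    flt = {"_id": key}
    if kind in ("INSERT", "REPLACE"):
        # Full-document upsert: write the whole document, creating it if absent.
        replacement = {**flt, **(full_doc or {})}
        return {"replaceOne": {"filter": flt, "replacement": replacement, "upsert": True}}
    if kind == "UPDATE":
        # Sparse update: only the changed fields appear under $set.
        return {"updateOne": {"filter": flt, "update": {"$set": dict(changed or {})}}}
    if kind == "DELETE":
        return {"deleteOne": {"filter": flt}}
    raise ValueError(f"unknown change kind: {kind}")
```

An update that changes only `email` on the row with key 7 becomes `{"updateOne": {"filter": {"_id": 7}, "update": {"$set": {"email": ...}}}}`, leaving every other field of the stored document untouched.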

Required Parameters

- dsnFile (string, required): Path to a file containing the MongoDB connection string.

- inputDirectory (string, required): Directory containing change files (.ckt) from the Change Extractor.

- schemaFile (string, required): Path to the Wirekite schema file for table structure information.

- logFile (string, required): Absolute path to the log file for Change Loader operations.

Optional Parameters

- hexEncoding (boolean, default: false): Set to true if change data was extracted using hex encoding.

The Change Loader should not start until the Data Loader has successfully completed the initial full load.

Orchestrator Configuration

When using the Wirekite Orchestrator, prefix target parameters with target.schema., target.data., or target.change., depending on the operation. Example orchestrator configuration for a MongoDB target:

# Main configuration
source=postgres
target=mongodb

# Schema loading
target.schema.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.schema.dropTableFile=/opt/wirekite/output/schema/drop-collections.js
target.schema.createTableFile=/opt/wirekite/output/schema/create-collections.js
target.schema.logFile=/var/log/wirekite/schema-loader.log

# Data loading
target.data.dsnFile=/opt/wirekite/config/mongodb.dsn
target.data.inputDirectory=/opt/wirekite/output/data
target.data.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.data.logFile=/var/log/wirekite/data-loader.log
target.data.maxThreads=8

# Change loading (CDC)
target.change.dsnFile=/opt/wirekite/config/mongodb.dsn
target.change.inputDirectory=/opt/wirekite/output/changes
target.change.schemaFile=/opt/wirekite/output/schema/wirekite_schema.skt
target.change.logFile=/var/log/wirekite/change-loader.log
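To illustrate how the prefixes partition one configuration file into per-loader parameters, here is a small Python sketch. It is not the Orchestrator's parser; the function `split_target_params` is hypothetical and assumes simple `key=value` lines with `#` comments, as in the example above.

```python
from typing import Dict

STAGE_PREFIXES = {
    "schema": "target.schema.",
    "data": "target.data.",
    "change": "target.change.",
}


def split_target_params(text: str) -> Dict[str, Dict[str, str]]:
    """Group target.* settings by stage, stripping the stage prefix."""
    stages: Dict[str, Dict[str, str]] = {stage: {} for stage in STAGE_PREFIXES}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        key = key.strip()
        for stage, prefix in STAGE_PREFIXES.items():
            if key.startswith(prefix):
                stages[stage][key[len(prefix):]] = value.strip()
                break
    return stages


example = """\
# Main configuration
source=postgres
target=mongodb
target.data.maxThreads=8
target.change.dsnFile=/opt/wirekite/config/mongodb.dsn
"""
params = split_target_params(example)
```

Here `params["data"]` holds `{"maxThreads": "8"}` and `params["change"]` holds the `dsnFile` path, while unprefixed keys such as `source` and `target` are ignored by this helper.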

For complete Orchestrator documentation, see the Execution Guide.
