## Overview

Wirekite’s Change Data Capture (CDC) replication keeps a target database synchronized with ongoing changes to the source. The system extracts changes (inserts, updates, deletes) from the source database’s transaction log, writes them to intermediate `.ckt` files, and applies them to the target using merge tables.
CDC can run independently or combined with a bulk data load. When combined, the orchestrator automatically handles the transition from data migration to continuous replication.
## Handover: Data to CDC Transition

When data and change modes are combined in a single configuration, the orchestrator performs an automatic handover after the bulk data load completes. This captures the exact source position so that change replication begins without missing or duplicating any rows. The handover is a four-step process.

### Position Capture by Database

Each database type uses a different mechanism to identify the current position in its transaction log:

| Database | Position Type | Capture Method |
|---|---|---|
| MySQL | Binlog file + byte offset | `SHOW MASTER STATUS` followed by `FLUSH BINARY LOGS` |
| PostgreSQL | WAL Log Sequence Number (LSN) | `pg_current_wal_lsn()` plus creation of a logical replication slot |
| Oracle | System Change Number (SCN) | `SELECT CURRENT_SCN FROM V$DATABASE` followed by `ALTER SYSTEM ARCHIVE LOG CURRENT` |
| SQL Server | Log Sequence Number (LSN) | `sys.fn_cdc_get_max_lsn()` via CDC, or HADR/transaction log fallback |
When running data + change mode together, do not specify a starting position in the change source configuration. The orchestrator captures and injects the position automatically during handover.
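The handover sequence can be sketched as follows. Every function name, key name, and position value here is hypothetical, not Wirekite’s actual API; the point is only the ordering of the four steps.

```python
# Hypothetical sketch of the four-step handover; names and values are
# illustrative, not Wirekite's actual API.
def capture_position(db_type):
    """Pretend to capture a source position (e.g. via SHOW MASTER STATUS)."""
    samples = {
        "mysql": {"binlogFile": "binlog.000042", "binlogPosition": 157},
        "postgresql": {"lsn": "0/1A2B3C4D"},
    }
    return samples[db_type]

def handover(db_type, change_config, run_bulk_load):
    run_bulk_load()                           # 1. bulk data load completes
    position = capture_position(db_type)      # 2. capture the exact source position
    injected = {**change_config, **position}  # 3. inject it into the change source config
    return injected                           # 4. the change extractor starts from here

cfg = handover("mysql", {"source": "mysql-prod"}, run_bulk_load=lambda: None)
print(cfg["binlogFile"])  # binlog.000042
```

Note that the user-supplied change configuration contains no starting position; the orchestrator merges it in, which is why the note above says not to set one manually.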
## Change Extraction

The change extractor connects to the source database’s transaction log and writes changes to `.ckt` files in the output directory.
### Continuous Replication Loop

The extractor runs in a continuous loop:
- Wait for new events from the transaction log (with timeout)
- Process each event (write INSERT, UPDATE, or DELETE to the current `.ckt` file)
- When the operation count reaches the rotation threshold or a time-based flush triggers, rotate to a new file at the next COMMIT boundary
- Record the new file in the `wirekite_progress` table
- Repeat
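The loop above can be sketched over a fake event stream. The real extractor reads the transaction log and uses much larger thresholds; all names here are illustrative, and the key detail is that rotation is deferred until a COMMIT boundary.

```python
# Minimal sketch of the extraction loop; rotation waits for a COMMIT.
ROTATION_THRESHOLD = 2  # real thresholds are far larger (e.g. ~150,000 ops)

def run_extractor(events):
    files, current, ops = [], [], 0
    rotate_pending = False
    for ev in events:                  # "wait for new events" (simulated)
        current.append(ev)             # write the event to the current file
        if ev != "COMMIT":
            ops += 1
        if ops >= ROTATION_THRESHOLD:
            rotate_pending = True      # threshold hit: rotate at next COMMIT
        if ev == "COMMIT" and rotate_pending:
            files.append(current)      # rotate; record file in progress table
            current, ops, rotate_pending = [], 0, False
    if current:
        files.append(current)
    return files

files = run_extractor(
    ["INSERT", "UPDATE", "COMMIT", "INSERT", "DELETE", "INSERT", "COMMIT"]
)
print(len(files))  # 2
```

Because rotation only happens on COMMIT, every emitted file ends on a transaction boundary, which is the consistency property described later in this guide.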
### File Rotation

The extractor rotates to a new `.ckt` file when either condition is met:
- Operation count: PostgreSQL rotates after approximately 150,000 data operations. MySQL rotates based on `flushIntervalSeconds` (time-based) or binlog rotation. Oracle and SQL Server rotate per archive log or LSN range.
- Time-based flush: Buffered data is flushed every few seconds to prevent loss on crash

Rotation itself happens at the next COMMIT boundary:
- Flush buffered data to the temporary file (`N.ckt_`)
- Close the file
- Rename `N.ckt_` to `N.ckt` (atomic filesystem operation)
- Record the file in `wirekite_progress`
- Open the next file (`N+1.ckt_`)
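The atomic rename step can be sketched with standard filesystem calls. This is an illustration of the temporary-suffix pattern, not Wirekite’s implementation; `os.replace` is atomic on both POSIX and Windows as long as source and destination are on the same filesystem.

```python
# Sketch of the atomic rotation step using a temporary ".ckt_" suffix.
import os
import tempfile

def rotate(directory, n, payload):
    tmp = os.path.join(directory, f"{n}.ckt_")
    final = os.path.join(directory, f"{n}.ckt")
    with open(tmp, "wb") as f:
        f.write(payload)          # flush buffered data to N.ckt_
        f.flush()
        os.fsync(f.fileno())      # ensure bytes hit disk before the rename
    os.replace(tmp, final)        # atomic: readers see N.ckt_ or N.ckt, never half a file
    return final

with tempfile.TemporaryDirectory() as d:
    path = rotate(d, 0, b"INSERT ...\n")
    print(os.path.basename(path))  # 0.ckt
```

The suffix convention means a consumer can safely ignore anything ending in `.ckt_`: a bare `.ckt` file is, by construction, complete.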
### exitWhenIdle

When true, the change extractor exits after a period of inactivity (controlled by `idleWaitSeconds`). On exit, it creates a CHANGE.DONE marker file in the output directory. This is useful for one-time sync scenarios where continuous replication is not needed. When `exitWhenIdle` is false (the default), the extractor runs indefinitely until stopped via the `wirekite_action` table or process termination.
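The idle-exit behavior can be sketched as a polling loop with an idle timer. Apart from the `exitWhenIdle`/`idleWaitSeconds` parameters and the CHANGE.DONE marker, every name here is illustrative, and the timings are shrunk so the sketch runs instantly.

```python
# Sketch of exitWhenIdle: poll for events, reset an idle timer on activity,
# write a CHANGE.DONE marker and exit once idle for idleWaitSeconds.
import os
import tempfile
import time

def run(out_dir, poll, exit_when_idle=True, idle_wait_seconds=0.05):
    idle_since = time.monotonic()
    while True:
        event = poll()                    # returns None when nothing new
        if event is not None:
            idle_since = time.monotonic()  # activity resets the idle timer
        elif exit_when_idle and time.monotonic() - idle_since > idle_wait_seconds:
            # Idle long enough: write the marker and stop.
            open(os.path.join(out_dir, "CHANGE.DONE"), "w").close()
            return
        else:
            time.sleep(0.01)              # brief wait before polling again

events = iter(["INSERT", "UPDATE"])
with tempfile.TemporaryDirectory() as d:
    run(d, lambda: next(events, None))
    print(os.path.exists(os.path.join(d, "CHANGE.DONE")))  # True
```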
## Transaction Consistency

Wirekite maintains transaction ordering through several mechanisms:
- Single-threaded extraction: The change extractor processes the transaction log sequentially, preserving the original commit order
- File rotation at COMMIT boundaries: Files never split a transaction, so each `.ckt` file contains only complete transactions
- Ordered file processing: The change loader processes files in sequential numeric order (0.ckt, 1.ckt, 2.ckt, …)
- Multi-threaded merge: Within each batch, the change loader can apply changes to multiple tables concurrently using merge threads (controlled by `maxMergeThreads`), while still preserving per-table ordering
- Atomic merge operations: Each batch of changes is applied within a single transaction on the target

The `processCommits` parameter controls whether BEGIN (B) and COMMIT (C) markers are written to `.ckt` files. When enabled, the change loader can use these markers to understand transaction boundaries during application.
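The ordered-file-processing guarantee above depends on numeric rather than lexicographic ordering: `10.ckt` must sort after `9.ckt`. A sketch of that ordering (the function name is illustrative):

```python
# Sort .ckt files by their numeric prefix; a plain string sort would put
# "10.ckt" before "9.ckt". In-progress ".ckt_" temp files are skipped.
def ordered_ckt_files(names):
    ckt = [n for n in names if n.endswith(".ckt")]
    return sorted(ckt, key=lambda n: int(n.split(".")[0]))

files = ["10.ckt", "2.ckt", "0.ckt", "1.ckt_", "9.ckt"]
print(ordered_ckt_files(files))  # ['0.ckt', '2.ckt', '9.ckt', '10.ckt']
```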
## Crash Recovery

CDC replication recovers automatically from crashes using the `wirekite_progress` table.
### Change Extractor Recovery

On restart, the change extractor:
- Queries `wirekite_progress` for the latest CE (Change Extract) record
- If the record has `finish_time IS NULL`, the last file was incomplete; it resumes from the `mark` position (binlog position, LSN, or SCN)
- If a temporary file (`.ckt_`) exists, it is either renamed or discarded based on the recovery state
- Continues writing from the next file number
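The restart decision can be sketched with `wirekite_progress` rows simulated as dicts. Only the CE record type and the `finish_time`/`mark` columns come from this guide; the field names `type` and `file_no` and the return values are illustrative.

```python
# Sketch of the extractor's restart logic over simulated progress rows.
def recovery_action(progress_rows):
    ce = [r for r in progress_rows if r["type"] == "CE"]
    if not ce:
        return ("start_fresh", None)
    latest = max(ce, key=lambda r: r["file_no"])
    if latest["finish_time"] is None:
        # Incomplete file: resume from the recorded transaction-log mark.
        return ("resume_from_mark", latest["mark"])
    # Last file completed cleanly: continue with the next file number.
    return ("next_file", latest["file_no"] + 1)

rows = [
    {"type": "CE", "file_no": 0, "finish_time": "2024-01-01T00:00:05", "mark": "binlog.000042:157"},
    {"type": "CE", "file_no": 1, "finish_time": None, "mark": "binlog.000042:9981"},
]
print(recovery_action(rows))  # ('resume_from_mark', 'binlog.000042:9981')
```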
### Change Loader Recovery

On restart, the change loader:
- Scans the input directory for `.ckt` files
- Queries `wirekite_progress` for the last processed C (Change Load) record
- Skips files that have already been fully processed (`finish_time IS NOT NULL`)
- Resumes from the next unprocessed file
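The loader side can be sketched the same way: skip files whose progress record carries a `finish_time`, and resume at the first unprocessed one. Only the C record type and `finish_time` come from this guide; other field names are illustrative.

```python
# Sketch of loader restart over simulated directory contents and rows.
def files_to_load(directory_files, progress_rows):
    done = {
        r["file_no"]
        for r in progress_rows
        if r["type"] == "C" and r["finish_time"] is not None
    }
    pending = [
        f for f in directory_files
        if f.endswith(".ckt") and int(f.split(".")[0]) not in done
    ]
    return sorted(pending, key=lambda f: int(f.split(".")[0]))

rows = [
    {"type": "C", "file_no": 0, "finish_time": "2024-01-01T00:00:09"},
    {"type": "C", "file_no": 1, "finish_time": None},  # crashed mid-load
]
print(files_to_load(["0.ckt", "1.ckt", "2.ckt"], rows))  # ['1.ckt', '2.ckt']
```

A file with a progress record but no `finish_time` was interrupted mid-load, so it is reprocessed; the atomic merge described above makes that safe.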
## Monitoring Replication

### Progress Queries

Check the latest extractor position by querying the `wirekite_progress` table for the most recent CE (Change Extract) record.

### Measuring Lag
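Lag, as defined just below, can be computed from the loader’s progress records. A self-contained sketch with rows simulated in memory; only the `finish_time` column and the C record type come from this guide, the rest is illustrative:

```python
# Sketch: lag = now - timestamp of the newest change file the loader
# has finished processing.
from datetime import datetime, timezone

def replication_lag_seconds(progress_rows, now):
    loaded = [
        datetime.fromisoformat(r["finish_time"])
        for r in progress_rows
        if r["type"] == "C" and r["finish_time"] is not None
    ]
    if not loaded:
        return None                      # nothing loaded yet
    return (now - max(loaded)).total_seconds()

now = datetime(2024, 1, 1, 0, 1, 0, tzinfo=timezone.utc)
rows = [
    {"type": "C", "finish_time": "2024-01-01T00:00:30+00:00"},
    {"type": "C", "finish_time": None},
]
print(replication_lag_seconds(rows, now))  # 30.0
```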
Replication lag is the difference between the current time and the timestamp of the latest change file processed by the loader.

### Source-Side Lag Queries
Monitor the source database to understand how far behind extraction is. For MySQL, compare the extractor’s last recorded binlog position with the server’s current position from `SHOW MASTER STATUS`.

## Configuration Reference
Key parameters for CDC replication. For the complete parameter list, see the individual Source Guides and Target Guides.

### Change Extractor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `exitWhenIdle` | boolean | false | Exit after idle timeout instead of running continuously |
| `idleWaitSeconds` | integer | varies | Seconds of inactivity before considering idle. Defaults: MySQL=2, PostgreSQL=3, SQL Server=10, Oracle=30 |
| `flushIntervalSeconds` | integer | 5 | Force flush buffered data every N seconds (MySQL and Oracle only) |
| `eventsPerFlush` | integer | 5000 | Flush after N events (MySQL and Oracle only). PostgreSQL uses a fixed rotation threshold of ~150,000 operations |
| `processCommits` | boolean | false | Include BEGIN/COMMIT markers in .ckt files |
| `binlogFile` | string | - | MySQL starting binlog filename (standalone change mode only) |
| `binlogPosition` | integer | - | MySQL starting byte offset (standalone change mode only) |
| `startScn` | string | - | Oracle starting SCN (standalone change mode only) |
| `startLsn` | string | - | SQL Server starting LSN (standalone change mode only) |
| `replicationSlot` | string | - | PostgreSQL replication slot name (required for PostgreSQL CDC) |
| `lsnBatchSize` | integer | 1000000 | Number of LSNs per batch (SQL Server only) |
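For orientation, a hypothetical extractor fragment using parameters from this table. The parameter names are real; the file format and nesting are assumptions, not Wirekite’s documented configuration layout:

```yaml
# Illustrative only: key nesting is an assumption.
change:
  exitWhenIdle: true        # one-time sync: exit after the idle timeout
  idleWaitSeconds: 10
  processCommits: true      # write BEGIN/COMMIT markers into .ckt files
  replicationSlot: wk_slot  # PostgreSQL source
```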
### Change Loader Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxFilesPerBatch` | integer | 60 | Number of .ckt files per merge batch (30 for Spanner and SingleStore) |
| `maxMergeThreads` | integer | 2 x CPU cores | Number of parallel threads for applying merge operations within each batch. Applies to Snowflake, BigQuery, Firebolt, and Databricks targets. MySQL, PostgreSQL, Oracle, and SQL Server targets use a fixed maximum of 10. |
| `removeFiles` | boolean | false | Delete .ckt files after successful loading |
| `changeFileExtension` | string | ckt | Extension for change files |
| `workDirectory` | string | - | Working directory for merge operations |
| `doneDirectory` | string | - | Directory to move completed files (alternative to removeFiles) |
