## Overview

Wirekite’s Change Data Capture (CDC) replication keeps a target database synchronized with ongoing changes to the source. The system extracts changes (inserts, updates, deletes) from the source database’s transaction log, writes them to intermediate `.ckt` files, and applies them to the target using merge tables.
CDC can run independently or combined with a bulk data load. When combined, the orchestrator automatically handles the transition from data migration to continuous replication.
## Handover: Data to CDC Transition

When data and change modes are combined in a single configuration, the orchestrator performs an automatic handover after the bulk data load completes. This captures the exact source position so that change replication begins without missing or duplicating any rows. The handover is a four-step process.

### Position Capture by Database

Each database type uses a different mechanism to identify the current position in its transaction log:

| Database | Position Type | Capture Method |
|---|---|---|
| MySQL | Binlog file + byte offset | `SHOW MASTER STATUS` followed by `FLUSH BINARY LOGS` |
| PostgreSQL | WAL Log Sequence Number (LSN) | `pg_current_wal_lsn()` plus creation of a logical replication slot |
| Oracle | System Change Number (SCN) | `SELECT CURRENT_SCN FROM V$DATABASE` followed by `ALTER SYSTEM ARCHIVE LOG CURRENT` |
| SQL Server | Log Sequence Number (LSN) | `sys.fn_cdc_get_max_lsn()` via CDC, or HADR/transaction log fallback |
When running data + change mode together, do not specify a starting position in the change source configuration. The orchestrator captures and injects the position automatically during handover.
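The handover sequence can be sketched as follows. Every function name, key name, and position value here is hypothetical, not Wirekite’s actual API; the point is only the ordering of the four steps.

```python
# Hypothetical sketch of the four-step handover; names and values are
# illustrative, not Wirekite's actual API.
def capture_position(db_type):
    """Pretend to capture a source position (e.g. via SHOW MASTER STATUS)."""
    samples = {
        "mysql": {"binlogFile": "binlog.000042", "binlogPosition": 157},
        "postgresql": {"lsn": "0/1A2B3C4D"},
    }
    return samples[db_type]

def handover(db_type, change_config, run_bulk_load):
    run_bulk_load()                           # 1. bulk data load completes
    position = capture_position(db_type)      # 2. capture the exact source position
    injected = {**change_config, **position}  # 3. inject it into the change source config
    return injected                           # 4. the change extractor starts from here

cfg = handover("mysql", {"source": "mysql-prod"}, run_bulk_load=lambda: None)
print(cfg["binlogFile"])  # binlog.000042
```

Note that the user-supplied change configuration contains no starting position; the orchestrator merges it in, which is why the note above says not to set one manually.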
## Change Extraction

The change extractor connects to the source database’s transaction log and writes changes to `.ckt` files in the output directory.
### Continuous Replication Loop

The extractor runs in a continuous loop:
- Wait for new events from the transaction log (with timeout)
- Process each event (write INSERT, UPDATE, or DELETE to the current `.ckt` file)
- When the operation count reaches the rotation threshold or a time-based flush triggers, rotate to a new file at the next COMMIT boundary
- Record the new file in the `wirekite_progress` table
- Repeat
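The loop above can be sketched over a fake event stream. The real extractor reads the transaction log and uses much larger thresholds; all names here are illustrative, and the key detail is that rotation is deferred until a COMMIT boundary.

```python
# Minimal sketch of the extraction loop; rotation waits for a COMMIT.
ROTATION_THRESHOLD = 2  # real thresholds are far larger (e.g. ~150,000 ops)

def run_extractor(events):
    files, current, ops = [], [], 0
    rotate_pending = False
    for ev in events:                  # "wait for new events" (simulated)
        current.append(ev)             # write the event to the current file
        if ev != "COMMIT":
            ops += 1
        if ops >= ROTATION_THRESHOLD:
            rotate_pending = True      # threshold hit: rotate at next COMMIT
        if ev == "COMMIT" and rotate_pending:
            files.append(current)      # rotate; record file in progress table
            current, ops, rotate_pending = [], 0, False
    if current:
        files.append(current)
    return files

files = run_extractor(
    ["INSERT", "UPDATE", "COMMIT", "INSERT", "DELETE", "INSERT", "COMMIT"]
)
print(len(files))  # 2
```

Because rotation only happens on COMMIT, every emitted file ends on a transaction boundary, which is the consistency property described later in this guide.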
### File Rotation

The extractor rotates to a new `.ckt` file when either condition is met:
- Operation count: PostgreSQL rotates after approximately 150,000 data operations. MySQL rotates based on `flushIntervalSeconds` (time-based) or binlog rotation. Oracle and SQL Server rotate per archive log or LSN range.
- Time-based flush: Buffered data is flushed every few seconds to prevent loss on crash

Rotation itself happens at the next COMMIT boundary:
- Flush buffered data to the temporary file (`N.ckt_`)
- Close the file
- Rename `N.ckt_` to `N.ckt` (atomic filesystem operation)
- Record the file in `wirekite_progress`
- Open the next file (`N+1.ckt_`)
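The atomic rename step can be sketched with standard filesystem calls. This is an illustration of the temporary-suffix pattern, not Wirekite’s implementation; `os.replace` is atomic on both POSIX and Windows as long as source and destination are on the same filesystem.

```python
# Sketch of the atomic rotation step using a temporary ".ckt_" suffix.
import os
import tempfile

def rotate(directory, n, payload):
    tmp = os.path.join(directory, f"{n}.ckt_")
    final = os.path.join(directory, f"{n}.ckt")
    with open(tmp, "wb") as f:
        f.write(payload)          # flush buffered data to N.ckt_
        f.flush()
        os.fsync(f.fileno())      # ensure bytes hit disk before the rename
    os.replace(tmp, final)        # atomic: readers see N.ckt_ or N.ckt, never half a file
    return final

with tempfile.TemporaryDirectory() as d:
    path = rotate(d, 0, b"INSERT ...\n")
    print(os.path.basename(path))  # 0.ckt
```

The suffix convention means a consumer can safely ignore anything ending in `.ckt_`: a bare `.ckt` file is, by construction, complete.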
### exitWhenIdle

When true, the change extractor exits after a period of inactivity (controlled by `idleWaitSeconds`). On exit, it creates a CHANGE.DONE marker file in the output directory. This is useful for one-time sync scenarios where continuous replication is not needed. When `exitWhenIdle` is false (the default), the extractor runs indefinitely until stopped via the `wirekite_action` table or process termination.
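The idle-exit behavior can be sketched as a polling loop with an idle timer. Apart from the `exitWhenIdle`/`idleWaitSeconds` parameters and the CHANGE.DONE marker, every name here is illustrative, and the timings are shrunk so the sketch runs instantly.

```python
# Sketch of exitWhenIdle: poll for events, reset an idle timer on activity,
# write a CHANGE.DONE marker and exit once idle for idleWaitSeconds.
import os
import tempfile
import time

def run(out_dir, poll, exit_when_idle=True, idle_wait_seconds=0.05):
    idle_since = time.monotonic()
    while True:
        event = poll()                    # returns None when nothing new
        if event is not None:
            idle_since = time.monotonic()  # activity resets the idle timer
        elif exit_when_idle and time.monotonic() - idle_since > idle_wait_seconds:
            # Idle long enough: write the marker and stop.
            open(os.path.join(out_dir, "CHANGE.DONE"), "w").close()
            return
        else:
            time.sleep(0.01)              # brief wait before polling again

events = iter(["INSERT", "UPDATE"])
with tempfile.TemporaryDirectory() as d:
    run(d, lambda: next(events, None))
    print(os.path.exists(os.path.join(d, "CHANGE.DONE")))  # True
```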
## Transaction Consistency

Wirekite maintains transaction ordering through several mechanisms:
- Single-threaded extraction: The change extractor processes the transaction log sequentially, preserving the original commit order
- File rotation at COMMIT boundaries: Files never split a transaction, so each `.ckt` file contains only complete transactions
- Ordered file processing: The change loader processes files in sequential numeric order (0.ckt, 1.ckt, 2.ckt, …)
- Multi-threaded merge: Within each batch, the change loader can apply changes to multiple tables concurrently using merge threads (controlled by `maxMergeThreads`), while still preserving per-table ordering
- Atomic merge operations: Each batch of changes is applied within a single transaction on the target

The `processCommits` parameter controls whether BEGIN (B) and COMMIT (C) markers are written to `.ckt` files. When enabled, the change loader can use these markers to understand transaction boundaries during application.
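The ordered-file-processing guarantee above depends on numeric rather than lexicographic ordering: `10.ckt` must sort after `9.ckt`. A sketch of that ordering (the function name is illustrative):

```python
# Sort .ckt files by their numeric prefix; a plain string sort would put
# "10.ckt" before "9.ckt". In-progress ".ckt_" temp files are skipped.
def ordered_ckt_files(names):
    ckt = [n for n in names if n.endswith(".ckt")]
    return sorted(ckt, key=lambda n: int(n.split(".")[0]))

files = ["10.ckt", "2.ckt", "0.ckt", "1.ckt_", "9.ckt"]
print(ordered_ckt_files(files))  # ['0.ckt', '2.ckt', '9.ckt', '10.ckt']
```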
## Crash Recovery

CDC replication recovers automatically from crashes using the `wirekite_progress` table.
### Change Extractor Recovery

On restart, the change extractor:
- Queries `wirekite_progress` for the latest CE (Change Extract) record
- If the record has `finish_time IS NULL`, the last file was incomplete; it resumes from the `mark` position (binlog position, LSN, or SCN)
- If a temporary file (`.ckt_`) exists, it is either renamed or discarded based on the recovery state
- Continues writing from the next file number
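The restart decision can be sketched with `wirekite_progress` rows simulated as dicts. Only the CE record type and the `finish_time`/`mark` columns come from this guide; the field names `type` and `file_no` and the return values are illustrative.

```python
# Sketch of the extractor's restart logic over simulated progress rows.
def recovery_action(progress_rows):
    ce = [r for r in progress_rows if r["type"] == "CE"]
    if not ce:
        return ("start_fresh", None)
    latest = max(ce, key=lambda r: r["file_no"])
    if latest["finish_time"] is None:
        # Incomplete file: resume from the recorded transaction-log mark.
        return ("resume_from_mark", latest["mark"])
    # Last file completed cleanly: continue with the next file number.
    return ("next_file", latest["file_no"] + 1)

rows = [
    {"type": "CE", "file_no": 0, "finish_time": "2024-01-01T00:00:05", "mark": "binlog.000042:157"},
    {"type": "CE", "file_no": 1, "finish_time": None, "mark": "binlog.000042:9981"},
]
print(recovery_action(rows))  # ('resume_from_mark', 'binlog.000042:9981')
```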
### Change Loader Recovery

On restart, the change loader:
- Scans the input directory for `.ckt` files
- Queries `wirekite_progress` for the last processed C (Change Load) record
- Skips files that have already been fully processed (`finish_time IS NOT NULL`)
- Resumes from the next unprocessed file
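The loader side can be sketched the same way: skip files whose progress record carries a `finish_time`, and resume at the first unprocessed one. Only the C record type and `finish_time` come from this guide; other field names are illustrative.

```python
# Sketch of loader restart over simulated directory contents and rows.
def files_to_load(directory_files, progress_rows):
    done = {
        r["file_no"]
        for r in progress_rows
        if r["type"] == "C" and r["finish_time"] is not None
    }
    pending = [
        f for f in directory_files
        if f.endswith(".ckt") and int(f.split(".")[0]) not in done
    ]
    return sorted(pending, key=lambda f: int(f.split(".")[0]))

rows = [
    {"type": "C", "file_no": 0, "finish_time": "2024-01-01T00:00:09"},
    {"type": "C", "file_no": 1, "finish_time": None},  # crashed mid-load
]
print(files_to_load(["0.ckt", "1.ckt", "2.ckt"], rows))  # ['1.ckt', '2.ckt']
```

A file with a progress record but no `finish_time` was interrupted mid-load, so it is reprocessed; the atomic merge described above makes that safe.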
## Monitoring Replication

### Progress Queries

Check the latest extractor position by querying the `wirekite_progress` table for the most recent CE (Change Extract) record.

### Measuring Lag
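Lag, as defined just below, can be computed from the loader’s progress records. A self-contained sketch with rows simulated in memory; only the `finish_time` column and the C record type come from this guide, the rest is illustrative:

```python
# Sketch: lag = now - timestamp of the newest change file the loader
# has finished processing.
from datetime import datetime, timezone

def replication_lag_seconds(progress_rows, now):
    loaded = [
        datetime.fromisoformat(r["finish_time"])
        for r in progress_rows
        if r["type"] == "C" and r["finish_time"] is not None
    ]
    if not loaded:
        return None                      # nothing loaded yet
    return (now - max(loaded)).total_seconds()

now = datetime(2024, 1, 1, 0, 1, 0, tzinfo=timezone.utc)
rows = [
    {"type": "C", "finish_time": "2024-01-01T00:00:30+00:00"},
    {"type": "C", "finish_time": None},
]
print(replication_lag_seconds(rows, now))  # 30.0
```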
Replication lag is the difference between the current time and the timestamp of the latest change file processed by the loader.

### Source-Side Lag Queries
Monitor the source database to understand how far behind extraction is. For MySQL, compare the extractor’s last recorded binlog position with the server’s current position from `SHOW MASTER STATUS`.

## Configuration Reference
Key parameters for CDC replication. For the complete parameter list, see the individual Source Guides and Target Guides.

### Change Extractor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `exitWhenIdle` | boolean | false | Exit after idle timeout instead of running continuously |
| `idleWaitSeconds` | integer | varies | Seconds of inactivity before considering idle. Defaults: MySQL=2, PostgreSQL=3, SQL Server=10, Oracle=30 |
| `flushIntervalSeconds` | integer | 5 | Force flush buffered data every N seconds (MySQL and Oracle only) |
| `eventsPerFlush` | integer | 5000 | Flush after N events (MySQL and Oracle only). PostgreSQL uses a fixed rotation threshold of ~150,000 operations |
| `processCommits` | boolean | false | Include BEGIN/COMMIT markers in .ckt files |
| `binlogFile` | string | - | MySQL starting binlog filename (standalone change mode only) |
| `binlogPosition` | integer | - | MySQL starting byte offset (standalone change mode only) |
| `startScn` | string | - | Oracle starting SCN (standalone change mode only) |
| `startLsn` | string | - | SQL Server starting LSN (standalone change mode only) |
| `replicationSlot` | string | - | PostgreSQL replication slot name (required for PostgreSQL CDC) |
| `lsnBatchSize` | integer | 1000000 | Number of LSNs per batch (SQL Server only) |
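For orientation, a hypothetical extractor fragment using parameters from this table. The parameter names are real; the file format and nesting are assumptions, not Wirekite’s documented configuration layout:

```yaml
# Illustrative only: key nesting is an assumption.
change:
  exitWhenIdle: true        # one-time sync: exit after the idle timeout
  idleWaitSeconds: 10
  processCommits: true      # write BEGIN/COMMIT markers into .ckt files
  replicationSlot: wk_slot  # PostgreSQL source
```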
### Change Loader Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `maxFilesPerBatch` | integer | 60 | Number of .ckt files per merge batch (30 for Spanner and SingleStore) |
| `maxMergeThreads` | integer | 2 x CPU cores | Number of parallel threads for applying merge operations within each batch. Applies to Snowflake, BigQuery, Firebolt, and Databricks targets. MySQL, PostgreSQL, Oracle, and SQL Server targets use a fixed maximum of 10. |
| `removeFiles` | boolean | false | Delete .ckt files after successful loading |
| `changeFileExtension` | string | ckt | Extension for change files |
| `workDirectory` | string | - | Working directory for merge operations |
| `doneDirectory` | string | - | Directory to move completed files (alternative to removeFiles) |
