Skip to main content

Overview

Wirekite’s Change Data Capture (CDC) replication keeps a target database synchronized with ongoing changes to the source. The system extracts changes (inserts, updates, deletes) from the source database’s transaction log, writes them to intermediate .ckt files, and applies them to the target using merge tables. CDC can run independently or combined with a bulk data load. When combined, the orchestrator automatically handles the transition from data migration to continuous replication.

Handover: Data to CDC Transition

When data and change modes are combined in a single configuration, the orchestrator performs an automatic handover after the bulk data load completes. This captures the exact source position so that change replication begins without missing or duplicating any rows. The handover is a 4-step process:
1

Stop Replication

Pause replication on the source to freeze the position.
2

Capture Position

Record the current position in the source transaction log.
3

Start Change Processes

Launch the change extractor and loader with the captured position.
4

Resume Replication

Resume replication on the source. The change extractor now captures all new events from the saved position forward.

Position Capture by Database

Each database type uses a different mechanism to identify the current position in its transaction log:
DatabasePosition TypeCapture Method
MySQLBinlog file + byte offsetSHOW MASTER STATUS followed by FLUSH BINARY LOGS
PostgreSQLWAL Log Sequence Number (LSN)pg_current_wal_lsn() plus creation of a logical replication slot
OracleSystem Change Number (SCN)SELECT CURRENT_SCN FROM V$DATABASE followed by ALTER SYSTEM ARCHIVE LOG CURRENT
SQL ServerLog Sequence Number (LSN)sys.fn_cdc_get_max_lsn() via CDC, or HADR/transaction log fallback
When running data + change mode together, do not specify a starting position in the change source configuration. The orchestrator captures and injects the position automatically during handover.

Change Extraction

The change extractor connects to the source database’s transaction log and writes changes to .ckt files in the output directory.

Continuous Replication Loop

The extractor runs in a continuous loop:
  1. Wait for new events from the transaction log (with timeout)
  2. Process each event (write INSERT, UPDATE, or DELETE to the current .ckt file)
  3. When the operation count reaches the rotation threshold or a time-based flush triggers, rotate to a new file at the next COMMIT boundary
  4. Record the new file in the wirekite_progress table
  5. Repeat

File Rotation

The extractor rotates to a new .ckt file when either condition is met:
  • Operation count: After approximately 150,000 data operations (inserts, updates, deletes)
  • Time-based flush: Buffered data is flushed every few seconds to prevent loss on crash
Rotation always occurs at a transaction COMMIT boundary to ensure no partial transactions span files. The rotation process is atomic:
  1. Flush buffered data to the temporary file (N.ckt_)
  2. Close the file
  3. Rename N.ckt_ to N.ckt (atomic filesystem operation)
  4. Record the file in wirekite_progress
  5. Open the next file (N+1.ckt_)

exitWhenIdle

exitWhenIdle
boolean
default:"false"
When true, the change extractor exits after a period of inactivity (controlled by idleWaitSeconds). On exit, it creates a CHANGE.DONE marker file in the output directory. Useful for one-time sync scenarios where continuous replication is not needed.
When exitWhenIdle is false (the default), the extractor runs indefinitely until stopped via the wirekite_action table or process termination.

Transaction Consistency

Wirekite maintains transaction ordering through several mechanisms:
  • Single-threaded extraction: The change extractor processes the transaction log sequentially, preserving the original commit order
  • File rotation at COMMIT boundaries: Files never split a transaction, so each .ckt file contains only complete transactions
  • Ordered file processing: The change loader processes files in sequential numeric order (0.ckt, 1.ckt, 2.ckt, …)
  • Atomic merge operations: Each batch of changes is applied within a single transaction on the target
The processCommits parameter controls whether BEGIN (B) and COMMIT (C) markers are written to .ckt files. When enabled, the change loader can use these markers to understand transaction boundaries during application.

Crash Recovery

CDC replication recovers automatically from crashes using the wirekite_progress table.

Change Extractor Recovery

On restart, the change extractor:
  1. Queries wirekite_progress for the latest CE (Change Extract) record
  2. If the record has finish_time IS NULL, the last file was incomplete — resumes from the mark position (binlog position, LSN, or SCN)
  3. If a temporary file (.ckt_) exists, it is either renamed or discarded based on the recovery state
  4. Continues writing from the next file number

Change Loader Recovery

On restart, the change loader:
  1. Scans the input directory for .ckt files
  2. Queries wirekite_progress for the last processed CL (Change Load) record
  3. Skips files that have already been fully processed (finish_time IS NOT NULL)
  4. Resumes from the next unprocessed file
Each batch starts with a clean state, so partial batches from a previous crash are safely discarded and reprocessed.

Monitoring Replication

Progress Queries

Check the latest extractor position:
SELECT source, mark, start_time, finish_time
FROM wirekite.wirekite_progress
WHERE operation = 'CE'
ORDER BY start_time DESC
LIMIT 5;
Check the latest loader position:
SELECT source, mark, start_time, finish_time
FROM wirekite.wirekite_progress
WHERE operation = 'CL'
ORDER BY start_time DESC
LIMIT 5;

Measuring Lag

Replication lag is the difference between the current time and the timestamp of the latest change file processed by the loader:
SELECT TIMESTAMPDIFF(SECOND, MAX(finish_time), NOW()) AS lag_seconds
FROM wirekite.wirekite_progress
WHERE operation = 'CL' AND finish_time IS NOT NULL;
You can also monitor lag through the Web Interface, which displays real-time change lag on the replication tab.

Source-Side Lag Queries

Monitor the source database to understand how far behind extraction is: MySQL:
SHOW MASTER STATUS;
-- Compare current position with extractor's last recorded position
PostgreSQL:
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = 'wirekite_replication_slot';

Configuration Reference

Key parameters for CDC replication. For the complete parameter list, see the individual Source Guides and Target Guides.

Change Extractor Parameters

ParameterTypeDefaultDescription
exitWhenIdlebooleanfalseExit after idle timeout instead of running continuously
idleWaitSecondsinteger2-3Seconds of inactivity before considering idle
flushIntervalSecondsinteger5Force flush buffered data every N seconds
eventsPerFlushinteger5000Flush after N events (MySQL)
processCommitsbooleanfalseInclude BEGIN/COMMIT markers in .ckt files
binlogFilestring-MySQL starting binlog filename (standalone change mode only)
binlogPositioninteger-MySQL starting byte offset (standalone change mode only)
startScnstring-Oracle starting SCN (standalone change mode only)
startLsnstring-SQL Server starting LSN (standalone change mode only)
replicationSlotstringwirekite_replication_slotPostgreSQL replication slot name

Change Loader Parameters

ParameterTypeDefaultDescription
maxFilesPerBatchinteger30Number of .ckt files per merge batch
maxThreadsinteger5Parallel threads for change loading
removeFilesbooleanfalseDelete .ckt files after successful loading
changeFileExtensionstringcktExtension for change files
workDirectorystring-Working directory for merge operations
doneDirectorystring-Directory to move completed files (alternative to removeFiles)