> ## Documentation Index
> Fetch the complete documentation index at: https://docs.wirekite.io/llms.txt
> Use this file to discover all available pages before exploring further.

# CDC Replication

> How Wirekite captures and applies changes for continuous database replication.

## Overview

Wirekite's Change Data Capture (CDC) replication keeps a target database synchronized with ongoing changes to the source. The system extracts changes (inserts, updates, deletes) from the source database's transaction log, writes them to intermediate `.ckt` files, and applies them to the target using merge tables.

CDC can run independently or combined with a bulk data load. When combined, the orchestrator automatically handles the transition from data migration to continuous replication.

## Handover: Data to CDC Transition

When data and change modes are combined in a single configuration, the orchestrator performs an automatic handover after the bulk data load completes. This captures the exact source position so that change replication begins without missing or duplicating any rows.

The handover is a 4-step process:

<Steps>
  <Step title="Stop Replication">
    Pause replication on the source to freeze the position.
  </Step>

  <Step title="Capture Position">
    Record the current position in the source transaction log.
  </Step>

  <Step title="Start Change Processes">
    Launch the change extractor and loader with the captured position.
  </Step>

  <Step title="Resume Replication">
    Resume replication on the source. The change extractor now captures all new events from the saved position forward.
  </Step>
</Steps>

### Position Capture by Database

Each database type uses a different mechanism to identify the current position in its transaction log:

| Database   | Position Type                 | Capture Method                                                                      |
| ---------- | ----------------------------- | ----------------------------------------------------------------------------------- |
| MySQL      | Binlog file + byte offset     | `SHOW MASTER STATUS` followed by `FLUSH BINARY LOGS`                                |
| PostgreSQL | WAL Log Sequence Number (LSN) | `pg_current_wal_lsn()` plus creation of a logical replication slot                  |
| Oracle     | System Change Number (SCN)    | `SELECT CURRENT_SCN FROM V$DATABASE` followed by `ALTER SYSTEM ARCHIVE LOG CURRENT` |
| SQL Server | Log Sequence Number (LSN)     | `sys.fn_cdc_get_max_lsn()` via CDC, or HADR/transaction log fallback                |

<Note>
  When running data + change mode together, do not specify a starting position in the change source configuration. The orchestrator captures and injects the position automatically during handover.
</Note>

## Change Extraction

The change extractor connects to the source database's transaction log and writes changes to `.ckt` files in the output directory.

### Continuous Replication Loop

The extractor runs in a continuous loop:

1. Wait for new events from the transaction log (with timeout)
2. Process each event (write INSERT, UPDATE, or DELETE to the current `.ckt` file)
3. When the operation count reaches the rotation threshold or a time-based flush triggers, rotate to a new file at the next COMMIT boundary
4. Record the new file in the `wirekite_progress` table
5. Repeat

### File Rotation

The extractor rotates to a new `.ckt` file when either condition is met:

* **Operation count**: PostgreSQL rotates after approximately 150,000 data operations. MySQL rotates based on `flushIntervalSeconds` (time-based) or binlog rotation. Oracle and SQL Server rotate per archive log or LSN range.
* **Time-based flush**: Buffered data is flushed every few seconds to prevent loss on crash

Rotation always occurs at a transaction COMMIT boundary to ensure no partial transactions span files. The rotation process is atomic:

1. Flush buffered data to the temporary file (`N.ckt_`)
2. Close the file
3. Rename `N.ckt_` to `N.ckt` (atomic filesystem operation)
4. Record the file in `wirekite_progress`
5. Open the next file (`N+1.ckt_`)

### exitWhenIdle

<ResponseField name="exitWhenIdle" type="boolean" default="false">
  When true, the change extractor exits after a period of inactivity (controlled by `idleWaitSeconds`). On exit, it creates a `CHANGE.DONE` marker file in the output directory. Useful for one-time sync scenarios where continuous replication is not needed.
</ResponseField>

When `exitWhenIdle` is false (the default), the extractor runs indefinitely until stopped via the [wirekite\_action](/run/operations) table or process termination.

## Transaction Consistency

Wirekite maintains transaction ordering through several mechanisms:

* **Single-threaded extraction**: The change extractor processes the transaction log sequentially, preserving the original commit order
* **File rotation at COMMIT boundaries**: Files never split a transaction, so each `.ckt` file contains only complete transactions
* **Ordered file processing**: The change loader processes files in sequential numeric order (0.ckt, 1.ckt, 2.ckt, ...)
* **Multi-threaded merge**: Within each batch, the change loader can apply changes to multiple tables concurrently using merge threads (controlled by `maxMergeThreads`), while still preserving per-table ordering
* **Atomic merge operations**: Each batch of changes is applied within a single transaction on the target

The `processCommits` parameter controls whether BEGIN (`B`) and COMMIT (`C`) markers are written to `.ckt` files. When enabled, the change loader can use these markers to understand transaction boundaries during application.

## Crash Recovery

CDC replication recovers automatically from crashes using the `wirekite_progress` table.

### Change Extractor Recovery

On restart, the change extractor:

1. Queries `wirekite_progress` for the latest `CE` (Change Extract) record
2. If the record has `finish_time IS NULL`, the last file was incomplete -- resumes from the `mark` position (binlog position, LSN, or SCN)
3. If a temporary file (`.ckt_`) exists, it is either renamed or discarded based on the recovery state
4. Continues writing from the next file number

### Change Loader Recovery

On restart, the change loader:

1. Scans the input directory for `.ckt` files
2. Queries `wirekite_progress` for the last processed `C` (Change Load) record
3. Skips files that have already been fully processed (`finish_time IS NOT NULL`)
4. Resumes from the next unprocessed file

Each batch starts with a clean state, so partial batches from a previous crash are safely discarded and reprocessed.

## Monitoring Replication

### Progress Queries

Check the latest extractor position:

```sql theme={null}
SELECT source, mark, start_time, finish_time
FROM wirekite.wirekite_progress
WHERE operation = 'CE'
ORDER BY start_time DESC
LIMIT 5;
```

Check the latest loader position:

```sql theme={null}
SELECT source, mark, start_time, finish_time
FROM wirekite.wirekite_progress
WHERE operation = 'C'
ORDER BY start_time DESC
LIMIT 5;
```

### Measuring Lag

Replication lag is the difference between the current time and the timestamp of the latest change file processed by the loader:

```sql theme={null}
SELECT TIMESTAMPDIFF(SECOND, MAX(finish_time), NOW()) AS lag_seconds
FROM wirekite.wirekite_progress
WHERE operation = 'C' AND finish_time IS NOT NULL;
```

You can also monitor lag through the [Web Interface](/run/execution/ux), which displays real-time change lag on the replication tab.

### Source-Side Lag Queries

Monitor the source database to understand how far behind extraction is:

**MySQL:**

```sql theme={null}
SHOW MASTER STATUS;
-- Compare current position with extractor's last recorded position
```

**PostgreSQL:**

```sql theme={null}
SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
FROM pg_replication_slots
WHERE slot_name = 'wirekite_replication_slot';
```

## Configuration Reference

Key parameters for CDC replication. For the complete parameter list, see the individual [Source Guides](/sourceguides/mysql) and [Target Guides](/targetguides/snowflake).

### Change Extractor Parameters

| Parameter              | Type    | Default | Description                                                                                                      |
| ---------------------- | ------- | ------- | ---------------------------------------------------------------------------------------------------------------- |
| `exitWhenIdle`         | boolean | false   | Exit after idle timeout instead of running continuously                                                          |
| `idleWaitSeconds`      | integer | varies  | Seconds of inactivity before considering idle. Defaults: MySQL=2, PostgreSQL=3, SQL Server=10, Oracle=30         |
| `flushIntervalSeconds` | integer | 5       | Force flush buffered data every N seconds (MySQL and Oracle only)                                                |
| `eventsPerFlush`       | integer | 5000    | Flush after N events (MySQL and Oracle only). PostgreSQL uses a fixed rotation threshold of \~150,000 operations |
| `processCommits`       | boolean | false   | Include BEGIN/COMMIT markers in .ckt files                                                                       |
| `binlogFile`           | string  | -       | MySQL starting binlog filename (standalone change mode only)                                                     |
| `binlogPosition`       | integer | -       | MySQL starting byte offset (standalone change mode only)                                                         |
| `startScn`             | string  | -       | Oracle starting SCN (standalone change mode only)                                                                |
| `startLsn`             | string  | -       | SQL Server starting LSN (standalone change mode only)                                                            |
| `replicationSlot`      | string  | -       | PostgreSQL replication slot name (required for PostgreSQL CDC)                                                   |
| `lsnBatchSize`         | integer | 1000000 | Number of LSNs per batch (SQL Server only)                                                                       |

### Change Loader Parameters

| Parameter             | Type    | Default       | Description                                                                                                                                                                                                                |
| --------------------- | ------- | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `maxFilesPerBatch`    | integer | 60            | Number of .ckt files per merge batch (30 for Spanner and SingleStore)                                                                                                                                                      |
| `maxMergeThreads`     | integer | 2 x CPU cores | Number of parallel threads for applying merge operations within each batch. Applies to Snowflake, BigQuery, Firebolt, and Databricks targets. MySQL, PostgreSQL, Oracle, and SQL Server targets use a fixed maximum of 10. |
| `removeFiles`         | boolean | false         | Delete .ckt files after successful loading                                                                                                                                                                                 |
| `changeFileExtension` | string  | ckt           | Extension for change files                                                                                                                                                                                                 |
| `workDirectory`       | string  | -             | Working directory for merge operations                                                                                                                                                                                     |
| `doneDirectory`       | string  | -             | Directory to move completed files (alternative to removeFiles)                                                                                                                                                             |
