Implementation

Wirekite Components

Wirekite has three main database entities (schema, data and change) and three components (Extractor, Mover and Loader). The entities are the database objects that Wirekite works on, while the components are the operators that work on the entities. Putting these together gives a sequence in which the components work on the entities to establish a target database that is continuously updated from the source database. The sequence to follow in a real production infrastructure is as follows.

Schema

Schema Extractor

Extract the Schema of your source database. One Time.

Schema Mover

Move the Schema from source to target database. One Time.

Schema Loader

Load the schema on the target database. One Time.

Data

Data Extractor

Extract the Data of your source database. One Time.

Data Mover

Move the Data from source to target database. One Time.

Data Loader

Load the data on the target database. One Time.

Change

Change Extractor

Extract the Changes from your source database. Continuous.

Change Mover

Move the Change files from source to target database. Continuous.

Change Loader

Load the changes on the target database. Continuous.
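The ordering above can be sketched in a few lines of Python. This is only an illustration of the sequence, not Wirekite code: the `Step` class and `build_sequence` function are hypothetical names introduced here, and the sketch assumes only what the list states, namely that schema and data steps run once while change steps run continuously.

```python
from dataclasses import dataclass

# Entities and components in the order the document lists them.
ENTITIES = ("schema", "data", "change")
COMPONENTS = ("extractor", "mover", "loader")


@dataclass(frozen=True)
class Step:
    """One (entity, component) pair in the sequence (hypothetical type)."""
    entity: str
    component: str
    continuous: bool  # True for change steps, False for one-time steps


def build_sequence() -> list[Step]:
    """Return the nine steps in order: schema, then data, then change."""
    return [
        Step(entity, component, continuous=(entity == "change"))
        for entity in ENTITIES
        for component in COMPONENTS
    ]


if __name__ == "__main__":
    for number, step in enumerate(build_sequence(), start=1):
        mode = "continuous" if step.continuous else "one time"
        print(f"{number}. {step.entity} {step.component} ({mode})")
```

Running the script prints the nine steps in the order they should be carried out, with the three change steps marked as continuous.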

Sequential Steps

The following architectural diagram shows how data is migrated to a target database and how a replication pipeline is established from a source to a target database instance.

[Diagram: Wirekite Implementation and Sequence]

Design Points

These are some of the noteworthy points regarding the above diagram.

Wirekite operates using a replica of the customer-facing database instance. This avoids taking down the active production database instance while still providing a quiesced database instance for the initial data extraction. The details of setting up a replica are database vendor-specific.
Wirekite components (Extractor, Mover, and Loader) can and should run in parallel. This minimizes both the time required to load an initial dump and the storage space needed for in-flight data files as they are fetched by the Extractor and loaded by the Loader. The shorter the initial dump/move/load time, the shorter the lag when we switch from initial extraction mode to change data capture mode.
The performance characteristics (latency and throughput) of the connectivity between source and target, as well as of the storage, database instances, and their hosts, should be appropriately scaled to the size and activity of the database.
Change data capture must process change events faster than the source database generates them. For example, if you are generating a 1 GB MySQL or MariaDB binlog every 10 seconds, you need to scale your endpoints and networking so that the contents of this binlog are extracted, moved, and loaded in significantly less than 10 seconds. Otherwise, the target database will lag the source database “forever”. Specifically, the following inequality must hold.

wirekite_extraction_time + wirekite_ship_time + wirekite_load_time < source_database_change_generation_time
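As a rough illustration, the inequality can be checked with a few lines of Python. This is a minimal sketch, not part of Wirekite: the function names are hypothetical, the timings are made-up numbers you would replace with your own measurements, and it assumes, as the inequality does, that the three stages of one batch run back to back.

```python
def pipeline_time(extract_s: float, ship_s: float, load_s: float) -> float:
    """Seconds needed to extract, ship, and load one batch of changes."""
    return extract_s + ship_s + load_s


def keeps_up(extract_s: float, ship_s: float, load_s: float,
             generation_s: float) -> bool:
    """True if one batch is processed faster than the source generates it."""
    return pipeline_time(extract_s, ship_s, load_s) < generation_s


def lag_after(batches: int, extract_s: float, ship_s: float, load_s: float,
              generation_s: float) -> float:
    """Accumulated lag in seconds after the given number of batches.

    Each batch that takes longer to process than to generate adds the
    difference to the lag; a pipeline that keeps up accumulates none.
    """
    shortfall = pipeline_time(extract_s, ship_s, load_s) - generation_s
    return max(0.0, shortfall) * batches


if __name__ == "__main__":
    # Hypothetical measurements for a binlog generated every 10 seconds.
    print(keeps_up(2, 3, 4, generation_s=10))        # 9 s per batch
    print(keeps_up(4, 4, 4, generation_s=10))        # 12 s per batch
    print(lag_after(360, 4, 4, 4, generation_s=10))  # one hour of batches
```

With the second set of numbers, each 10-second batch takes 12 seconds to process, so the target falls a further 2 seconds behind per batch and never catches up unless the source slows down.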
