> ## Documentation Index
> Fetch the complete documentation index at: https://docs.wirekite.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Cassandra

> This page explains how Apache Cassandra (CQL) source data types are mapped to each target database. Please visit the [CQL Data Types Documentation](https://cassandra.apache.org/doc/latest/cassandra/developing/cql/types.html) for a primer on the various Cassandra datatypes.

<Note>
  Cassandra is supported **as a source for [Target Sync](/run/targetsync) only** — there is no CDC path. See the [Cassandra Source Guide](/sourceguides/cassandra) for configuration.
</Note>

### Schema Mapping

Wirekite projects a CQL keyspace onto a fundamentally relational target. The structural mapping is:

| CQL concept                        | Target concept                                  |
| ---------------------------------- | ----------------------------------------------- |
| keyspace                           | schema (renamable via `schemaRename`)           |
| table                              | table                                           |
| partition key + clustering columns | composite `PRIMARY KEY` (in declaration order)  |
| regular column                     | nullable column                                 |
| static column                      | regular column (the static-ness is not modeled) |
| `frozen<…>`                        | transparent — the inner type is used            |

### Canonical Type Mapping

Wirekite first converts each CQL type to an internal Wirekite type, then each target's loader maps that to a concrete column type. The source of truth is `cassandra/db_utils.CQLToWirekite`.

| CQL type                                  | Wirekite type | Notes                                                                                |
| ----------------------------------------- | ------------- | ------------------------------------------------------------------------------------ |
| `text`, `varchar`, `ascii`                | `VARCHAR`     | length **synthesized** to 4000 — see [Synthesized Bounds](#synthesized-bounds)       |
| `tinyint`                                 | `INT8`        |                                                                                      |
| `smallint`                                | `INT16`       |                                                                                      |
| `int`                                     | `INT32`       |                                                                                      |
| `bigint`                                  | `INT64`       |                                                                                      |
| `counter`                                 | `INT64`       | read as its current value                                                            |
| `varint`                                  | `NUMERIC`     | arbitrary-precision integer — precision **synthesized** to `(38,0)`                  |
| `decimal`                                 | `NUMERIC`     | precision/scale **synthesized** to `(38,18)`                                         |
| `float`                                   | `FLOAT32`     |                                                                                      |
| `double`                                  | `FLOAT64`     |                                                                                      |
| `boolean`                                 | `BOOL`        |                                                                                      |
| `timestamp`                               | `TIMESTAMP`   | millisecond precision, UTC                                                           |
| `date`                                    | `DATE`        |                                                                                      |
| `time`                                    | `TIME`        | nanosecond-since-midnight rendered `HH:MM:SS.ffffff`                                 |
| `uuid`, `timeuuid`                        | `UUID`        | rendered lowercase for stable ordering                                               |
| `inet`                                    | `VARCHAR`     | IP rendered as text                                                                  |
| `duration`                                | `VARCHAR`     | ISO-8601-ish text                                                                    |
| `blob`                                    | `BLOB`        |                                                                                      |
| `list<…>`, `set<…>`, `map<…>`, `tuple<…>` | `JSON`        | serialized to JSON — see [how collections are encoded](#how-collections-are-encoded) |
| user-defined type (UDT), any unknown      | `JSON`        | serialized to JSON                                                                   |

### Mapping to Targets

The following tables show how each CQL type lands on every supported Target Sync target.

#### Integer Types

| Cassandra  | PostgreSQL | MySQL      | SQL Server | Oracle       | Firebolt  | Snowflake | Spanner | BigQuery | SingleStore | Databricks |
| ---------- | ---------- | ---------- | ---------- | ------------ | --------- | --------- | ------- | -------- | ----------- | ---------- |
| `tinyint`  | `smallint` | `TINYINT`  | `SMALLINT` | `NUMBER(38)` | `INTEGER` | `tinyint` | `INT64` | `INT64`  | `TINYINT`   | `TINYINT`  |
| `smallint` | `smallint` | `SMALLINT` | `SMALLINT` | `NUMBER(38)` | `INTEGER` | `number`  | `INT64` | `INT64`  | `SMALLINT`  | `SMALLINT` |
| `int`      | `integer`  | `INT`      | `INT`      | `NUMBER(38)` | `INTEGER` | `int`     | `INT64` | `INT64`  | `INT`       | `INT`      |
| `bigint`   | `bigint`   | `BIGINT`   | `BIGINT`   | `NUMBER(38)` | `BIGINT`  | `bigint`  | `INT64` | `INT64`  | `BIGINT`    | `BIGINT`   |
| `counter`  | `bigint`   | `BIGINT`   | `BIGINT`   | `NUMBER(38)` | `BIGINT`  | `bigint`  | `INT64` | `INT64`  | `BIGINT`    | `BIGINT`   |

#### Decimal Types

| Cassandra | PostgreSQL       | MySQL            | SQL Server       | Oracle          | Firebolt         | Snowflake       | Spanner   | BigQuery  | SingleStore      | Databricks       |
| --------- | ---------------- | ---------------- | ---------------- | --------------- | ---------------- | --------------- | --------- | --------- | ---------------- | ---------------- |
| `varint`  | `numeric(38,0)`  | `NUMERIC(38,0)`  | `NUMERIC(38,0)`  | `NUMBER(38,0)`  | `NUMERIC(38,0)`  | `number(38,0)`  | `NUMERIC` | `NUMERIC` | `NUMERIC(38,0)`  | `DECIMAL(38,0)`  |
| `decimal` | `numeric(38,18)` | `NUMERIC(38,18)` | `NUMERIC(38,18)` | `NUMBER(38,18)` | `NUMERIC(38,18)` | `number(38,18)` | `NUMERIC` | `NUMERIC` | `NUMERIC(38,18)` | `DECIMAL(38,18)` |

<Note>
  **Spanner and BigQuery use a fixed-scale `NUMERIC`.** Spanner `NUMERIC` is precision/scale `(38,9)` and BigQuery `NUMERIC` caps at `(38,9)`. A Cassandra `decimal` carries up to scale 18, so on these two targets the scale is clamped to 9. The validator compares decimal values **numerically** (not as strings), so padding/clamping on the other targets does not produce false diffs.
</Note>

#### Floating Point Types

| Cassandra | PostgreSQL         | MySQL    | SQL Server | Oracle          | Firebolt | Snowflake | Spanner   | BigQuery  | SingleStore | Databricks |
| --------- | ------------------ | -------- | ---------- | --------------- | -------- | --------- | --------- | --------- | ----------- | ---------- |
| `float`   | `real`             | `FLOAT`  | `REAL`     | `BINARY_FLOAT`  | `DOUBLE` | `float`   | `FLOAT32` | `FLOAT64` | `FLOAT`     | `FLOAT`    |
| `double`  | `double precision` | `DOUBLE` | `FLOAT`    | `BINARY_DOUBLE` | `DOUBLE` | `float`   | `FLOAT64` | `FLOAT64` | `DOUBLE`    | `DOUBLE`   |

#### Boolean Type

| Cassandra | PostgreSQL | MySQL        | SQL Server | Oracle    | Firebolt  | Snowflake | Spanner | BigQuery | SingleStore  | Databricks |
| --------- | ---------- | ------------ | ---------- | --------- | --------- | --------- | ------- | -------- | ------------ | ---------- |
| `boolean` | `boolean`  | `TINYINT(1)` | `BIT`      | `CHAR(1)` | `BOOLEAN` | `boolean` | `BOOL`  | `BOOL`   | `TINYINT(1)` | `BOOLEAN`  |

#### Date and Time Types

| Cassandra   | PostgreSQL                    | MySQL         | SQL Server  | Oracle         | Firebolt    | Snowflake       | Spanner     | BigQuery   | SingleStore   | Databricks      |
| ----------- | ----------------------------- | ------------- | ----------- | -------------- | ----------- | --------------- | ----------- | ---------- | ------------- | --------------- |
| `timestamp` | `timestamp without time zone` | `DATETIME(6)` | `DATETIME2` | `TIMESTAMP`    | `TIMESTAMP` | `timestamp_ntz` | `TIMESTAMP` | `DATETIME` | `DATETIME(6)` | `TIMESTAMP_NTZ` |
| `date`      | `date`                        | `DATE`        | `DATE`      | `DATE`         | `DATE`      | `date`          | `DATE`      | `DATE`     | `DATE`        | `DATE`          |
| `time`      | `time without time zone`      | `TIME(6)`     | `TIME`      | `VARCHAR2(64)` | `TEXT`      | `time`          | `STRING`    | `TIME`     | `TIME(6)`     | `STRING`        |

#### Character, Network & UUID Types

| Cassandra                    | PostgreSQL                | MySQL           | SQL Server         | Oracle           | Firebolt | Snowflake       | Spanner        | BigQuery | SingleStore     | Databricks |
| ---------------------------- | ------------------------- | --------------- | ------------------ | ---------------- | -------- | --------------- | -------------- | -------- | --------------- | ---------- |
| `text` / `varchar` / `ascii` | `character varying(4000)` | `VARCHAR(4000)` | `VARCHAR(4000)`    | `VARCHAR2(4000)` | `TEXT`   | `varchar(4000)` | `STRING(4000)` | `STRING` | `VARCHAR(4000)` | `STRING`   |
| `inet`                       | `character varying(4000)` | `VARCHAR(4000)` | `VARCHAR(4000)`    | `VARCHAR2(4000)` | `TEXT`   | `varchar(4000)` | `STRING(4000)` | `STRING` | `VARCHAR(4000)` | `STRING`   |
| `duration`                   | `character varying(4000)` | `VARCHAR(4000)` | `VARCHAR(4000)`    | `VARCHAR2(4000)` | `TEXT`   | `varchar(4000)` | `STRING(4000)` | `STRING` | `VARCHAR(4000)` | `STRING`   |
| `uuid` / `timeuuid`          | `uuid`                    | `VARCHAR(64)`   | `UNIQUEIDENTIFIER` | `VARCHAR2(128)`  | `TEXT`   | `varchar(64)`   | `STRING`       | `STRING` | `VARCHAR(64)`   | `STRING`   |

#### Binary Type

| Cassandra | PostgreSQL | MySQL  | SQL Server       | Oracle | Firebolt | Snowflake | Spanner | BigQuery | SingleStore | Databricks |
| --------- | ---------- | ------ | ---------------- | ------ | -------- | --------- | ------- | -------- | ----------- | ---------- |
| `blob`    | `bytea`    | `BLOB` | `VARBINARY(MAX)` | `BLOB` | `BYTEA`  | `binary`  | `BYTES` | `BYTES`  | `BLOB`      | `BINARY`   |

#### Collection & Complex Types

| Cassandra                              | PostgreSQL | MySQL  | SQL Server     | Oracle | Firebolt | Snowflake | Spanner | BigQuery | SingleStore | Databricks |
| -------------------------------------- | ---------- | ------ | -------------- | ------ | -------- | --------- | ------- | -------- | ----------- | ---------- |
| `list` / `set` / `map` / `tuple` / UDT | `jsonb`    | `JSON` | `VARCHAR(MAX)` | `CLOB` | `TEXT`   | `varchar` | `JSON`  | `JSON`   | `JSON`      | `STRING`   |

<Note>
  **MariaDB** maps identically to MySQL, and **TigerData** identically to PostgreSQL. **MongoDB** is a schemaless target — scalar types land as native BSON values and JSON columns become sub-documents, so it is not shown as a column above.
</Note>

### Synthesized Bounds

Cassandra has **no column lengths, precisions, or scales** — `text` is unbounded and `decimal`/`varint` are arbitrary-precision. Relational targets, however, want bounded, precise column types, and some bulk loaders reject unbounded ones outright. So the schema extractor synthesizes sensible bounds:

* **`varint` → `NUMERIC(38,0)`** and **`decimal` → `NUMERIC(38,18)`.** Without an explicit precision, Databricks defaults `NUMERIC` to `DECIMAL(10,0)` and overflows on a 19-digit varint. 38 is the maximum Databricks/Oracle decimal precision; scale 0 keeps varint an exact integer, and scale 18 leaves room for a fraction. Targets that prefer arbitrary precision (PostgreSQL, Firebolt) simply pad the scale.

* **`text` / `varchar` / `ascii` / `inet` / `duration` → length 4000.** A length-less `VARCHAR` becomes `VARCHAR(MAX)` on SQL Server (whose BCP loader cannot bind it) and `CLOB` on Oracle (whose OCI loader cannot bulk-load it). 4000 is Oracle's `VARCHAR2` standard maximum and within SQL Server's 8000 non-MAX cap.

<Warning>
  The fixed 4000-character bound is **lossy for genuinely large Cassandra `text`.** It is ample for typical keys, labels, and short documents; tables that need larger text should raise this bound.
</Warning>

### How Collections Are Encoded

`list`, `set`, `map`, `tuple`, and user-defined types are serialized to a single **JSON** column on the target. Wirekite does **not** explode collections into child tables, normalize maps, or model UDT fields — the relational skeleton (primary key + scalar columns) stays queryable, and the complex values ride along as JSON.

To keep the diff stable across runs, the JSON is canonicalized:

* `list` and `tuple` → JSON array, **order preserved**.
* `set` → JSON array, **sorted** (sets are unordered in Cassandra).
* `map` → JSON object with **keys sorted** and coerced to strings.
* UDT → JSON object, fields in **declared order**.

<Note>
  Complex columns **cannot be part of a primary key**, and `frozen<…>` wrappers are transparent (the inner type is used).
</Note>
