Merged
69 changes: 68 additions & 1 deletion README.md
@@ -2033,7 +2033,7 @@ $result = $schema->table('events')

ClickHouse uses `Nullable(type)` wrapping for nullable columns, `Enum8(...)` for enums, `Tuple(Float64, Float64)` for points, and `TYPE minmax GRANULARITY 3` for indexes. Foreign keys, stored procedures, triggers, generated columns, and CHECK constraints throw `UnsupportedException`.

-Supports the `TableComments`, `ColumnComments`, and `DropPartition` interfaces.
+Supports the `TableComments`, `ColumnComments`, `DropPartition`, `Views`, and `Databases` interfaces.

**Engine selection** — choose from 10 variants of the `Engine` enum:

@@ -2130,6 +2130,73 @@ $schema->table('events')

Setting names must match `[A-Za-z_][A-Za-z0-9_]*`; string values are restricted to `[A-Za-z0-9_.\-+/]*`. Use ints / floats / booleans for everything else. Other dialects ignore the call.
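
The two patterns can be checked with fully anchored regular expressions. The following is a minimal sketch of that idea, not the library's own validation code; the function names `isValidSettingName` and `isValidSettingString` are hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers; the names are illustrative and not part of the library.

/** Returns true when $name matches [A-Za-z_][A-Za-z0-9_]* in full. */
function isValidSettingName(string $name): bool
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    return \preg_match('/\A[A-Za-z_][A-Za-z0-9_]*\z/', $name) === 1;
}

/** Returns true when $value matches [A-Za-z0-9_.\-+/]* in full (empty is allowed). */
function isValidSettingString(string $value): bool
{
    return \preg_match('/\A[A-Za-z0-9_.\-+\/]*\z/', $value) === 1;
}

var_dump(isValidSettingName('index_granularity'));  // bool(true)
var_dump(isValidSettingName('1st_setting'));        // bool(false)
var_dump(isValidSettingString('a/b-c+1.0'));        // bool(true)
var_dump(isValidSettingString("x'; DROP TABLE t")); // bool(false)
```

Anchoring with `\A` and `\z` rather than `^` and `$` matters here because `$` would also accept a value that ends in a newline.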

**LowCardinality** — wrap a column type in `LowCardinality(...)` for compact dictionary-encoded storage on string columns with a small number of distinct values (status enums, type discriminators, country codes, category labels):

```php
$schema->table('events')
    ->bigInteger('id')->primary()
    ->string('status')->lowCardinality()
    ->string('country')->lowCardinality()->nullable()
    ->create();

// CREATE TABLE `events` (`id` Int64, `status` LowCardinality(String),
//   `country` LowCardinality(Nullable(String))) ENGINE = MergeTree() ORDER BY (`id`)
```

`LowCardinality` is applied outside `Nullable`, because ClickHouse accepts `LowCardinality(Nullable(String))` and rejects `Nullable(LowCardinality(String))`. The `lowCardinality()` method is only available on the ClickHouse builder, so callers on other dialects (`MySQL`, `PostgreSQL`, `SQLite`, `MongoDB`) cannot reach it.

**FixedString(N)** — fixed-length string column. Use for ISO codes, hash digests, and other values whose byte length is known and constant:

```php
$schema->table('locations')
    ->bigInteger('id')->primary()
    ->fixedString('country_code', 2)   // ISO 3166-1 alpha-2
    ->fixedString('currency_code', 3)  // ISO 4217
    ->fixedString('digest', 32)        // hex-encoded MD5 (a raw MD5 is 16 bytes)
    ->create();

// CREATE TABLE `locations` (`id` Int64, `country_code` FixedString(2),
// `currency_code` FixedString(3), `digest` FixedString(32))
// ENGINE = MergeTree() ORDER BY (`id`)
```

Length must be at least 1. The `fixedString()` method is only available on the ClickHouse builder — the type has no portable mapping.
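
`N` is a byte count, not a character count. ClickHouse pads shorter values with null bytes and rejects longer ones, so it helps to measure representative values before choosing a length. The sketch below uses only PHP built-ins; `fitsFixedString` is a hypothetical helper and not part of the library.

```php
<?php

declare(strict_types=1);

// strlen() counts bytes, which is the unit FixedString(N) is measured in.
$samples = [
    'ISO 3166-1 alpha-2'  => 'US',
    'ISO 4217'            => 'USD',
    'MD5, raw binary'     => \md5('payload', true),
    'MD5, hex-encoded'    => \md5('payload'),
    'SHA-256, raw binary' => \hash('sha256', 'payload', true),
];

foreach ($samples as $label => $value) {
    \printf("%-20s %d bytes\n", $label, \strlen($value));
}

/** Hypothetical pre-insert check: does $value fit in FixedString($length)? */
function fitsFixedString(string $value, int $length): bool
{
    return \strlen($value) <= $length;
}

var_dump(fitsFixedString('US', 2));        // bool(true)
var_dump(fitsFixedString("\u{00E9}", 1));  // bool(false): "é" is 2 bytes in UTF-8
```

The same digest therefore needs `FixedString(16)` when stored raw and `FixedString(32)` when stored as hexadecimal text.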

**Column-level CODEC** — append one or more compression codecs to a column. Multiple `codec()` calls accumulate and emit `CODEC(c1, c2, ...)`:

```php
$schema->table('metrics')
    ->bigInteger('id')->primary()
    ->datetime('ts', 3)->codec('Delta(4)')->codec('LZ4')  // monotonic timestamps
    ->bigInteger('value')->codec('T64')->codec('LZ4')     // integer column
    ->string('payload')->codec('ZSTD(3)')                 // text column
    ->create();

// CREATE TABLE `metrics` (`id` Int64,
// `ts` DateTime64(3) CODEC(Delta(4), LZ4),
// `value` Int64 CODEC(T64, LZ4),
// `payload` String CODEC(ZSTD(3))) ENGINE = MergeTree() ORDER BY (`id`)
```

Each codec string is emitted verbatim; supply codec arguments inline (`'Delta(4)'`, `'ZSTD(3)'`). Codec strings must not be empty or contain a semicolon. The `codec()` method is only available on the ClickHouse builder.
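
The accumulate-then-join behaviour can be pictured as a single function. This is a stand-alone sketch that mirrors the rules stated above (trim, reject empty strings, reject semicolons, join with `, `); the function name `compileCodecClause` and the use of `InvalidArgumentException` are assumptions made for illustration, not the library's API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-alone version of the accumulate-then-join behaviour.
 *
 * @param list<string> $codecs
 */
function compileCodecClause(array $codecs): string
{
    $clean = [];

    foreach ($codecs as $codec) {
        $trimmed = \trim($codec);

        if ($trimmed === '') {
            throw new \InvalidArgumentException('CODEC expression must not be empty.');
        }

        if (\str_contains($trimmed, ';')) {
            throw new \InvalidArgumentException('CODEC expression must not contain ";".');
        }

        $clean[] = $trimmed;
    }

    // No codecs means nothing is appended to the column definition.
    return $clean === [] ? '' : 'CODEC(' . \implode(', ', $clean) . ')';
}

echo compileCodecClause(['Delta(4)', ' LZ4 ']), "\n"; // CODEC(Delta(4), LZ4)
```

The semicolon check only blocks statement chaining. Because the rest of the string is passed through unchanged, codec strings should still come from trusted code rather than user input.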

**SAMPLE BY** — declare a sampling expression for approximate-query support (`SELECT ... SAMPLE k`). Emitted after `ORDER BY` and before `TTL` / `SETTINGS`:

```php
$schema->table('events')
    ->bigInteger('id')
    ->bigInteger('user_id')->unsigned()->primary()
    ->sampleBy('user_id')
    ->create();

// CREATE TABLE `events` (`id` Int64, `user_id` UInt64) ENGINE = MergeTree()
//   ORDER BY (`user_id`) SAMPLE BY user_id
```

The expression is emitted verbatim and must not be empty or contain a semicolon. `SAMPLE BY` only applies to engines that take an `ORDER BY` clause (the MergeTree family); using it with `Memory`, `Log`, `TinyLog`, or `StripeLog` throws `UnsupportedException`. ClickHouse itself also requires the sampling expression to be part of the primary key and to evaluate to an unsigned integer; the builder does not check either condition. The `sampleBy()` method is only available on the ClickHouse builder.
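
The clause order and the engine guard can be sketched in isolation. Everything below is a simplified stand-in: `SketchEngine`, `compileTableTail`, and the use of `LogicException` are hypothetical names chosen for this example, and the real `Engine` enum has more cases.

```php
<?php

declare(strict_types=1);

// Hypothetical, reduced engine enum; the library's Engine enum has more cases.
enum SketchEngine: string
{
    case MergeTree = 'MergeTree';
    case Memory = 'Memory';

    public function requiresOrderBy(): bool
    {
        return $this === self::MergeTree;
    }
}

/** Builds the part of CREATE TABLE that follows the column list. */
function compileTableTail(SketchEngine $engine, string $orderBy, ?string $sampleBy, ?string $ttl): string
{
    $sql = 'ENGINE = ' . $engine->value . '()';

    if ($engine->requiresOrderBy()) {
        $sql .= ' ORDER BY ' . $orderBy;
    }

    // SAMPLE BY sits after ORDER BY and before TTL.
    if ($sampleBy !== null) {
        if (! $engine->requiresOrderBy()) {
            throw new \LogicException('SAMPLE BY is only supported on engines that take an ORDER BY clause.');
        }
        $sql .= ' SAMPLE BY ' . $sampleBy;
    }

    if ($ttl !== null) {
        $sql .= ' TTL ' . $ttl;
    }

    return $sql;
}

echo compileTableTail(SketchEngine::MergeTree, '(`user_id`)', 'user_id', null), "\n";
// ENGINE = MergeTree() ORDER BY (`user_id`) SAMPLE BY user_id
```

Checking the engine at compile time rather than inside `sampleBy()` means the two builder calls can be made in either order.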

These OLAP-shaped modifiers live on the ClickHouse-specific `Column\ClickHouse` and `Table\ClickHouse` builders. Because the methods only exist on the dialect's own builder subclasses, calling `->lowCardinality()` or `->sampleBy()` on a `MySQL`, `PostgreSQL`, `SQLite`, or `MongoDB` builder fails at the type level, with no runtime branch needed.
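
The subclass-only design can be illustrated with two small classes. This is a sketch with hypothetical names (`SketchColumn`, `SketchClickHouseColumn`), not the library's class hierarchy. In PHP, calling a method that a class does not define raises an `Error` at runtime, and static analysers can report the same call before the code runs.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for the generic column builder and a dialect subclass.
class SketchColumn
{
    public bool $isNullable = false;

    public function nullable(): static
    {
        $this->isNullable = true;

        return $this;
    }
}

final class SketchClickHouseColumn extends SketchColumn
{
    public bool $isLowCardinality = false;

    public function lowCardinality(): static
    {
        $this->isLowCardinality = true;

        return $this;
    }
}

var_dump(\method_exists(SketchColumn::class, 'lowCardinality'));           // bool(false)
var_dump(\method_exists(SketchClickHouseColumn::class, 'lowCardinality')); // bool(true)

// nullable() returns static, so the chain keeps the subclass type.
$column = (new SketchClickHouseColumn())->nullable()->lowCardinality();
var_dump($column->isLowCardinality); // bool(true)
```

Returning `static` from the shared methods is what lets dialect-specific calls follow generic ones in the same chain.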

### SQLite Schema

```php
31 changes: 31 additions & 0 deletions src/Query/Schema/ClickHouse.php
@@ -32,6 +32,20 @@ protected function compileColumnType(Column $column): string
            throw new UnsupportedException('User-defined types are not supported in ClickHouse.');
        }

        if ($column instanceof Column\ClickHouse && $column->isFixedString()) {
            $type = 'FixedString(' . $column->fixedStringLength . ')';

            // Nullable goes innermost: ClickHouse accepts LowCardinality(Nullable(T)),
            // not Nullable(LowCardinality(T)).
            if ($column->isNullable) {
                $type = 'Nullable(' . $type . ')';
            }

            if ($column->isLowCardinality) {
                $type = 'LowCardinality(' . $type . ')';
            }

            return $type;
        }

        $type = match ($column->type) {
            ColumnType::String, ColumnType::Varchar, ColumnType::Relationship => 'String',
            ColumnType::Text => 'String',
@@ -53,6 +67,10 @@ protected function compileColumnType(Column $column): string
            ColumnType::Serial, ColumnType::BigSerial, ColumnType::SmallSerial => throw new UnsupportedException('SERIAL types are not supported in ClickHouse.'),
        };

        // Nullable goes innermost: ClickHouse accepts LowCardinality(Nullable(T)),
        // not Nullable(LowCardinality(T)).
        if ($column->isNullable) {
            $type = 'Nullable(' . $type . ')';
        }

        if ($column instanceof Column\ClickHouse && $column->isLowCardinality) {
            $type = 'LowCardinality(' . $type . ')';
        }
@@ -89,6 +107,10 @@ protected function compileColumnDefinition(Column $column): string
            $parts[] = 'DEFAULT ' . $this->compileDefaultValue($column->default);
        }

        if ($column instanceof Column\ClickHouse && $column->codecs !== []) {
            $parts[] = 'CODEC(' . \implode(', ', $column->codecs) . ')';
        }

        if ($column->ttl !== null) {
            $parts[] = 'TTL ' . $column->ttl;
        }
@@ -226,6 +248,15 @@ public function compileCreate(Table $table, bool $ifNotExists = false): Statemen
                : ' ORDER BY tuple()';
        }

        if ($table instanceof Table\ClickHouse && $table->sampleBy !== null) {
            if (! $engine->requiresOrderBy()) {
                throw new UnsupportedException(
                    'SAMPLE BY is only supported on engines that take an ORDER BY clause.'
                );
            }
            $sql .= ' SAMPLE BY ' . $table->sampleBy;
        }

        if ($table->ttl !== null) {
            $sql .= ' TTL ' . $table->ttl;
        }
78 changes: 78 additions & 0 deletions src/Query/Schema/Column/ClickHouse.php
@@ -2,6 +2,7 @@

namespace Utopia\Query\Schema\Column;

use Utopia\Query\Exception\ValidationException;
use Utopia\Query\Schema\Column;
use Utopia\Query\Schema\Forwarder;
use Utopia\Query\Schema\Table;
@@ -13,6 +14,40 @@ class ClickHouse extends Column
{
    use Forwarder\ClickHouse;

    public protected(set) bool $isLowCardinality = false;

    /** Length when the column should be emitted as `FixedString(N)`; null otherwise. */
    public protected(set) ?int $fixedStringLength = null;

    /** @var list<string> Column-level CODEC clauses, e.g. ['Delta(4)', 'LZ4'] */
    public protected(set) array $codecs = [];

    /**
     * Mark the column as `FixedString(N)`.
     *
     * Used by {@see Table\ClickHouse::fixedString()} to attach the
     * ClickHouse-specific FixedString width to a column whose generic
     * {@see \Utopia\Query\Schema\ColumnType} is `String`. The compiler reads
     * this state when emitting DDL.
     *
     * @throws ValidationException if $length is less than 1.
     */
    public function asFixedString(int $length): static
    {
        if ($length < 1) {
            throw new ValidationException('FixedString length must be at least 1.');
        }

        $this->fixedStringLength = $length;

        return $this;
    }

    public function isFixedString(): bool
    {
        return $this->fixedStringLength !== null;
    }

    /**
     * @param list<string> $columns
     *
@@ -28,4 +63,47 @@ public function primary(array $columns = []): static|Table

        return $this->table->primary($columns);
    }

    /**
     * Wrap the column type in `LowCardinality(...)`.
     *
     * Suitable for string columns with a small number of distinct values
     * (status enums, type discriminators, country codes). `LowCardinality`
     * is applied outside `Nullable` to match ClickHouse's required
     * wrapping order: `LowCardinality(Nullable(String))`.
     */
    public function lowCardinality(): static
    {
        $this->isLowCardinality = true;

        return $this;
    }

    /**
     * Append a column-level CODEC clause.
     *
     * Multiple calls accumulate and emit `CODEC(c1, c2, ...)`. Pass either
     * a bare codec name (`->codec('LZ4')`) or one with arguments
     * (`->codec('Delta(4)')`, `->codec('ZSTD(3)')`). The codec string is
     * emitted verbatim and must come from a trusted source.
     *
     * @throws ValidationException if the codec string is empty or contains
     * a semicolon.
     */
    public function codec(string $codec): static
    {
        $trimmed = \trim($codec);

        if ($trimmed === '') {
            throw new ValidationException('CODEC expression must not be empty.');
        }

        if (\str_contains($trimmed, ';')) {
            throw new ValidationException('CODEC expression must not contain ";".');
        }

        $this->codecs[] = $trimmed;

        return $this;
    }
}
9 changes: 9 additions & 0 deletions src/Query/Schema/Forwarder/ClickHouse.php
@@ -18,6 +18,11 @@ public function vector(string $name, int $dimensions): Column\ClickHouse
        return $this->table->vector($name, $dimensions);
    }

    public function fixedString(string $name, int $length): Column\ClickHouse
    {
        return $this->table->fixedString($name, $length);
    }

    public function engine(Engine $engine, string ...$args): Table\ClickHouse
    {
        return $this->table->engine($engine, ...$args);
@@ -44,4 +49,8 @@ public function partitionBy(string $expression): Table\ClickHouse
        return $this->table->partitionBy($expression);
    }

    public function sampleBy(string $expression): Table\ClickHouse
    {
        return $this->table->sampleBy($expression);
    }
}
51 changes: 51 additions & 0 deletions src/Query/Schema/Table/ClickHouse.php
@@ -16,6 +16,9 @@ class ClickHouse extends Table
{
    use Trait\CompositePrimary;

    /** ClickHouse SAMPLE BY expression. Emitted after ORDER BY when set. */
    public protected(set) ?string $sampleBy = null;

    #[\Override]
    protected function newColumn(string $name, ColumnType $type, ?int $length = null, ?int $precision = null): Column\ClickHouse
    {
@@ -31,6 +34,30 @@ public function vector(string $name, int $dimensions): Column\ClickHouse
        return $col;
    }

    /**
     * Add a `FixedString(N)` column.
     *
     * Used for fixed-length string values whose byte length is known and
     * constant: ISO 3166 country codes, ISO 4217 currency codes, hash
     * digests, and similar values that benefit from ClickHouse's columnar
     * storage of fixed-width data.
     *
     * The column is registered with the generic `ColumnType::String` type and
     * tagged with FixedString state on {@see Column\ClickHouse}; the compiler
     * reads that state when emitting DDL, so the global `ColumnType` enum
     * stays free of ClickHouse-only cases.
     *
     * @throws ValidationException if $length is less than 1.
     */
    public function fixedString(string $name, int $length): Column\ClickHouse
    {
        $col = $this->newColumn($name, ColumnType::String, $length);
        $col->asFixedString($length);
        $this->columns[] = $col;

        return $col;
    }

    /**
     * Select the table engine. Engine-specific arguments are validated against
     * the engine variant:
@@ -158,4 +185,28 @@ public function partitionBy(string $expression): static

        return $this;
    }

    /**
     * Set the SAMPLE BY expression. Emitted after ORDER BY at table creation
     * time. Required to model tables that need approximate-query support via
     * `SELECT ... SAMPLE k` on MergeTree-family engines.
     *
     * @throws ValidationException if the expression is empty or contains a semicolon.
     */
    public function sampleBy(string $expression): static
    {
        $trimmed = \trim($expression);

        if ($trimmed === '') {
            throw new ValidationException('SAMPLE BY expression must not be empty.');
        }

        if (\str_contains($trimmed, ';')) {
            throw new ValidationException('SAMPLE BY expression must not contain ";".');
        }

        $this->sampleBy = $trimmed;

        return $this;
    }
}