103 changes: 100 additions & 3 deletions README.md
@@ -1774,9 +1774,11 @@ $result = $schema->table('users')
->createIfNotExists();
```

Available column types: `id`, `string`, `text`, `mediumText`, `longText`, `integer`, `bigInteger`, `serial`, `bigSerial`, `smallSerial`, `float`, `boolean`, `datetime`, `timestamp`, `json`, `binary`, `enum`, `point`, `linestring`, `polygon`, `vector` (PostgreSQL only), `timestamps`.
Available column types: `id`, `uuid`, `string`, `text`, `mediumText`, `longText`, `tinyInteger`, `smallInteger`, `integer`, `bigInteger`, `serial`, `bigSerial`, `smallSerial`, `float`, `decimal`, `boolean`, `datetime`, `timestamp`, `json`, `binary`, `enum`, `point`, `linestring`, `polygon`, `vector` (PostgreSQL only), `timestamps`.

Column modifiers: `nullable()`, `default($value)`, `unsigned()`, `unique()`, `primary()`, `autoIncrement()`, `after($column)`, `comment($text)`, `collation($collation)`, `check($expression)`, `generatedAs($expression)` + `stored()` / `virtual()`, `ttl($expression)` (ClickHouse), `userType($name)` (PostgreSQL).
Column modifiers: `nullable()`, `default($value)`, `defaultRaw($expression)`, `unsigned()`, `unique()`, `primary()`, `autoIncrement()`, `after($column)`, `comment($text)`, `collation($collation)`, `check($expression)`, `generatedAs($expression)` + `stored()` / `virtual()`, `ttl($expression)` (ClickHouse), `userType($name)` (PostgreSQL).

**Raw default expressions** — use `defaultRaw($expression)` for dialect-specific server-generated defaults that `default()` would otherwise quote as a string literal (`now()`, `CURRENT_TIMESTAMP`, `gen_random_uuid()`, `generateUUIDv4()`, `UUID()`, …). The expression is emitted verbatim and must come from a trusted source; it must not be empty or contain a semicolon. Takes precedence over `default()` when both are set.

**SERIAL types** — auto-incrementing integers. PostgreSQL emits native `SERIAL` / `BIGSERIAL` / `SMALLSERIAL`; MySQL/MariaDB compile to `INT AUTO_INCREMENT` / `BIGINT AUTO_INCREMENT` / `SMALLINT AUTO_INCREMENT`; SQLite maps to `INTEGER`. ClickHouse and MongoDB throw `UnsupportedException`:

@@ -2195,7 +2197,102 @@ $schema->table('events')

The expression is emitted verbatim and must not be empty or contain a semicolon. `SAMPLE BY` only applies to engines that take an `ORDER BY` clause (the MergeTree family); using it with `Memory`, `Log`, `TinyLog`, or `StripeLog` throws `UnsupportedException`. The `sampleBy()` method is only available on the ClickHouse builder.

These OLAP-shaped modifiers live on the ClickHouse-specific `Column\ClickHouse` and `Table\ClickHouse` builders. Because the methods only exist on the dialect's own builder subclasses, calling `->lowCardinality()` or `->sampleBy()` on a `MySQL`, `PostgreSQL`, `SQLite`, or `MongoDB` builder fails at the type level, with no runtime branch needed.
**`UInt8` / `Int8` via `tinyInteger()` and `UInt16` / `Int16` via `smallInteger()`** — small integer columns are useful for bounded enumerations, percentage values, scroll depth, and similar fields where the value range fits well below 32 bits. Storing them as `UInt8` saves 75% of the disk and memory footprint compared to the default `UInt32` produced by `integer()->unsigned()`:

```php
$schema->table('events')
->bigInteger('id')->primary()
->tinyInteger('scroll_depth')->unsigned() // 0–100 percentage
->smallInteger('year_offset') // signed, fits years from epoch
->create();

// CREATE TABLE `events` (`id` Int64, `scroll_depth` UInt8, `year_offset` Int16)
// ENGINE = MergeTree() ORDER BY (`id`)
```

`tinyInteger()` and `smallInteger()` are on the base builder, so the same calls map to `TINYINT` / `SMALLINT` on MySQL, to `SMALLINT` for both methods on PostgreSQL (which has no `TINYINT`), and to `INTEGER` on SQLite.
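
As a rough illustration of the sizing argument above, the sketch below picks the narrowest ClickHouse integer type for a known value range. `narrowestIntType()` is a hypothetical helper written for this example and is not part of the library; it only encodes the standard 8/16/32/64-bit integer limits.

```php
/**
 * Hypothetical helper (not a library API): choose the narrowest ClickHouse
 * integer type able to hold every value in [$min, $max].
 */
function narrowestIntType(int $min, int $max): string
{
    if ($min > $max) {
        throw new InvalidArgumentException('min must not exceed max.');
    }

    if ($min >= 0) {
        // Unsigned candidates, keyed by type name, valued by largest storable value.
        $unsigned = ['UInt8' => 255, 'UInt16' => 65535, 'UInt32' => 4294967295];
        foreach ($unsigned as $type => $limit) {
            if ($max <= $limit) {
                return $type;
            }
        }

        return 'UInt64'; // any non-negative PHP int fits
    }

    // Signed candidates: an N-bit type holds -2^(N-1) .. 2^(N-1) - 1.
    foreach ([8 => 'Int8', 16 => 'Int16', 32 => 'Int32'] as $bits => $type) {
        $limit = 2 ** ($bits - 1);
        if ($min >= -$limit && $max <= $limit - 1) {
            return $type;
        }
    }

    return 'Int64';
}

echo narrowestIntType(0, 100), "\n";      // UInt8
echo narrowestIntType(-1000, 1000), "\n"; // Int16
```

A 0 to 100 percentage fits in `UInt8`, which stores one byte per value where `UInt32` stores four; that ratio is where the 75% figure comes from, measured before compression.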

**`Array(T)` and `Tuple(...)` column types** — model multi-valued attributes (tags, labels, parallel-array nested records) and fixed-arity composites (geo points, key/value pairs) directly on the builder:

```php
use Utopia\Query\Schema\ColumnType;

$schema->table('events')
->bigInteger('id')->primary()
->array('meta.key', ColumnType::String)
->array('meta.value', ColumnType::String)
->array('user_ids', ColumnType::BigInteger)->unsigned()
->tuple('coords', [ColumnType::Float, ColumnType::Float])
->array('scores', ColumnType::String)->nullable()
->create();

// CREATE TABLE `events` (`id` Int64,
// `meta.key` Array(String), `meta.value` Array(String),
// `user_ids` Array(UInt64),
// `coords` Tuple(Float64, Float64),
// `scores` Nullable(Array(String))) ENGINE = MergeTree() ORDER BY (`id`)
```

The element type runs back through the standard column-type compiler, so the parent column's `unsigned()` and `precision` flags carry through to the inner type. `Nullable(...)` wraps the whole `Array`/`Tuple`; `LowCardinality(...)` is rejected on these columns because ClickHouse only permits it on scalar types. Both methods are only available on the ClickHouse builder.
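
The way a parent column's `unsigned()` flag can reach the inner element type is sketched below. This is a self-contained approximation, not the library's code: `ElementType`, `nestedType()`, `arrayType()`, and `tupleType()` are hypothetical names, and the enum covers only a handful of cases.

```php
// Hypothetical, trimmed-down element enum; the library's ColumnType has many more cases.
enum ElementType
{
    case String;
    case Integer;
    case BigInteger;
    case Float;
}

// Integer elements consult the parent's unsigned flag; other types ignore it.
function nestedType(ElementType $element, bool $parentUnsigned): string
{
    return match ($element) {
        ElementType::String => 'String',
        ElementType::Integer => $parentUnsigned ? 'UInt32' : 'Int32',
        ElementType::BigInteger => $parentUnsigned ? 'UInt64' : 'Int64',
        ElementType::Float => 'Float64',
    };
}

function arrayType(ElementType $element, bool $parentUnsigned = false): string
{
    return 'Array(' . nestedType($element, $parentUnsigned) . ')';
}

/** @param ElementType[] $elements */
function tupleType(array $elements, bool $parentUnsigned = false): string
{
    $inner = array_map(
        fn (ElementType $e): string => nestedType($e, $parentUnsigned),
        $elements,
    );

    return 'Tuple(' . implode(', ', $inner) . ')';
}

echo arrayType(ElementType::BigInteger, parentUnsigned: true), "\n"; // Array(UInt64)
echo tupleType([ElementType::Float, ElementType::Float]), "\n";      // Tuple(Float64, Float64)
```

Because the flag lives on the parent column, every integer element of a tuple shares the same signedness in this sketch; mixing signed and unsigned elements would need per-element state.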

**`decimal(precision, scale)`** — fixed-point numeric column for monetary or precision-sensitive values where binary floating-point error is unacceptable:

```php
$schema->table('orders')
->bigInteger('id')->primary()
->decimal('amount', precision: 18, scale: 3)
->decimal('rate', precision: 5, scale: 4)->nullable()
->create();

// CREATE TABLE `orders` (`id` Int64,
// `amount` Decimal(18, 3),
// `rate` Nullable(Decimal(5, 4))) ENGINE = MergeTree() ORDER BY (`id`)
```

`decimal()` is on the base builder: ClickHouse emits `Decimal(P, S)`, MySQL and PostgreSQL emit `DECIMAL(P, S)`, SQLite emits `NUMERIC(P, S)`, and MongoDB maps to the `decimal` BSON type. Scale must not be negative or exceed precision.
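
The precision and scale rule can be expressed as a small standalone check. The sketch below is an assumption about how such validation might look: `compileDecimal()` and its exception type are hypothetical, and the lower bound of 1 on precision is an added assumption that the text above does not state.

```php
/**
 * Hypothetical stand-in for the validation described above; the function
 * name and exception type are illustrative, not the library's API.
 */
function compileDecimal(int $precision, int $scale, string $keyword = 'Decimal'): string
{
    // Assumption: a precision below 1 is meaningless for a fixed-point type.
    if ($precision < 1) {
        throw new InvalidArgumentException('Precision must be at least 1.');
    }

    // Rule from the text: scale must not be negative or exceed precision.
    if ($scale < 0 || $scale > $precision) {
        throw new InvalidArgumentException('Scale must be between 0 and the precision.');
    }

    return sprintf('%s(%d, %d)', $keyword, $precision, $scale);
}

echo compileDecimal(18, 3), "\n";           // Decimal(18, 3)
echo compileDecimal(5, 4, 'NUMERIC'), "\n"; // NUMERIC(5, 4)
```

Passing the keyword as a parameter mirrors the per-dialect spelling (`Decimal`, `DECIMAL`, `NUMERIC`) while keeping the range check in one place.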

**`UUID` column type with `defaultRaw()`** — UUIDs are a first-class, fixed-width identifier type in ClickHouse and PostgreSQL, and a 36-character string elsewhere. Pair with `defaultRaw()` to attach a server-generated default expression that the standard `default()` would otherwise quote as a literal:

```php
$schema->table('events')
->uuid('event_id')->defaultRaw('generateUUIDv4()')->primary()
->datetime('ts', 3)
->create();

// CREATE TABLE `events` (`event_id` UUID DEFAULT generateUUIDv4(), `ts` DateTime64(3))
// ENGINE = MergeTree() ORDER BY (`event_id`)
```

`uuid()` compiles to the native `UUID` type on ClickHouse and PostgreSQL, `CHAR(36)` on MySQL, `TEXT` on SQLite, and the `string` BSON type on MongoDB. `defaultRaw(string)` is defined on the base `Column` and emits the expression verbatim; use it for `generateUUIDv4()` (ClickHouse), `gen_random_uuid()` (PostgreSQL), `UUID()` (MySQL), `now()`, `CURRENT_TIMESTAMP`, and similar dialect-specific server-generated defaults. The expression must come from a trusted source and must not be empty or contain a semicolon. `defaultRaw()` takes precedence over `default()` when both are set.
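
The difference between a quoted default and a raw default, and the precedence between them, can be sketched in isolation. The functions below are hypothetical and simplified: `quoteLiteral()` is a stand-in for whatever literal rendering the library performs, and its handling of booleans and quoting is an assumption.

```php
// Validation rules from the text: non-empty after trimming, no semicolon.
function validateRawDefault(string $expression): string
{
    $trimmed = trim($expression);

    if ($trimmed === '') {
        throw new InvalidArgumentException('Raw default expression must not be empty.');
    }

    if (str_contains($trimmed, ';')) {
        throw new InvalidArgumentException('Raw default expression must not contain ";".');
    }

    return $trimmed;
}

// Simplified literal rendering (assumption): strings are single-quoted.
function quoteLiteral(string|int|float|bool|null $value): string
{
    return match (true) {
        $value === null => 'NULL',
        is_bool($value) => $value ? '1' : '0',
        is_int($value), is_float($value) => (string) $value,
        default => "'" . str_replace("'", "''", $value) . "'",
    };
}

// A raw expression wins over a literal default when both are present.
function defaultClause(?string $raw, bool $hasDefault, string|int|float|bool|null $default): ?string
{
    if ($raw !== null) {
        return 'DEFAULT ' . validateRawDefault($raw);
    }

    return $hasDefault ? 'DEFAULT ' . quoteLiteral($default) : null;
}

echo defaultClause(null, true, 'now()'), "\n";    // DEFAULT 'now()'
echo defaultClause('now()', true, 'now()'), "\n"; // DEFAULT now()
```

The first call shows why `default('now()')` is not enough: the function call ends up as a string literal. The semicolon check blocks statement stacking only; it is not a general injection defence, which is why the expression still has to come from a trusted source.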

**Raw expressions in `ORDER BY`** — MergeTree `ORDER BY` clauses routinely include scalar function calls (`toDate(ts)`, `cityHash64(...)`, `intHash32(user_id)`) to control sparse-index cardinality. `orderBy(array)` restricts each entry to a plain identifier; use `orderByRaw(string)` to emit the full tuple verbatim:

```php
$schema->table('events')
->string('tenant')
->bigInteger('id')
->datetime('ts')
->orderByRaw('(`tenant`, toDate(`ts`), `id`)')
->create();

// CREATE TABLE `events` (`tenant` String, `id` Int64, `ts` DateTime)
// ENGINE = MergeTree() ORDER BY (`tenant`, toDate(`ts`), `id`)
```

The expression is emitted verbatim and must come from a trusted source. `orderByRaw()` takes precedence over `orderBy()` when both are set, mirrors the existing `partitionBy(string)` convention, and is only available on the ClickHouse builder.
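
The precedence between `orderByRaw()`, `orderBy()`, and the primary-key fallback can be sketched as a single function. `orderByClause()` is a hypothetical name, and the backtick-doubling quote rule is an assumption made for the example rather than the library's quoting.

```php
/**
 * Hypothetical sketch of ORDER BY resolution for a MergeTree-family table.
 *
 * @param string[] $columns     plain identifiers passed to orderBy()
 * @param string[] $primaryKeys plain identifiers of the primary key
 */
function orderByClause(?string $raw, array $columns, array $primaryKeys): string
{
    // 1. A raw expression is emitted verbatim and wins over everything else.
    if ($raw !== null) {
        return ' ORDER BY ' . $raw;
    }

    // Assumed quoting rule: wrap in backticks, doubling any embedded backtick.
    $quote = fn (string $c): string => '`' . str_replace('`', '``', $c) . '`';

    // 2. Explicit orderBy() columns, else 3. the primary key columns.
    $chosen = $columns !== [] ? $columns : $primaryKeys;

    // 4. With nothing to sort by, fall back to an empty tuple.
    return $chosen !== []
        ? ' ORDER BY (' . implode(', ', array_map($quote, $chosen)) . ')'
        : ' ORDER BY tuple()';
}

echo orderByClause('(`tenant`, toDate(`ts`), `id`)', ['id'], []), "\n";
// ORDER BY (`tenant`, toDate(`ts`), `id`)
echo orderByClause(null, [], ['id']), "\n";
// ORDER BY (`id`)
```

Only the raw path skips quoting, which is why it can carry function calls such as `toDate(...)` and why it must not be built from untrusted input.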

**`rawColumn()` passthrough** — `Table::rawColumn(string $definition)` is the standard escape hatch for column types the builder does not yet model. It is honoured on every dialect, including ClickHouse:

```php
$schema->table('events')
->bigInteger('id')->primary()
->rawColumn('`payload` JSON CODEC(ZSTD(3))')
->create();

// CREATE TABLE `events` (`id` Int64, `payload` JSON CODEC(ZSTD(3))) ...
```

These OLAP-shaped modifiers live on the ClickHouse-specific `Column\ClickHouse` and `Table\ClickHouse` builders. Because the methods only exist on the dialect's own builder subclasses, calling `->lowCardinality()`, `->sampleBy()`, `->array()`, `->tuple()`, or `->orderByRaw()` on a `MySQL`, `PostgreSQL`, `SQLite`, or `MongoDB` builder fails at the type level, with no runtime branch needed.

### SQLite Schema

4 changes: 3 additions & 1 deletion src/Query/Schema.php
@@ -308,7 +308,9 @@ protected function compileColumnDefinition(Column $column): string
$parts[] = 'NULL';
}

if ($column->hasDefault) {
if ($column->defaultRaw !== null) {
$parts[] = 'DEFAULT ' . $column->defaultRaw;
} elseif ($column->hasDefault) {
$parts[] = 'DEFAULT ' . $this->compileDefaultValue($column->default);
}

109 changes: 101 additions & 8 deletions src/Query/Schema/ClickHouse.php
@@ -46,13 +46,52 @@ protected function compileColumnType(Column $column): string
return $type;
}

if ($column instanceof Column\ClickHouse && $column->arrayElementType !== null) {
if ($column->isLowCardinality) {
throw new UnsupportedException('LowCardinality is not supported inside Array(...). Wrap the element type instead.');
}

$inner = $this->compileNestedElementType($column->arrayElementType, $column);
$type = 'Array(' . $inner . ')';

if ($column->isNullable) {
$type = 'Nullable(' . $type . ')';
}

return $type;
}
Comment on lines +49 to +62

P1: ClickHouse explicitly forbids wrapping `Array` in `Nullable`; the DDL `Nullable(Array(T))` is rejected at the server level. The fix mirrors the existing `LowCardinality` guard just above: throw `UnsupportedException` instead of silently emitting invalid DDL. The test `testCreateTableArrayNullable` passes today only because it checks the generated string, not whether ClickHouse accepts it.

Suggested change
- if ($column instanceof Column\ClickHouse && $column->arrayElementType !== null) {
-     if ($column->isLowCardinality) {
-         throw new UnsupportedException('LowCardinality is not supported inside Array(...). Wrap the element type instead.');
-     }
-     $inner = $this->compileNestedElementType($column->arrayElementType, $column);
-     $type = 'Array(' . $inner . ')';
-     if ($column->isNullable) {
-         $type = 'Nullable(' . $type . ')';
-     }
-     return $type;
- }
+ if ($column instanceof Column\ClickHouse && $column->arrayElementType !== null) {
+     if ($column->isLowCardinality) {
+         throw new UnsupportedException('LowCardinality is not supported inside Array(...). Wrap the element type instead.');
+     }
+     if ($column->isNullable) {
+         throw new UnsupportedException('Nullable(Array(...)) is not supported in ClickHouse. Use an empty array [] to represent a missing value instead.');
+     }
+     $inner = $this->compileNestedElementType($column->arrayElementType, $column);
+     $type = 'Array(' . $inner . ')';
+     return $type;
+ }


if ($column instanceof Column\ClickHouse && $column->tupleElementTypes !== []) {
if ($column->isLowCardinality) {
throw new UnsupportedException('LowCardinality is not supported on Tuple(...) columns.');
}

$inner = \implode(
', ',
\array_map(
fn (ColumnType $element): string => $this->compileNestedElementType($element, $column),
$column->tupleElementTypes,
),
);
$type = 'Tuple(' . $inner . ')';

if ($column->isNullable) {
$type = 'Nullable(' . $type . ')';
}

return $type;
}

$type = match ($column->type) {
ColumnType::String, ColumnType::Varchar, ColumnType::Relationship => 'String',
ColumnType::Text => 'String',
ColumnType::MediumText, ColumnType::LongText => 'String',
ColumnType::TinyInteger => $column->isUnsigned ? 'UInt8' : 'Int8',
ColumnType::SmallInteger => $column->isUnsigned ? 'UInt16' : 'Int16',
ColumnType::Integer => $column->isUnsigned ? 'UInt32' : 'Int32',
ColumnType::BigInteger, ColumnType::Id => $column->isUnsigned ? 'UInt64' : 'Int64',
ColumnType::Float, ColumnType::Double => 'Float64',
ColumnType::Decimal => 'Decimal(' . ($column->precision ?? 10) . ', ' . ($column->scale ?? 0) . ')',
ColumnType::Boolean => 'UInt8',
ColumnType::Datetime => $column->precision ? 'DateTime64(' . $column->precision . ')' : 'DateTime',
ColumnType::Timestamp => $column->precision ? 'DateTime64(' . $column->precision . ')' : 'DateTime',
@@ -62,9 +101,13 @@ protected function compileColumnType(Column $column): string
ColumnType::Point => 'Tuple(Float64, Float64)',
ColumnType::Linestring => 'Array(Tuple(Float64, Float64))',
ColumnType::Polygon => 'Array(Array(Tuple(Float64, Float64)))',
ColumnType::Uuid => 'UUID',
ColumnType::Uuid7 => 'FixedString(36)',
ColumnType::Vector => 'Array(Float64)',
ColumnType::Serial, ColumnType::BigSerial, ColumnType::SmallSerial => throw new UnsupportedException('SERIAL types are not supported in ClickHouse.'),
ColumnType::Array, ColumnType::Tuple => throw new UnsupportedException(
'Array/Tuple columns must be declared via Table\\ClickHouse::array() or ::tuple().'
),
};

if ($column instanceof Column\ClickHouse && $column->isLowCardinality) {
@@ -103,7 +146,9 @@ protected function compileColumnDefinition(Column $column): string
$this->compileColumnType($column),
];

if ($column->hasDefault) {
if ($column->defaultRaw !== null) {
$parts[] = 'DEFAULT ' . $column->defaultRaw;
} elseif ($column->hasDefault) {
$parts[] = 'DEFAULT ' . $this->compileDefaultValue($column->default);
}

@@ -211,6 +256,10 @@ public function compileCreate(Table $table, bool $ifNotExists = false): Statemen
$primaryKeys = \array_map(fn (string $c): string => $this->quote($c), $table->compositePrimaryKey);
}

foreach ($table->rawColumnDefs as $rawDef) {
$columnDefs[] = $rawDef;
}

foreach ($table->indexes as $index) {
if ($index->type !== IndexType::Index) {
throw new UnsupportedException(
@@ -239,13 +288,17 @@ public function compileCreate(Table $table, bool $ifNotExists = false): Statemen
}

if ($engine->requiresOrderBy()) {
$orderBy = ! empty($table->orderBy)
? \array_map(fn (string $c): string => $this->quote($c), $table->orderBy)
: $primaryKeys;

$sql .= ! empty($orderBy)
? ' ORDER BY (' . \implode(', ', $orderBy) . ')'
: ' ORDER BY tuple()';
if ($table instanceof Table\ClickHouse && $table->orderByRaw !== null) {
$sql .= ' ORDER BY ' . $table->orderByRaw;
} else {
$orderBy = ! empty($table->orderBy)
? \array_map(fn (string $c): string => $this->quote($c), $table->orderBy)
: $primaryKeys;

$sql .= ! empty($orderBy)
? ' ORDER BY (' . \implode(', ', $orderBy) . ')'
: ' ORDER BY tuple()';
}
}

if ($table instanceof Table\ClickHouse && $table->sampleBy !== null) {
@@ -350,6 +403,46 @@ private function compileEngine(Engine $engine, array $args): string
};
}

/**
* Compile an element type for use inside `Array(T)` or `Tuple(...)`.
*
* Element types come from the {@see ColumnType} enum directly, so they
* lack the per-column state (precision, unsigned flag, etc.) that
* {@see compileColumnType()} relies on. This helper falls back to the
* parent column's `isUnsigned` flag for integer elements and to the
* parent's `precision` for `Decimal` elements so callers can spell common
* shapes (`Array(UInt64)`, `Array(Decimal(18, 3))`) without leaking the
* inner-type complexity into the public API.
*/
private function compileNestedElementType(ColumnType $element, Column $parent): string
{
return match ($element) {
ColumnType::String, ColumnType::Varchar, ColumnType::Relationship,
ColumnType::Text, ColumnType::MediumText, ColumnType::LongText,
ColumnType::Json, ColumnType::Object, ColumnType::Binary => 'String',
ColumnType::TinyInteger => $parent->isUnsigned ? 'UInt8' : 'Int8',
ColumnType::SmallInteger => $parent->isUnsigned ? 'UInt16' : 'Int16',
ColumnType::Integer => $parent->isUnsigned ? 'UInt32' : 'Int32',
ColumnType::BigInteger, ColumnType::Id => $parent->isUnsigned ? 'UInt64' : 'Int64',
ColumnType::Float, ColumnType::Double => 'Float64',
ColumnType::Decimal => 'Decimal(' . ($parent->precision ?? 10) . ', ' . ($parent->scale ?? 0) . ')',
ColumnType::Boolean => 'UInt8',
ColumnType::Datetime, ColumnType::Timestamp => $parent->precision
? 'DateTime64(' . $parent->precision . ')'
: 'DateTime',
ColumnType::Uuid => 'UUID',
ColumnType::Uuid7 => 'FixedString(36)',
ColumnType::Point => 'Tuple(Float64, Float64)',
ColumnType::Linestring => 'Array(Tuple(Float64, Float64))',
ColumnType::Polygon => 'Array(Array(Tuple(Float64, Float64)))',
ColumnType::Vector => 'Array(Float64)',
ColumnType::Enum, ColumnType::Serial, ColumnType::BigSerial,
ColumnType::SmallSerial, ColumnType::Array, ColumnType::Tuple => throw new UnsupportedException(
'Nested element type ' . $element->value . ' is not supported inside Array/Tuple.'
),
};
}

/**
* @param string[] $values
*/
59 changes: 59 additions & 0 deletions src/Query/Schema/Column.php
@@ -17,6 +17,13 @@ class Column

public protected(set) bool $hasDefault = false;

/**
* Raw default expression emitted verbatim after `DEFAULT` (e.g. `now()`,
* `generateUUIDv4()`, `gen_random_uuid()`). Distinct from {@see $default},
* which is rendered as a quoted literal.
*/
public protected(set) ?string $defaultRaw = null;

public protected(set) bool $isUnsigned = false;

public protected(set) bool $isUnique = false;
@@ -63,6 +70,7 @@ public function __construct(
public ColumnType $type,
public ?int $length = null,
public ?int $precision = null,
public ?int $scale = null,
) {
}

@@ -81,6 +89,33 @@ public function default(mixed $value): static
return $this;
}

/**
* Set a raw default expression rendered verbatim after `DEFAULT`.
*
* Use for dialect-specific server-generated defaults that {@see default()}
* would otherwise quote: `now()`, `CURRENT_TIMESTAMP`, `gen_random_uuid()`,
* `generateUUIDv4()`, etc. The expression is emitted unquoted and must come
* from a trusted (developer-controlled) source.
*
* @throws ValidationException if the expression is empty or contains ";".
*/
public function defaultRaw(string $expression): static
{
$trimmed = \trim($expression);

if ($trimmed === '') {
throw new ValidationException('Raw default expression must not be empty.');
}

if (\str_contains($trimmed, ';')) {
throw new ValidationException('Raw default expression must not contain ";".');
}

$this->defaultRaw = $trimmed;

return $this;
}

public function unsigned(): static
{
$this->isUnsigned = true;
@@ -285,6 +320,18 @@ public function longText(string $name): static
return $this->table->longText($name);
}

public function tinyInteger(string $name): static
{
/** @var static */
return $this->table->tinyInteger($name);
}

public function smallInteger(string $name): static
{
/** @var static */
return $this->table->smallInteger($name);
}

public function integer(string $name): static
{
/** @var static */
@@ -297,6 +344,18 @@ public function bigInteger(string $name): static
return $this->table->bigInteger($name);
}

public function decimal(string $name, int $precision = 10, int $scale = 0): static
{
/** @var static */
return $this->table->decimal($name, $precision, $scale);
}

public function uuid(string $name): static
{
/** @var static */
return $this->table->uuid($name);
}

public function serial(string $name): static
{
/** @var static */