Merged
6 changes: 6 additions & 0 deletions docs/configuration/pgdog.toml/general.md
@@ -410,6 +410,12 @@ Available options:

Default: **`auto`**

### `system_catalogs_omnisharded`

Enables sticky routing for system catalog tables and treats them as [omnisharded](../../features/sharding/omnishards.md) tables. This makes tools like `psql` work out of the box.

Default: **`true`** (enabled)

## Logging

### `log_connections`
8 changes: 7 additions & 1 deletion docs/configuration/pgdog.toml/sharded_tables.md
@@ -83,10 +83,16 @@ The data type of the column. Currently supported options are:

[Omnisharded](../../features/sharding/omnishards.md) tables are tables that have the same data on all shards. They are typically small and contain metadata, e.g., lists of countries, cities, etc., and are used in joins. PgDog can read from these tables directly and load balances that traffic evenly across all shards.

#### Example
By default, all tables are considered omnisharded unless they are configured as sharded.

#### Sticky routing

Sticky routing disables round robin for omnisharded tables and sends the queries touching those tables to the same shard, guaranteeing consistent results for the duration of a client's connection:

```toml
[[omnisharded_tables]]
database = "prod"
sticky = true
tables = [
"settings",
"cities",
75 changes: 59 additions & 16 deletions docs/features/sharding/omnishards.md
@@ -9,47 +9,83 @@ Other names for these tables include **mirrored tables** and **replicated tables

## Configuration

Omnisharded tables are configured in [`pgdog.toml`](../../configuration/pgdog.toml/sharded_tables.md#omnisharded-tables):
Unless configured as a [sharded table](../../configuration/pgdog.toml/sharded_tables.md), all tables are omnisharded by default. This simplifies configuration because you don't have to enumerate every table in `pgdog.toml`. For example:

```toml
[[omnisharded_tables]]
[[sharded_tables]]
database = "prod"
tables = [
"settings",
"cities",
"terms_of_service",
"ip_blocks",
]
column = "user_id"
```

## Query routing
This will configure all tables that have the `user_id` column as sharded and all others as omnisharded.

### Query routing

Omnisharded tables are treated differently by the query router. Write queries are sent to all shards concurrently, while read queries are distributed evenly between shards using round robin.

If the query contains a sharding key, it will be used instead, and omnisharded tables in that query will be ignored.
For example, the following `INSERT` query will be sent to all shards concurrently:

```postgresql
INSERT INTO omnisharded_table (id, value) VALUES ($1, $2);
```

All configured shards will receive and store the same row. When reading that row, PgDog will choose one of the shards using the round robin algorithm, to distribute read load evenly.
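
As a rough illustration of the read path, a query like the following, which touches only the omnisharded table and carries no sharding key, would be answered by a single shard picked by round robin. The table and columns are the ones from the `INSERT` example above; nothing else is assumed:

```postgresql
-- Touches only an omnisharded table, so any one shard can answer it.
SELECT id, value FROM omnisharded_table WHERE id = $1;
```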

#### Sharded and omnisharded tables

If a query references both sharded and omnisharded tables, the **sharded** table routing will take priority. Omnisharded tables are assumed to contain the same data on all shards, so joins referencing omnisharded tables will work as expected.

For example, assuming the `users` table is sharded on the `id` column and the `global_settings` table is omnisharded, the following query will be sent to the shard corresponding to the value of the `users.id` filter:

```postgresql
SELECT * FROM users
INNER JOIN global_settings ON global_settings.active = true
WHERE users.id = $1;
```

### Consistency

Writing data to omnisharded tables is atomic if you enable [two-phase commit](2pc.md).

If you can't or choose not to use 2pc, make sure writes to omnisharded tables can be repeated in case of failure. This can be achieved by using unique indexes and `INSERT ... ON CONFLICT ... DO UPDATE` queries.
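
A minimal sketch of such a repeatable write, assuming the `omnisharded_table (id, value)` table from the earlier example and that `id` is its natural key. The index name is hypothetical:

```postgresql
-- A unique index on the key gives ON CONFLICT something to target.
CREATE UNIQUE INDEX IF NOT EXISTS omnisharded_table_id_idx
    ON omnisharded_table (id);

-- Re-running this statement after a partial failure converges every shard
-- to the same final row instead of raising a duplicate key error.
INSERT INTO omnisharded_table (id, value)
VALUES ($1, $2)
ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value;
```

Because the statement is idempotent, a client can retry it after an error without first checking which shards already applied the write.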

Since reads from omnisharded tables are routed to individual shards, while a two-phase commit takes place, queries to these tables may return different results for a brief period of time.
Since data in all omnisharded tables is identical, no cross-shard indexes are necessary to achieve data integrity. You can use regular PostgreSQL `UNIQUE` indexes on individual shards.

!!! note "Eventual consistency"
    Reads from omnisharded tables are routed to individual shards using round robin. While a two-phase commit takes place, different transactions may return different results for a brief period of time (usually less than a millisecond).


### Sticky routing

While most omnisharded tables should be identical on all shards, others could differ in subtle ways.

For example, if you configure system catalogs as omnisharded, e.g. to make Rails or other ORMs work out of the box, round robin query routing will return different results for each query.
For example, system catalogs (e.g. `pg_database`, `pg_class`, etc.) could have different OIDs for custom data types (e.g. `VECTOR` or types created with `CREATE TYPE`) on different shards. To make Rails and some other ORMs work out of the box, you can enable sticky routing, which disables round robin and sends omnisharded queries to one shard for the duration of a client's connection.

For example:

When enabled, sticky routing will ensure that queries sent by a client to omnisharded tables will be consistently routed to the same shard, for the duration of the client connection.
```toml
[[omnisharded_tables]]
database = "prod"
sticky = true
tables = [
"pg_class",
"pg_database"
]
```
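
To check whether catalogs differ in this way, one option is to connect to each shard directly (not through PgDog) and compare type OIDs. The sketch below assumes a custom type named `vector` is installed, as with the pgvector extension; substitute any custom type name:

```postgresql
-- Run on each shard and compare the oid column between them.
SELECT oid, typname FROM pg_type WHERE typname = 'vector';
```

If the `oid` values differ between shards, round-robin reads of the catalogs can give the same client inconsistent answers, which is the situation sticky routing is meant to avoid.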

To enable it, configure your omnisharded tables as follows:
You can enable sticky routing for all omnisharded tables in [`pgdog.toml`](../../configuration/pgdog.toml/general.md#omnisharded_sticky):

```toml
[general]
omnisharded_sticky = true
```

The following system catalogs use sticky routing by default:

```toml
[[omnisharded_tables]]
database = "prod"
sticky = true # Enable sticky routing for the following tables.
sticky = true
tables = [
"pg_class",
"pg_attribute",
@@ -70,4 +106,11 @@ tables = [
]
```

Once configured, commands like `\d`, `\d+` and others sent from `psql` will start to return correct results as well.
This is configurable with the `system_catalogs_omnisharded` setting in [`pgdog.toml`](../../configuration/pgdog.toml/general.md#system_catalogs_omnisharded):

```toml
[general]
system_catalogs_omnisharded = true
```

When enabled (the default), commands like `\d`, `\d+` and others sent from `psql` return correct results.
2 changes: 1 addition & 1 deletion docs/features/transaction-mode.md
@@ -59,7 +59,7 @@ This is performed efficiently, and server parameters are updated only if they di
1. The database has a primary and replica(s)
2. The database has more than one shard
3. [`prepared_statements`](../configuration/pgdog.toml/general.md#prepared_statements) is set to `"full"`
4. [`query_parser_enabled`](../configuration/pgdog.toml/general.md#query_parser_enabled) is set to `true`
4. [`query_parser`](../configuration/pgdog.toml/general.md#query_parser_enabled) is set to `"on"`

This is to avoid unnecessary overhead of using `pg_query` (however small), when we don't absolutely have to.

12 changes: 7 additions & 5 deletions docs/roadmap.md
@@ -44,14 +44,14 @@ Query engine provides a uniform view over multiple shards. Clients can use regul
| Feature | Status | Notes |
|----------|--------|-------|
| [Direct-to-shard reads](features/sharding/query-routing.md#select) | :material-check-circle-outline: | Sharding key must be specified in the query. |
| [Direct-to-shard writes](features/sharding/query-routing.md#insert) | :material-wrench: | Sharding key must be specified in the query. Multi-tuple `INSERT`s not supported yet. |
| [Direct-to-shard writes](features/sharding/query-routing.md#insert) | :material-check-circle-outline: | Sharding key must be specified in the query. Multi-tuple `INSERT`s are supported and sent to their respective shards automatically with a cross-shard query. Sharding key updates are supported for one row at a time. |
| [Cross-shard queries](features/sharding/cross-shard-queries/index.md) | :material-wrench: | Partial [aggregates](#aggregates) and [sorting](#sorting) support. CTEs & subqueries not supported yet. |
| Cross-shard CTEs | :material-calendar-check: | [#380](https://github.com/pgdogdev/pgdog/issues/380) |
| Cross-shard subqueries | :material-calendar-check: | [#381](https://github.com/pgdogdev/pgdog/issues/381) |
| Cross-shard joins | :material-calendar-check: | [#94](https://github.com/pgdogdev/pgdog/issues/94) |
| [Cross-shard transactions](features/sharding/2pc.md) | :material-wrench: | Supports [two-phase commit](features/sharding/2pc.md). Not benchmarked yet. |
| [Omnisharded tables](features/sharding/omnishards.md) | :material-wrench: | Unsharded tables with identical data on all shards. |
| Rewrite queries | :material-calendar-check: | Alter queries to support aggregate/sorting by rows not returned in result set. |
| Rewrite queries | :material-wrench: | Alter queries to support aggregate/sorting by rows not returned in result set. |
| [`COPY`](features/sharding/cross-shard-queries/copy.md) | :material-check-circle-outline: | Sharding key must be specified in the statement and the data. Supports text, CSV, and binary formats only. |
| Multi-statement queries | :material-calendar-check: | e.g.: `SELECT 1; SELECT 2;`. First query is used for routing only, entire request sent to the same shard(s). [#395](https://github.com/pgdogdev/pgdog/issues/395). |

@@ -66,7 +66,9 @@ Support for aggregate functions in [cross-shard](features/sharding/cross-shard-q
| `COUNT` | :material-check-circle-outline: | 〃 |
| `MIN` | :material-check-circle-outline: | 〃 |
| `MAX` | :material-check-circle-outline: | 〃 |
| `AVG` | :material-calendar-check: | [#434](https://github.com/pgdogdev/pgdog/issues/434) |
| `AVG` | :material-wrench: | Works in top level statement, but not in subqueries or CTEs. |
| `STDDEV` | :material-wrench: | 〃 |
| `VARIANCE` | :material-wrench: | 〃 |
| Percentile distributions | :material-close: | Could be expensive to calculate, need spill to disk. |

#### Sorting
@@ -87,8 +89,8 @@ Support for sorting rows in [cross-shard](features/sharding/cross-shard-queries/

| Feature | Status | Notes |
|-|-|-|
| [Data sync](features/sharding/resharding/hash.md) | :material-wrench: | Sync table data with logical replication. Not benchmarked yet. |
| [Schema sync](features/sharding/resharding/schema.md) | :material-wrench: | Sync table, index and constraint definitions. Not benchmarked yet. |
| [Data sync](features/sharding/resharding/hash.md) | :material-wrench: | Sync table data with logical replication. |
| [Schema sync](features/sharding/resharding/schema.md) | :material-wrench: | Sync table, index and constraint definitions. |
| Online rebalancing | :material-calendar-check: | Not automated yet, requires manual orchestration. |

### Schema & data integrity