Catena - Decentralised Catalogue Coordinator

Catena is a distributed catalogue coordination system that retrieves offerings from DLT-Booth, distributes data across catalogue nodes using consistent hashing, and provides federated SPARQL querying capabilities.

Architecture

The coordinator consists of several key components organized in a clean, modular structure:

Core Components

Main Coordinator (main.py): Orchestrates all background workers and the API server
API Layer (api/): Flask-based REST API with SPARQL federation
DLT Communication (utils/dlt_comm/): Handles communication with the DLT-Booth API
Monitoring (utils/node_monitor/): Node health monitoring and failover handling
Workers (utils/workers/): Background workers for offerings processing
Hash Ring (utils/hash_ring/): Consistent hashing for data distribution
Redis (utils/redis/): Redis client with graceful fallback to in-memory storage
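
The graceful fallback mentioned for the Redis component can be pictured with a small wrapper. This is only a sketch under assumptions: `FallbackStore`, its methods, and the injected `client` object are hypothetical names for illustration and are not taken from Catena's code. The client is anything exposing `get(key)` and `set(key, value)`, which keeps the example runnable without a Redis server.

```python
class FallbackStore:
    """Hypothetical key-value store: prefers a Redis-like client, degrades to a dict."""

    def __init__(self, client=None, errors=(Exception,)):
        # `client` is any object with get(key) and set(key, value); None means
        # "no Redis available". `errors` lists the exception types that should
        # trigger the fallback (an assumption; narrow it for a real client).
        self._client = client
        self._errors = errors
        self._memory = {}

    @property
    def using_fallback(self):
        return self._client is None

    def set(self, key, value):
        # Always mirror writes locally so a later fallback still sees them.
        self._memory[key] = value
        if self._client is not None:
            try:
                self._client.set(key, value)
            except self._errors:
                self._client = None  # stop using the failing backend

    def get(self, key, default=None):
        if self._client is not None:
            try:
                return self._client.get(key)
            except self._errors:
                self._client = None
        return self._memory.get(key, default)
```

Mirroring every write into the local dictionary is a design choice of this sketch: it trades memory for the guarantee that data written before an outage stays readable afterwards.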

Directory Structure

├── api
│   ├── __init__.py
│   └── offerings_retrieval.py
├── config.py
├── docker
│   ├── docker-compose.yml
│   ├── Dockerfile
│   └── start.sh
├── examples
│   ├── catalogue_list.json
│   └── example.env
├── LICENSE
├── main.py
├── README.md
├── requirements.txt
└── utils
    ├── __init__.py
    ├── dlt_comm
    │   ├── get_nodes.py
    │   ├── __init__.py
    │   └── offering_processor.py
    ├── hash_ring
    │   ├── consistent_hash.py
    │   └── __init__.py
    ├── node_monitor
    │   ├── health_checker.py
    │   └── __init__.py
    └── workers
        ├── data_processor.py
        ├── __init__.py
        └── worker_pool.py

Data Flow

Offerings Retrieval: Coordinator fetches addresses from DLT_BASE_URL/offerings at configurable intervals
Data Distribution: Each offering is fetched and distributed to catalogue nodes using consistent hashing
Redundancy: Data is replicated across multiple nodes (configurable via OFFERING_REPLICA_COUNT)
Node Monitoring: Continuous health checks detect node failures and trigger data redistribution
Federated Queries: SPARQL queries are distributed across all active catalogue nodes
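
The distribution and redundancy steps above can be sketched with a toy consistent hash ring. This is a minimal illustration, not the code in utils/hash_ring: the `HashRing` class, its method names, the node labels, and the choice of MD5 as the ring hash are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is used only for illustration)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent hash ring with virtual nodes (hypothetical class)."""

    def __init__(self, nodes, virtual_nodes=150):
        self.virtual_nodes = virtual_nodes
        self._points = []   # sorted ring positions
        self._owners = {}   # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each physical node is placed on the ring many times to even out load.
        for i in range(self.virtual_nodes):
            point = _hash(f"{node}#{i}")
            self._owners[point] = node
            bisect.insort(self._points, point)

    def remove_node(self, node):
        self._points = [p for p in self._points if self._owners[p] != node]
        self._owners = {p: n for p, n in self._owners.items() if n != node}

    def get_nodes(self, key, replicas=2):
        """Return up to `replicas` distinct nodes, walking clockwise from the key."""
        if not self._points:
            return []
        start = bisect.bisect(self._points, _hash(key))
        chosen = []
        for offset in range(len(self._points)):
            point = self._points[(start + offset) % len(self._points)]
            node = self._owners[point]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == replicas:
                    break
        return chosen
```

With this layout, removing a node only reassigns the keys that node held: the remaining replica of an offering becomes its first owner, and the next node clockwise takes over the second copy.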

Environment Variables

Coordinator Configuration

Variable	Description	Default / Example
`SUBPROCESS_HEALTH_CHECK_INTERVAL`	Interval (seconds) for checking subprocess health.	`5`
`WORKER_POOL_SIZE`	Number of worker threads/processes in the pool.	`10`
`OPERATOR_PROVIDED`	Coordinator mode toggle (0 = disabled, 1 = enabled).	`0`

Flask API Configuration

Variable	Description	Default / Example
`HOST_ADDRESS`	Address for the Flask API to bind to.	`0.0.0.0`
`HOST_PORT`	Port for the Flask API to listen on.	`3030`

Global Catalogue Configuration

Variable	Description	Default / Example
`GC_URL`	Base URL for the Global Catalogue.	`http://global-catalogue`
`GC_PORT`	Port for the Global Catalogue.	`3030`

DLT Configuration

Variable	Description	Default / Example
`DLT_BASE_URL`	Base URL for DLT Booth API.	`http://dlt-booth:8085/api`
`DLT_RUST_LOG`	Log level for DLT (`debug`, `info`, `error`).	`debug`
`DLT_RUST_BACKTRACE`	Enable backtrace on errors (`0` = off, `1` = on).	`1`
`DLT_HOST_ADDRESS`	DLT Booth HTTP server bind address.	`0.0.0.0`
`DLT_HOST_PORT`	DLT Booth HTTP server port.	`8085`
`DLT_NODE_URL`	DLT node endpoint URL.	`https://example.com/node`
`DLT_FAUCET_API_ENDPOINT`	Faucet API endpoint.	`https://example.com/faucet/`
`DLT_RPC_PROVIDER`	RPC provider endpoint.	`https://example.com/rpc`
`DLT_CHAIN_ID`	Chain ID for the DLT network.	`1000`
`DLT_ISSUER_URL`	Issuer service endpoint.	`https://example.com/issuer`

Redis Configuration

Variable	Description	Default / Example
`REDIS_HOST`	Host address of Redis instance.	`catalogue-coordinator-redis`
`REDIS_PORT`	Redis port.	`6379`
`REDIS_DB`	Redis database index.	`0`

Offering Configuration

Variable	Description	Default / Example
`OFFERING_DESC_TIMEOUT`	Timeout (seconds) for fetching offering description.	`60`
`OFFERING_FETCH_INTERVAL`	Interval (seconds) between offering fetch cycles.	`60`
`OFFERING_REPLICA_COUNT`	Number of replicas per offering.	`2`

Node Monitoring Configuration

Variable	Description	Default / Example
`NODE_HEALTH_CHECK_INTERVAL`	Interval (seconds) for node health checks.	`30`
`NODE_GRACE_PERIOD`	Grace period (seconds) before marking node unhealthy.	`60`
`NODE_TIMEOUT`	Timeout (seconds) for node response.	`10`
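
One way to read the grace period is that a node is only marked unhealthy once it has gone longer than NODE_GRACE_PERIOD seconds without a successful check. The sketch below implements that reading; it is an assumption about the semantics, and `HealthTracker` with its methods is a hypothetical name, not the class in utils/node_monitor. The clock is injectable so the behaviour can be exercised without waiting.

```python
import time


class HealthTracker:
    """Hypothetical tracker: a node is unhealthy after `grace_period` s without success."""

    def __init__(self, grace_period=60.0, clock=time.monotonic):
        self.grace_period = grace_period
        self._clock = clock
        self._last_ok = {}  # node -> time of last successful check

    def record(self, node, ok):
        """Store the outcome of one health check for `node`."""
        now = self._clock()
        if ok:
            self._last_ok[node] = now
        else:
            # A node first seen while failing starts its grace window now.
            self._last_ok.setdefault(node, now)

    def is_healthy(self, node):
        last_ok = self._last_ok.get(node)
        if last_ok is None:
            return False  # never checked
        return self._clock() - last_ok <= self.grace_period

    def unhealthy_nodes(self):
        """Nodes whose grace window has expired, e.g. candidates for redistribution."""
        return sorted(n for n in self._last_ok if not self.is_healthy(n))
```

A caller would invoke `record` once per node every NODE_HEALTH_CHECK_INTERVAL seconds and act on `unhealthy_nodes()`.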

Hash Ring Configuration

Variable	Description	Default / Example
`HASH_RING_VIRTUAL_NODES`	Number of virtual nodes in the consistent hash ring.	`150`

Key Storage Configuration

Variable	Description	Default / Example
`DLT_KEY_STORAGE_STRONGHOLD_SNAPSHOT_PATH`	Path to key storage snapshot file.	`./key_storage.stronghold`
`DLT_KEY_STORAGE_STRONGHOLD_PASSWORD`	Password for encrypting the key storage snapshot.	`some_hopefully_secure_password`
`DLT_KEY_STORAGE_MNEMONIC`	Mnemonic used to generate the key storage.	`your mnemonic here`

Wallet Configuration

Variable	Description	Default / Example
`DLT_WALLET_STRONGHOLD_SNAPSHOT_PATH`	Path to wallet storage snapshot file.	`./wallet.stronghold`
`DLT_WALLET_STRONGHOLD_PASSWORD`	Password for encrypting the wallet snapshot.	`some_hopefully_secure_password`

Database Configuration

Variable	Description	Default / Example
`DLT_BOOTH_DB_USER`	Username for DLT Booth database connection.	`postgres`
`DLT_BOOTH_DB_PASSWORD`	Password for DLT Booth database connection.	`dlt_booth`

Configuration

Create a .env file, using ./examples/example.env as a reference.

Alternatively, write a custom .env file from the environment variable descriptions above.
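
For orientation, a coordinator could read these variables along the following lines. This is a sketch only: `env_int` and `load_settings` are hypothetical helpers, not the contents of config.py, and treating the "Default / Example" column as real defaults is an assumption.

```python
import os


def env_int(name, default, environ=os.environ):
    """Read an integer variable, falling back to `default` when unset or empty."""
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return int(raw)


def load_settings(environ=os.environ):
    """Collect a subset of the documented variables into a plain dict."""
    return {
        "HOST_ADDRESS": environ.get("HOST_ADDRESS", "0.0.0.0"),
        "HOST_PORT": env_int("HOST_PORT", 3030, environ),
        "DLT_BASE_URL": environ.get("DLT_BASE_URL", "http://dlt-booth:8085/api"),
        "REDIS_HOST": environ.get("REDIS_HOST", "catalogue-coordinator-redis"),
        "REDIS_PORT": env_int("REDIS_PORT", 6379, environ),
        "REDIS_DB": env_int("REDIS_DB", 0, environ),
        "OFFERING_FETCH_INTERVAL": env_int("OFFERING_FETCH_INTERVAL", 60, environ),
        "OFFERING_REPLICA_COUNT": env_int("OFFERING_REPLICA_COUNT", 2, environ),
        "NODE_HEALTH_CHECK_INTERVAL": env_int("NODE_HEALTH_CHECK_INTERVAL", 30, environ),
        "HASH_RING_VIRTUAL_NODES": env_int("HASH_RING_VIRTUAL_NODES", 150, environ),
        "OPERATOR_PROVIDED": env_int("OPERATOR_PROVIDED", 0, environ),
    }
```

Passing a dictionary instead of `os.environ` makes it easy to check a configuration before deploying it, for example `load_settings({"HOST_PORT": "8080"})`.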

Installation & Usage

Option 1: Docker Compose (Recommended)

Start with Docker (includes Redis):
```
cd docker
bash start.sh
```
Stop services:
```
cd docker
docker-compose down
```
View logs:
```
cd docker
docker-compose logs -f
```

Option 2: Local Development

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python3 main.py

API Endpoints

Health Check

GET /health - Service health status

Offerings

POST /offerings - Retrieve offerings by ID (JSON body: {"offerings_id": "..."})

SPARQL Federation

POST /sparql - Execute federated SPARQL queries across all catalogue nodes
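
As an illustration of the health and offerings endpoints, the requests can be built with the standard library. The base URL `http://localhost:3030` is an assumption (a local deployment on the HOST_PORT value shown earlier), the helper names are hypothetical, and the response format is not documented here, so the sketch returns the raw body. The /sparql request is left out because its body format is not specified above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3030"  # assumption: local coordinator on HOST_PORT


def build_health_request(base_url=BASE_URL):
    """GET /health"""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def build_offerings_request(offerings_id, base_url=BASE_URL):
    """POST /offerings with the documented JSON body."""
    body = json.dumps({"offerings_id": offerings_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/offerings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send a prepared request and return (status, body text). Needs a running coordinator."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

With a coordinator running, `send(build_health_request())` performs the health check and `send(build_offerings_request("..."))` retrieves an offering by its ID.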

Features

Retry Logic: Uses tenacity for robust retry mechanisms with exponential backoff
Caching: Implements TTL-based caching for API responses
Failover: Automatic data redistribution when nodes go down
Consistent Hashing: Uses consistent hashing for stable data distribution
Graceful Degradation: Falls back to in-memory storage if Redis is unavailable
Configurable Redundancy: Supports multiple data replicas for high availability
Docker Support: Containerized deployment with Redis as separate service
Health Monitoring: Built-in health checks for Docker orchestration
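
TTL-based caching, as listed above, can be illustrated in a few lines. The `TTLCache` class below is a hypothetical stand-in written for this example; it does not show which caching library or parameters Catena actually uses.

```python
import time


class TTLCache:
    """Hypothetical cache whose entries expire `ttl_seconds` after being set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def set(self, key, value):
        self._entries[key] = (self._clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches from the API.
            del self._entries[key]
            return default
        return value
```

A caller checks `get` first and only contacts the remote API on a miss, storing the fresh response with `set`.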

Notes

Two variables, OPERATOR_PROVIDED and CENTRALISED, control how the coordinator runs:

OPERATOR_PROVIDED=0: Automatic node retrieval mode; catalogue nodes are inferred and retrieved from the DLT offerings
OPERATOR_PROVIDED=1: Uses known catalogues; catalogue nodes are read from ./catalogue_list.json (an example file is in the ./examples directory)
CENTRALISED=0: Decentralised mode; offerings are spread across multiple nodes
CENTRALISED=1: Uses the Global Catalogue to store all offering descriptions
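
The four combinations of these flags can be summarised in code. The function and the labels it returns are hypothetical, chosen for this illustration; only the meaning of the 0 and 1 values comes from the notes above.

```python
def describe_mode(operator_provided: int, centralised: int) -> dict:
    """Translate the two flags into where nodes come from and where data is stored."""
    for name, value in (("OPERATOR_PROVIDED", operator_provided),
                        ("CENTRALISED", centralised)):
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1, got {value!r}")
    return {
        # 1: nodes listed in catalogue_list.json; 0: nodes inferred from DLT offerings
        "node_source": "catalogue_list" if operator_provided == 1 else "dlt_offerings",
        # 1: everything in the Global Catalogue; 0: spread across catalogue nodes
        "storage": "global_catalogue" if centralised == 1 else "distributed",
    }
```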

TODO

Federated Query Support: Add support for all subqueries, not just SELECT
Add /profile call step: Add an additional call step to provider to fetch catalogue endpoints
Clarify federated SPARQL query: Clarify if consumers directly call GC
Add tests: Add test cases for all submodules
