Note: As of revision 1.5.0, the HugeGraph-Store code has been migrated to this location.
HugeGraph Store is a distributed storage backend for HugeGraph that provides high availability, horizontal scalability, and strong consistency for production graph database deployments. Built on RocksDB and Apache JRaft, it serves as the data plane for large-scale graph workloads requiring enterprise-grade reliability.
- Distributed Storage: Hash-based partitioning with automatic data distribution across multiple Store nodes
- High Availability: Multi-replica data replication using Raft consensus, tolerating node failures without data loss
- Horizontal Scalability: Dynamic partition allocation and rebalancing for seamless cluster expansion
- Query Optimization: Advanced query pushdown (filter, aggregation, index) and multi-partition parallel execution
- Metadata Coordination: Tight integration with HugeGraph PD for cluster management and service discovery
- High Performance: gRPC-based communication with streaming support for large result sets
- Storage Engine: RocksDB 7.7.3 (optimized for graph workloads)
- Consensus Protocol: Apache JRaft (Ant Financial's Raft implementation)
- RPC Framework: gRPC + Protocol Buffers
- Deployment: Java 11+, Docker/Kubernetes support
Use Store for:
- Production deployments requiring high availability (99.9%+ uptime)
- Workloads exceeding single-node storage capacity (100GB+)
- Multi-tenant or high-concurrency scenarios (1000+ QPS)
- Environments requiring horizontal scalability and fault tolerance
Use RocksDB Backend for:
- Development and testing environments
- Single-node deployments with moderate data size (<100GB)
- Embedded scenarios where simplicity is preferred over distribution
HugeGraph Store is a Maven multi-module project consisting of 9 modules:
| Module | Description |
|---|---|
| hg-store-grpc | gRPC protocol definitions (7 .proto files) and generated Java stubs for Store communication |
| hg-store-common | Shared utilities, query abstractions, constants, and buffer management |
| hg-store-rocksdb | RocksDB abstraction layer with session management and optimized scan iterators |
| hg-store-core | Core storage engine: partition management, Raft integration, metadata coordination, business logic |
| hg-store-client | Java client library for applications to connect to Store cluster and perform operations |
| hg-store-node | Store node server implementation with gRPC services, Raft coordination, and PD integration |
| hg-store-cli | Command-line utilities for Store administration and debugging |
| hg-store-test | Comprehensive unit and integration tests for all Store components |
| hg-store-dist | Distribution assembly: packaging, configuration templates, startup scripts |
```
Client Layer (hugegraph-server)
        ↓ (hg-store-client connects via gRPC)
Store Node Layer (hg-store-node)
 ├─ gRPC Services (Session, Query, State)
 ├─ Partition Engines (each partition = one Raft group)
 └─ PD Integration (heartbeat, partition assignment)
        ↓
Storage Engine Layer (hg-store-core + hg-store-rocksdb)
 ├─ HgStoreEngine (manages all partition engines)
 ├─ PartitionEngine (per-partition Raft state machine)
 └─ RocksDB (persistent storage)
```
- Partition-based Distribution: Data is split into partitions (default: hash-based) and distributed across Store nodes
- Raft Consensus per Partition: Each partition is a separate Raft group with 1-3 replicas (typically 3 in production)
- PD Coordination: Store nodes register with PD for partition assignment, metadata synchronization, and health monitoring
- Query Pushdown: Filters, aggregations, and index scans are pushed to Store nodes for parallel execution
For detailed architecture, Raft consensus mechanisms, and partition management, see Distributed Architecture.
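To make the hash-based distribution concrete, here is a minimal, illustrative sketch of routing a key to a partition. This is not the Store's actual implementation (the real hash function and partition tables are managed by Store and PD); the class name and hash are hypothetical:

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: maps a key to one of N partitions by hashing.
// The real Store uses its own hash function and PD-managed partition tables.
public class PartitionRouter {
    private final int partitionCount;

    public PartitionRouter(int partitionCount) {
        this.partitionCount = partitionCount;
    }

    /** Returns a partition id in [0, partitionCount) for the given key. */
    public int route(byte[] key) {
        int hash = 0;
        for (byte b : key) {
            hash = 31 * hash + (b & 0xFF); // simple polynomial hash
        }
        return Math.floorMod(hash, partitionCount); // non-negative modulo
    }

    public static void main(String[] args) {
        PartitionRouter router = new PartitionRouter(12);
        byte[] key = "vertex:1234".getBytes(StandardCharsets.UTF_8);
        // Routing is deterministic: the same key always lands on the same partition.
        System.out.println(router.route(key) == router.route(key));
    }
}
```

The important property is determinism: every node computes the same partition for a given key, so reads and writes for that key converge on one Raft group.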
- Java: 11 or higher
- Maven: 3.5 or higher
- HugeGraph PD Cluster: Store requires a running PD cluster for metadata coordination (see PD README)
- Disk Space: At least 10GB per Store node for data and Raft logs
- Network: Low-latency network (<5ms) between Store nodes for Raft consensus
Important: Build hugegraph-struct first, as it's a required dependency.
From the project root:
```bash
# Build struct module
mvn install -pl hugegraph-struct -am -DskipTests

# Build Store and all dependencies
mvn clean package -pl hugegraph-store/hugegraph-store-dist -am -DskipTests
```

The assembled distribution will be available at:

```
hugegraph-store/apache-hugegraph-store-incubating-1.7.0/lib/hg-store-node-1.7.0.jar
```
Extract the distribution package and edit conf/application.yml:
| Parameter | Default | Description |
|---|---|---|
| `pdserver.address` | `localhost:8686` | **Required**: PD cluster endpoints (comma-separated, e.g., `192.168.1.10:8686,192.168.1.11:8686`) |
| `grpc.host` | `127.0.0.1` | gRPC server bind address (use the node's actual IP in production) |
| `grpc.port` | `8500` | gRPC server port for client connections |
| `raft.address` | `127.0.0.1:8510` | Raft service address for this Store node |
| `raft.snapshotInterval` | `1800` | Raft snapshot interval in seconds (30 minutes) |
| `server.port` | `8520` | REST API port for management and metrics |
| `app.data-path` | `./storage` | Directory for RocksDB data storage (supports multiple paths for multi-disk setups) |
| `app.fake-pd` | `false` | Enable built-in PD mode for standalone testing (not for production) |
```yaml
pdserver:
  address: localhost:8686  # Ignored when fake-pd is true
grpc:
  host: 127.0.0.1
  port: 8500
raft:
  address: 127.0.0.1:8510
  snapshotInterval: 1800
server:
  port: 8520
app:
  data-path: ./storage
  fake-pd: true  # Built-in PD mode (development only)
```

Prerequisites: A running 3-node PD cluster at `192.168.1.10:8686`, `192.168.1.11:8686`, `192.168.1.12:8686`
Store Node 1 (192.168.1.20):

```yaml
pdserver:
  address: 192.168.1.10:8686,192.168.1.11:8686,192.168.1.12:8686
grpc:
  host: 192.168.1.20
  port: 8500
raft:
  address: 192.168.1.20:8510
app:
  data-path: ./storage
  fake-pd: false
```

Store Node 2 (192.168.1.21):

```yaml
pdserver:
  address: 192.168.1.10:8686,192.168.1.11:8686,192.168.1.12:8686
grpc:
  host: 192.168.1.21
  port: 8500
raft:
  address: 192.168.1.21:8510
app:
  data-path: ./storage
  fake-pd: false
```

Store Node 3 (192.168.1.22):

```yaml
pdserver:
  address: 192.168.1.10:8686,192.168.1.11:8686,192.168.1.12:8686
grpc:
  host: 192.168.1.22
  port: 8500
raft:
  address: 192.168.1.22:8510
app:
  data-path: ./storage
  fake-pd: false
```

For detailed configuration options, RocksDB tuning, and deployment topologies, see Deployment Guide.
Start the Store server:
```bash
# Replace {version} with your HugeGraph version
cd apache-hugegraph-store-incubating-{version}

# Start Store node
bin/start-hugegraph-store.sh

# Stop Store node
bin/stop-hugegraph-store.sh

# Restart Store node
bin/restart-hugegraph-store.sh
```

Usage: `bin/start-hugegraph-store.sh [-g GC_TYPE] [-j "JVM_OPTIONS"]`

- `-g`: GC type (`g1` or `ZGC`; default: `g1`)
- `-j`: Custom JVM options (e.g., `-j "-Xmx16g -Xms8g"`)
Default JVM memory settings (defined in start-hugegraph-store.sh):
- Max heap: 32GB
- Min heap: 512MB
Check if Store is running and registered with PD:
```bash
# Check process
ps aux | grep hugegraph-store

# Test gRPC endpoint (requires grpcurl)
grpcurl -plaintext localhost:8500 list

# Check REST API health
curl http://localhost:8520/v1/health

# Check logs
tail -f logs/hugegraph-store.log

# Verify registration with PD (from PD node)
curl http://localhost:8620/v1/stores
```

For production deployment, see Deployment Guide and Best Practices.
HugeGraph Store serves as a pluggable backend for HugeGraph Server. To use Store as the backend:
Edit hugegraph-server/conf/graphs/<graph-name>.properties:
```properties
# Backend configuration
backend=hstore
serializer=binary

# Store connection (PD addresses)
store.provider=org.apache.hugegraph.backend.store.hstore.HstoreProvider
store.pd_peers=192.168.1.10:8686,192.168.1.11:8686,192.168.1.12:8686

# Connection pool settings
store.max_sessions=4
store.session_timeout=30000
```

Ensure PD and Store clusters are running, then start HugeGraph Server:
```bash
cd hugegraph-server
bin/init-store.sh       # Initialize schema
bin/start-hugegraph.sh

# Check backend via REST API
curl --location --request GET 'http://localhost:8080/metrics/backend' \
     --header 'Authorization: Bearer <YOUR_ACCESS_TOKEN>'

# Response should show:
# {"backend": "hstore", "nodes": [...]}
```

Run Store tests:
```bash
# All tests (from the hugegraph root)
mvn test -pl hugegraph-store/hg-store-test -am

# Specific test class
mvn test -pl hugegraph-store/hg-store-test -am -Dtest=HgStoreEngineTest

# From the hugegraph-store directory
cd hugegraph-store
mvn test
```

Store tests are organized into 6 profiles (all active by default):

- `store-client-test`: Client library tests
- `store-core-test`: Core storage and partition management tests
- `store-common-test`: Common utilities and query abstraction tests
- `store-rocksdb-test`: RocksDB abstraction layer tests
- `store-server-test`: Store node server and gRPC service tests
- `store-raftcore-test`: Raft consensus integration tests
For development workflows and debugging, see Development Guide.
From the project root:
```bash
# Build the image
docker build -f hugegraph-store/Dockerfile -t hugegraph-store:latest .

# Run a Store container
docker run -d \
  -p 8520:8520 \
  -p 8500:8500 \
  -p 8510:8510 \
  -v /path/to/conf:/hugegraph-store/conf \
  -v /path/to/storage:/hugegraph-store/storage \
  -e PD_ADDRESS=192.168.1.10:8686,192.168.1.11:8686 \
  --name hugegraph-store \
  hugegraph-store:latest
```

Exposed Ports:

- `8520`: REST API (management, metrics)
- `8500`: gRPC (client connections)
- `8510`: Raft consensus
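The same container can be expressed as a Compose service. This is a minimal sketch, assuming the locally built `hugegraph-store:latest` image, host paths `./conf` and `./storage`, and a reachable PD at `192.168.1.10:8686`; adjust all of these to your environment:

```yaml
# Hypothetical docker-compose sketch for a single Store node.
services:
  store:
    image: hugegraph-store:latest
    container_name: hugegraph-store
    ports:
      - "8520:8520"   # REST API
      - "8500:8500"   # gRPC
      - "8510:8510"   # Raft
    environment:
      - PD_ADDRESS=192.168.1.10:8686
    volumes:
      - ./conf:/hugegraph-store/conf
      - ./storage:/hugegraph-store/storage
```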
For a complete HugeGraph distributed deployment (PD + Store + Server), see:
hugegraph-server/hugegraph-dist/docker/example/
For Docker and Kubernetes deployment details, see Deployment Guide.
Comprehensive documentation for HugeGraph Store:
| Documentation | Description |
|---|---|
| Distributed Architecture | Deep dive into three-tier architecture, Raft consensus, partition management, and PD coordination |
| Deployment Guide | Production deployment topologies, configuration reference, Docker/Kubernetes setup |
| Integration Guide | Integrating Store with HugeGraph Server, client API usage, migrating from other backends |
| Query Engine | Query pushdown mechanisms, multi-partition queries, gRPC API reference |
| Operations Guide | Monitoring and metrics, troubleshooting common issues, backup and recovery, rolling upgrades |
| Best Practices | Hardware sizing, performance tuning, security configuration, high availability design |
| Development Guide | Development environment setup, module architecture, testing strategies, contribution workflow |
Minimum Cluster (development/testing):
- 3 PD nodes
- 3 Store nodes
- 1-3 Server nodes
Recommended Production Cluster:
- 3-5 PD nodes (odd number for Raft quorum)
- 6-12 Store nodes (depends on data size and throughput)
- 3-6 Server nodes (depends on query load)
Large-Scale Cluster:
- 5 PD nodes
- 12+ Store nodes (horizontal scaling)
- 6+ Server nodes (load balancing)
- Store uses Raft consensus for leader election and data replication
- Each partition has 1-3 replicas (default: 3 in production)
- Cluster can tolerate up to `⌊(N-1)/2⌋` Store node failures per partition (e.g., 1 failure in a 3-replica setup)
- Automatic failover and leader re-election (typically <10 seconds)
- PD provides cluster-wide coordination and metadata consistency
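The failure-tolerance rule above is simple integer arithmetic over the replica count; a small sketch:

```java
// Sketch of the Raft fault-tolerance rule described above: a partition
// with N replicas keeps a quorum while at most floor((N-1)/2) replicas
// are down.
public class RaftQuorum {
    /** Maximum simultaneous replica failures a partition survives. */
    static int toleratedFailures(int replicas) {
        return (replicas - 1) / 2; // integer division == floor for positive ints
    }

    public static void main(String[] args) {
        System.out.println(toleratedFailures(3)); // 3 replicas -> tolerates 1 failure
        System.out.println(toleratedFailures(5)); // 5 replicas -> tolerates 2 failures
    }
}
```

This is why production guidance favors 3 replicas: a single node can fail (or be taken down for maintenance) without losing availability for any partition.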
- Default Partitioning: Hash-based (configurable in PD)
- Partition Count: Recommended 3-5x the number of Store nodes for balanced distribution
- Replica Count: 3 replicas per partition for production (configurable)
- Rebalancing: Automatic partition rebalancing triggered by PD patrol (default: 30 minutes interval)
- Latency: <5ms between Store nodes for Raft consensus performance
- Bandwidth: 1Gbps+ recommended for data replication and query traffic
- Ports: Ensure firewall allows traffic on 8500 (gRPC), 8510 (Raft), 8520 (REST)
- Topology: Consider rack-aware or availability-zone-aware placement for fault isolation
Store exposes metrics via:
- REST API: `http://<store-host>:8520/actuator/metrics`
- Health Check: `http://<store-host>:8520/actuator/health`
- Prometheus Integration: metrics exported in Prometheus format
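Given the Prometheus-format export noted above, a scrape job for the Store nodes might look like the following sketch. The `/actuator/prometheus` path and the node addresses are assumptions (based on the Spring Boot actuator endpoints listed here and the example cluster earlier in this document); verify them against your deployment:

```yaml
# Hypothetical Prometheus scrape config for Store nodes.
scrape_configs:
  - job_name: hugegraph-store
    metrics_path: /actuator/prometheus
    static_configs:
      - targets:
          - 192.168.1.20:8520
          - 192.168.1.21:8520
          - 192.168.1.22:8520
```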
Key Metrics to Monitor:
- Raft leader election count and duration
- Partition count and distribution
- RocksDB read/write latency and throughput
- gRPC request QPS and error rate
- Disk usage and I/O metrics
For detailed operational guidance, see Operations Guide and Best Practices.
- Website: https://hugegraph.apache.org
- Documentation: https://hugegraph.apache.org/docs/
- GitHub: https://github.com/apache/hugegraph
- Mailing List: dev@hugegraph.apache.org
- Issue Tracker: https://github.com/apache/hugegraph/issues
Contributions are welcome! Please read our Development Guide and follow the Apache HugeGraph contribution guidelines.
For development workflows, code structure, and testing strategies, see the Development Guide.
HugeGraph Store is licensed under the Apache License 2.0.
HugeGraph Store is under active development. Please report issues via GitHub or the mailing list.