[CRE-1601] shard-orchestrator implementation #1747

agparadiso · 2026-01-02T16:10:10Z

ShardOrchestrator gRPC Service Implementation

Implements the ShardOrchestrator gRPC service that exposes workflow-to-shard routing decisions from Ring OCR consensus to other shards in the system.

Changes

Proto Definition Location
- Moved shard_orchestrator.proto from ring/pb/ → shardorchestrator/pb/
  Reason: Proto should live with its service implementation, not in the Ring OCR package

Service Implementation
- Store: Thread-safe in-memory store with version tracking for cache invalidation
- gRPC Server: Implements batch workflow mapping queries and shard registration reporting
- Test Helpers (test_adapter.go): Test utilities for simulating Ring OCR updates

Data Flow
Ring OCR consensus → Ring Transmitter → Dual update:
- Ring's internal store (for consensus logic)
- ShardOrchestrator store (for gRPC exposure to other shards)
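
The dual update above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `mappingStore`, `memStore`, and `transmit` are hypothetical names, and the sketch assumes that a Ring store failure aborts the transmission while a ShardOrchestrator store failure is only logged.

```go
package main

import (
	"errors"
	"fmt"
)

// mappingStore is a hypothetical stand-in for both stores; the real
// ring.Store and shardorchestrator.Store expose their own APIs.
type mappingStore interface {
	Update(mappings map[string]uint32) error
}

// memStore keeps workflow ID -> shard ID in memory.
type memStore struct {
	data map[string]uint32
	fail bool // when true, Update returns an error (simulates an outage)
}

func (s *memStore) Update(mappings map[string]uint32) error {
	if s.fail {
		return errors.New("store unavailable")
	}
	for id, shard := range mappings {
		s.data[id] = shard
	}
	return nil
}

// transmit applies a consensus outcome to the ring store first, then
// mirrors it to the orchestrator store on a best-effort basis.
func transmit(ring, orchestrator mappingStore, outcome map[string]uint32) error {
	if err := ring.Update(outcome); err != nil {
		return fmt.Errorf("ring store update: %w", err)
	}
	if err := orchestrator.Update(outcome); err != nil {
		// Logged, not returned: the mirror must not block the transmission.
		fmt.Println("failed to update orchestrator store:", err)
	}
	return nil
}

func main() {
	ring := &memStore{data: map[string]uint32{}}
	orch := &memStore{data: map[string]uint32{}, fail: true}

	err := transmit(ring, orch, map[string]uint32{"wf-1": 2})
	fmt.Println("transmit error:", err)
	fmt.Println("ring has wf-1 on shard", ring.data["wf-1"])
	fmt.Println("orchestrator entries:", len(orch.data))
}
```

In this sketch the orchestrator store can lag behind the ring store after a failed update, so consumers of the gRPC service would see the previous mapping until the next successful transmission.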

CRE-1601

github-actions · 2026-01-02T16:11:11Z

⚠️ API Diff Results - Breaking changes detected

📦 Module: `github-com-smartcontractkit-chainlink-common`

🔴 Breaking Changes (17)

`pkg/workflows/ring` (2)

NewFactory — Type changed:

func(
  *Store, 
  + *github.com/smartcontractkit/chainlink-common/pkg/workflows/shardorchestrator.Store, 
  github.com/smartcontractkit/chainlink-common/pkg/workflows/ring/pb.ArbiterScalerClient, 
  github.com/smartcontractkit/chainlink-common/pkg/logger.Logger, 
  *ConsensusConfig
)
(*Factory, error)

NewTransmitter — Type changed:

func(
  github.com/smartcontractkit/chainlink-common/pkg/logger.Logger, 
  *Store, 
  + *github.com/smartcontractkit/chainlink-common/pkg/workflows/shardorchestrator.Store, 
  github.com/smartcontractkit/chainlink-common/pkg/workflows/ring/pb.ArbiterScalerClient, 
  github.com/smartcontractkit/libocr/offchainreporting2plus/types.Account
)
*Transmitter

`pkg/workflows/ring/pb` (15)

File_shard_orchestrator_proto — 🗑️ Removed
GetWorkflowShardMappingRequest — 🗑️ Removed
GetWorkflowShardMappingResponse — 🗑️ Removed
NewShardOrchestratorServiceClient — 🗑️ Removed
RegisterShardOrchestratorServiceServer — 🗑️ Removed
ReportWorkflowTriggerRegistrationRequest — 🗑️ Removed
ReportWorkflowTriggerRegistrationResponse — 🗑️ Removed
ShardOrchestratorService_GetWorkflowShardMapping_FullMethodName — 🗑️ Removed
ShardOrchestratorService_ReportWorkflowTriggerRegistration_FullMethodName — 🗑️ Removed
ShardOrchestratorService_ServiceDesc — 🗑️ Removed
ShardOrchestratorServiceClient — 🗑️ Removed
ShardOrchestratorServiceServer — 🗑️ Removed
UnimplementedShardOrchestratorServiceServer — 🗑️ Removed
UnsafeShardOrchestratorServiceServer — 🗑️ Removed
WorkflowMappingState — 🗑️ Removed

pkg/workflows/ring/transmitter.go

Copilot

Pull request overview

This PR implements the ShardOrchestrator gRPC service that exposes workflow-to-shard routing decisions from Ring OCR consensus to other shards in the system. The implementation moves the proto definition from the Ring package to a dedicated shardorchestrator package and establishes a dual-update pattern where Ring's transmitter updates both its internal store and the ShardOrchestrator store for external consumption.

Key changes:

- Moved proto definitions and generated code to shardorchestrator/pb/ package for better separation of concerns
- Implemented thread-safe in-memory store with version tracking for cache invalidation
- Created gRPC server and client implementations for cross-shard communication
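
As a rough sketch of what a thread-safe store with version tracking might look like: the type and method names below (`versionedStore`, `BatchUpdate`, `Get`) are hypothetical and simplified, and the assumption that the version is bumped once per batch is mine, not something the PR states.

```go
package main

import (
	"fmt"
	"sync"
)

// versionedStore is a hypothetical, simplified stand-in for the PR's store.
type versionedStore struct {
	mu       sync.RWMutex
	mappings map[string]uint32 // workflow ID -> shard ID
	version  uint64            // incremented once per batch update (assumption)
}

func newVersionedStore() *versionedStore {
	return &versionedStore{mappings: make(map[string]uint32)}
}

// BatchUpdate applies all updates under a single write lock and returns
// the new version, so readers never observe a half-applied batch.
func (s *versionedStore) BatchUpdate(updates map[string]uint32) uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	for id, shard := range updates {
		s.mappings[id] = shard
	}
	s.version++
	return s.version
}

// Get returns the shard for a workflow together with the store version
// that the answer was read at.
func (s *versionedStore) Get(id string) (shard uint32, version uint64, ok bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	shard, ok = s.mappings[id]
	return shard, s.version, ok
}

func main() {
	store := newVersionedStore()
	cachedVersion := store.BatchUpdate(map[string]uint32{"wf-1": 0, "wf-2": 1})

	// A later consensus round moves wf-1 to another shard.
	store.BatchUpdate(map[string]uint32{"wf-1": 2})

	shard, version, _ := store.Get("wf-1")
	if version != cachedVersion {
		fmt.Printf("cache stale (v%d -> v%d), wf-1 now on shard %d\n", cachedVersion, version, shard)
	}
}
```

A client that remembers the last version it saw can compare it with the version returned by the next query and drop its cached mappings only when the two differ.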

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `pkg/workflows/shardorchestrator/store.go` | New in-memory store managing workflow-to-shard mappings with version tracking |
| `pkg/workflows/shardorchestrator/service.go` | gRPC server implementation exposing workflow mappings and accepting shard registrations |
| `pkg/workflows/shardorchestrator/client.go` | gRPC client for querying workflow mappings from shard 0 |
| `pkg/workflows/shardorchestrator/pb/*.proto` | Relocated proto definitions from ring package with timestamp fields removed |
| `pkg/workflows/ring/transmitter.go` | Updated to perform dual updates to both ring store and orchestrator store |
| `pkg/workflows/ring/factory.go` | Updated constructor to accept orchestrator store parameter |
| `pkg/workflows/shardorchestrator/*_test.go` | Comprehensive test coverage for store, service, client, and integration scenarios |

pkg/workflows/shardorchestrator/store.go

Copilot · 2026-01-07T14:37:56Z

pkg/workflows/ring/transmitter.go

+
+			existingMapping, err := t.shardOrchestratorStore.GetWorkflowMapping(ctx, workflowID)
+			if err != nil {
+				// New workflow - no previous assignment


The comment "New workflow - no previous assignment" is misleading because the error could represent any retrieval failure, not just missing workflows. Consider checking specifically for "not found" errors versus other error types, or update the comment to reflect that any error is treated as a new workflow.

Suggested change

-				// New workflow - no previous assignment
+				// Treat any retrieval error as no previous assignment (new workflow or store failure)

Don't agree: the GetWorkflowMapping implementation only fails if the workflowID is not present in the map, so the comment is correct.

Copilot · 2026-01-07T14:37:56Z

pkg/workflows/ring/transmitter.go

+
+		if err := t.shardOrchestratorStore.BatchUpdateWorkflowMappings(ctx, mappings); err != nil {
+			t.lggr.Errorw("failed to update ShardOrchestrator store", "err", err, "workflowCount", len(mappings))
+			// Don't fail the entire transmission if ShardOrchestrator update fails


The comment "Don't fail the entire transmission" explains the behavior but doesn't explain the rationale. Consider documenting why it's safe to continue despite this failure, such as "ShardOrchestrator updates are for external consumption only and don't affect Ring's internal consensus state."

Suggested change

-			// Don't fail the entire transmission if ShardOrchestrator update fails
+			// Don't fail the entire transmission if ShardOrchestrator update fails.
+			// The Ring store is the authoritative source of routing state and workflow mappings.
+			// ShardOrchestrator updates are best-effort, external-facing metadata that do not
+			// participate in Ring's internal consensus or routing decisions, so failing here
+			// would unnecessarily impact consensus progress without improving safety.

Don't agree, it's too verbose. The original comment explains why we don't return and makes clear that this is intentional.

mchain0

Please just fix the suggestions from Copilot.

george-dorin · 2026-01-07T14:52:14Z

pkg/workflows/ring/transmitter.go

+	// Determine if system is in transition state
+	systemInTransition := false
+	if outcome.State != nil {
+		if _, ok := outcome.State.State.(*pb.RoutingState_Transition); ok {


We should log a warning if the conversion fails

It's not a conversion; it's a type assertion that is basically asking: "is the state of type *pb.RoutingState_Transition? If so, I'll set systemInTransition = true."
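
To illustrate the point about the type assertion: in Go, the comma-ok form never panics and simply reports whether the dynamic type matches, so a false result is an expected outcome rather than a failure. The types below are hypothetical stand-ins that mimic the shape of a generated protobuf oneof wrapper; they are not the real `pb` package.

```go
package main

import "fmt"

// isRoutingState mimics the unexported interface that generated Go code
// uses for a oneof field. All names here are illustrative only.
type isRoutingState interface{ isRoutingState() }

type steady struct{}
type transition struct{ WantShards uint32 }

func (*steady) isRoutingState()     {}
func (*transition) isRoutingState() {}

type routingState struct{ State isRoutingState }

// inTransition reports whether the oneof currently holds the transition
// variant. A steady state, an unset oneof, or a nil message yields false.
func inTransition(s *routingState) bool {
	if s == nil {
		return false
	}
	_, ok := s.State.(*transition)
	return ok
}

func main() {
	fmt.Println(inTransition(&routingState{State: &steady{}}))
	fmt.Println(inTransition(&routingState{State: &transition{WantShards: 3}}))
	fmt.Println(inTransition(nil))
}
```

Because every non-transition variant legitimately takes the `ok == false` path, logging a warning there would fire on each steady-state outcome.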

bolekk · 2026-01-09T03:57:37Z

pkg/workflows/shardorchestrator/store.go

+	WorkflowID      string
+	OldShardID      uint32
+	NewShardID      uint32
+	TransitionState TransitionState


I'm late to the game here but why are we storing the state separately for each workflow? The whole system is either in steady or transitioning state.

agparadiso temporarily deployed to integration January 2, 2026 16:10 — with GitHub Actions Inactive

agparadiso had a problem deploying to integration January 2, 2026 16:10 — with GitHub Actions Failure

agparadiso temporarily deployed to integration January 2, 2026 16:10 — with GitHub Actions Inactive

agparadiso force-pushed the cre-1601-shard-orchestrator-implementation branch from 3d888b8 to aecf0cb Compare January 2, 2026 16:20

agparadiso had a problem deploying to integration January 2, 2026 16:20 — with GitHub Actions Error

agparadiso force-pushed the cre-1601-shard-orchestrator-implementation branch from aecf0cb to e193376 Compare January 2, 2026 16:21

agparadiso temporarily deployed to integration January 2, 2026 16:21 — with GitHub Actions Inactive

agparadiso temporarily deployed to integration January 2, 2026 17:09 — with GitHub Actions Inactive

agparadiso had a problem deploying to integration January 2, 2026 17:09 — with GitHub Actions Failure

agparadiso temporarily deployed to integration January 2, 2026 17:09 — with GitHub Actions Inactive

Base automatically changed from cre-1601-shard-orchestrator-plugin to main January 5, 2026 08:51

agparadiso force-pushed the cre-1601-shard-orchestrator-implementation branch from bcdcbcb to 4526ebc Compare January 5, 2026 09:51

agparadiso temporarily deployed to integration January 5, 2026 09:51 — with GitHub Actions Inactive

github-code-quality bot found potential problems Jan 5, 2026

agparadiso force-pushed the cre-1601-shard-orchestrator-implementation branch from 4526ebc to 273098b Compare January 5, 2026 10:19

agparadiso had a problem deploying to integration January 5, 2026 10:19 — with GitHub Actions Error

agparadiso force-pushed the cre-1601-shard-orchestrator-implementation branch from 273098b to f78484e Compare January 5, 2026 10:21

agparadiso temporarily deployed to integration January 5, 2026 10:21 — with GitHub Actions Inactive

agparadiso changed the title ~~wip shard-orchestrator~~ [CRE-1601] shard-orchestrator implementation Jan 5, 2026

agparadiso temporarily deployed to integration January 7, 2026 14:19 — with GitHub Actions Inactive

agparadiso marked this pull request as ready for review January 7, 2026 14:37

agparadiso requested a review from a team as a code owner January 7, 2026 14:37

Copilot AI review requested due to automatic review settings January 7, 2026 14:37

Copilot AI reviewed Jan 7, 2026

agparadiso requested a review from mchain0 January 7, 2026 14:37

mchain0 previously approved these changes Jan 7, 2026

george-dorin reviewed Jan 7, 2026

george-dorin previously approved these changes Jan 7, 2026

fix: remove reduntant check

5f6b67b

agparadiso dismissed stale reviews from george-dorin and mchain0 via 5f6b67b January 7, 2026 16:18

agparadiso temporarily deployed to integration January 7, 2026 16:18 — with GitHub Actions Inactive

george-dorin approved these changes Jan 7, 2026

agparadiso mentioned this pull request Jan 7, 2026

[CRE-1610] wire up shard_orchestrator client smartcontractkit/chainlink#20736

Draft

mchain0 approved these changes Jan 8, 2026

Merge branch 'main' into cre-1601-shard-orchestrator-implementation

bea8284

mchain0 enabled auto-merge January 8, 2026 09:59

mchain0 had a problem deploying to integration January 8, 2026 09:59 — with GitHub Actions Failure

mchain0 temporarily deployed to integration January 8, 2026 09:59 — with GitHub Actions Inactive

mchain0 added this pull request to the merge queue Jan 8, 2026

Merged via the queue into main with commit 7044bd4 Jan 8, 2026
34 of 36 checks passed

mchain0 deleted the cre-1601-shard-orchestrator-implementation branch January 8, 2026 10:16

bolekk reviewed Jan 9, 2026

4 participants
