
Conversation


@jordanstephens jordanstephens commented Jan 6, 2026

Description

This PR aims to improve the signal-to-noise ratio during deployment by refurbishing the CLI deployment "UI". Before this PR, the deployment "interface" was a live tail stream of logs from Pulumi, build runners, and services. This PR changes the default behaviour during defang up: instead of printing logs, we print a list of compose services and live-update their service statuses. Logs are printed only when there is a failure—and we only print the logs associated with the failure. The existing interface can still be invoked with defang compose up --tail.

Linked Issues

Checklist

  • I have performed a self-review of my code
  • I have added appropriate tests
  • I have updated the Defang CLI docs and/or README to reflect my changes, if necessary

Summary by CodeRabbit

  • New Features

    • Interactive in-terminal UI for real-time deployment progress with spinners, coloring, and stable ordering.
    • Streaming monitoring API with callback-driven progress updates.
  • Bug Fixes / Improvements

    • Better coordination between live log tailing and state monitoring; clearer success signaling.
    • Hidden/removed obsolete tail flag from the public CLI.
  • User-visible Changes

    • Removed STATUS column from service listing outputs.
  • Chores

    • Updated vendored dependency hashes and refreshed CLI dependencies.

✏️ Tip: You can customize this high-level summary in your review settings.


coderabbitai bot commented Jan 6, 2026

📝 Walkthrough

Walkthrough

Adds an interactive Bubble Tea terminal UI for compose deployments, refactors service-watching to a callback-driven WatchServiceState, introduces Monitor(...) and UI integration, updates dependencies and a Nix vendor hash, and adds ErrDeploymentSucceeded. No other public API changes.

Changes

  • UI-driven monitoring & compose flow (src/cmd/cli/command/compose.go): Adds Bubble Tea UI types, rendering (spinners, colors), and monitorWithUI, and switches compose up to use UI monitoring when not tailing; preserves the tailing path for verbose/non-interactive runs.
  • Service watch & monitor refactor (src/pkg/cli/subscribe.go, src/pkg/cli/tailAndMonitor.go): Introduces the callback-driven WatchServiceState, makes WaitServiceState a wrapper around it, adds an exported Monitor(...) built on WatchServiceState, and refactors TailAndMonitor to run monitoring in a goroutine and aggregate monitor/tail errors.
  • Client errors (src/pkg/cli/client/errors.go): Adds exported ErrDeploymentSucceeded = errors.New("deployment succeeded") for callback completion signaling.
  • Output formatting (src/pkg/cli/deploymentinfo.go, src/pkg/cli/deploymentinfo_test.go, src/pkg/cli/getServices_test.go): Removes the STATUS column from printed service tables and updates the tests' expected outputs accordingly.
  • Tests & minor assertions (src/pkg/cli/subscribe_test.go): Minor assertion-message change in TestWaitServiceState.
  • Module deps (src/go.mod): Adds Charmbracelet libraries (bubbletea, bubbles, lipgloss) and related indirect/upgraded dependencies.
  • Nix packaging (pkgs/defang/cli.nix): Updates vendorHash in the buildGo124Module derivation.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as CLI (compose.go)
    participant Monitor as Monitor()
    participant UI as Bubble Tea UI
    participant Provider as Provider (subscribe.go)

    User->>CLI: defang compose up (no --tail)
    activate CLI

    CLI->>Monitor: Monitor(ctx, project, provider, waitTimeout, deploymentID, cb)
    activate Monitor

    Monitor->>Provider: WatchServiceState(ctx, provider, ..., cb)
    activate Provider

    Monitor->>UI: start monitorWithUI (goroutine)
    activate UI

    Provider-->>Monitor: stream SubscribeResponse messages
    Monitor-->>UI: send serviceUpdate messages

    UI->>UI: render services (spinners, colors)
    User->>UI: view live updates

    alt deployment succeeds (callback returns ErrDeploymentSucceeded)
        Provider-->>Monitor: ErrDeploymentSucceeded
        Monitor-->>UI: exit signal
    else deployment fails or timeout
        Provider-->>Monitor: terminal error
        Monitor-->>UI: show error, exit
    end

    deactivate UI
    Monitor-->>CLI: final ServiceStates / error
    deactivate Monitor
    deactivate Provider
    CLI-->>User: print final status or tails if needed
    deactivate CLI

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

  • Stack tools #1713: Overlaps on Monitor/WatchServiceState and TailAndMonitor refactoring; likely touches the same monitoring surface.
  • support logs --deployment=latest #1726: Modifies compose up flow and deployment selection in compose.go, touching similar UI/monitoring code paths.
  • Rename packages #1730: Related changes to CLI compose code and monitoring integrations (imports/types and UI-driven monitoring).

Suggested reviewers

  • lionello
  • raphaeltm
  • edwardrf

Poem

🐰
I hopped into the terminal bright,
Spinners twirled through deploy-night light,
Logs stay tucked until things weep,
Services dance while watchers keep —
Hop, deploy — then home to sleep.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)
  • Description Check — ✅ Passed: Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: Title accurately summarizes the core change: improving signal-to-noise ratio by replacing verbose logs with concise service state updates during deployment.
  • Linked Issues Check — ✅ Passed: All code changes implement the requirements from issue #1749: interactive UI with service states, conditional log display on failures, verbose flag support, and preserved tailing behavior.
  • Out of Scope Changes Check — ✅ Passed: Changes remain focused on deployment UI improvements and supporting infrastructure; the vendorHash update and minor test adjustments are directly related to the core objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

level=warning msg="[linters_context] running gomodguard failed: unable to read module file go.mod: current working directory must have a go.mod file: if you are not using go modules it is suggested to disable this linter"
level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"




Comment @coderabbitai help to get the list of available commands and usage tips.

@jordanstephens jordanstephens force-pushed the jordan/deployment-status-updates branch from d17d7ad to 2fa03a8 Compare January 6, 2026 21:04
@jordanstephens jordanstephens marked this pull request as ready for review January 7, 2026 01:15

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @src/pkg/cli/subscribe.go:
- Around line 71-74: The goroutine that reads from serverStream can exit without
sending anything to errChan when serverStream.Receive() returns false and
serverStream.Err() is nil, which causes the main select loop to block; modify
the reader goroutine (the one calling serverStream.Receive() and
serverStream.Err()) to always signal completion by sending a sentinel error
(e.g., io.EOF or a package-level errStreamClosed) to errChan or by closing
errChan before returning, and update the main select loop that waits on errChan
to treat that sentinel/closed channel as stream completion and exit or continue
appropriately (handle errStreamClosed/io.EOF in the select case that currently
expects errors from errChan).
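The fix described above can be sketched with a minimal stream harness. Here fakeStream, errStreamClosed, and drain are illustrative stand-ins for the PR's serverStream, the suggested sentinel, and the consuming select loop — not the actual Defang code:

```go
package main

import (
	"errors"
	"fmt"
)

// errStreamClosed is a hypothetical sentinel (name taken from the review
// suggestion) signaling that the stream ended without an error.
var errStreamClosed = errors.New("stream closed")

// fakeStream stands in for serverStream: Receive returns false at EOF
// and Err reports nil for a clean close.
type fakeStream struct{ msgs []string }

func (s *fakeStream) Receive() bool { return len(s.msgs) > 0 }
func (s *fakeStream) Next() string  { m := s.msgs[0]; s.msgs = s.msgs[1:]; return m }
func (s *fakeStream) Err() error    { return nil }

// drain reads the stream in a goroutine and always signals completion on
// errChan, so the consuming select can never block forever.
func drain(s *fakeStream) ([]string, error) {
	msgChan := make(chan string)
	errChan := make(chan error, 1)
	go func() {
		for s.Receive() {
			msgChan <- s.Next()
		}
		if err := s.Err(); err != nil {
			errChan <- err
			return
		}
		errChan <- errStreamClosed // sentinel: clean close, not a failure
	}()

	var got []string
	for {
		select {
		case m := <-msgChan:
			got = append(got, m)
		case err := <-errChan:
			if errors.Is(err, errStreamClosed) {
				return got, nil // treat sentinel as normal completion
			}
			return got, err
		}
	}
}

func main() {
	msgs, err := drain(&fakeStream{msgs: []string{"a", "b"}})
	fmt.Println(msgs, err)
}
```

The key property is that the reader goroutine sends on errChan on every exit path, so the consumer always observes either an error or the clean-close sentinel.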
🧹 Nitpick comments (4)
src/cmd/cli/command/compose.go (2)

174-179: Silently swallowing UI errors may hide issues.

The error from p.Run() is discarded. While UI errors may not be critical, at minimum consider logging them for debugging purposes.

🔎 Proposed fix
 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debug("UI error:", err)
 		}
 	}()

203-215: Duplicate function: splitManagedAndUnmanagedServices exists in tailAndMonitor.go.

This function is duplicated from src/pkg/cli/tailAndMonitor.go (lines 134-145). Consider importing it from the cli package instead of duplicating it here.

🔎 Proposed fix
-func splitManagedAndUnmanagedServices(serviceInfos compose.Services) ([]string, []string) {
-	var managedServices []string
-	var unmanagedServices []string
-	for _, service := range serviceInfos {
-		if cli.CanMonitorService(&service) {
-			unmanagedServices = append(unmanagedServices, service.Name)
-		} else {
-			managedServices = append(managedServices, service.Name)
-		}
-	}
-
-	return managedServices, unmanagedServices
-}

Then use the function from the cli package directly where needed, or export splitManagedAndUnmanagedServices from the cli package if it isn't already.

src/pkg/cli/tailAndMonitor.go (2)

19-21: Panic for invalid input may be too aggressive for a library function.

Panicking on empty deploymentID will crash the program. If this is called with user-provided or dynamically-determined input, consider returning an error instead. If this is strictly an internal function where the caller guarantees validity, the panic is acceptable as a programming error detector.

🔎 Proposed fix to return an error instead
 func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}

53-56: Clarify the purpose of the 2-second sleep before returning.

The SleepWithContext call at line 54 delays the return by 2 seconds. There's also another 2-second sleep at line 79 in TailAndMonitor. If both are executed in the TailAndMonitor path, that's 4 seconds of delay. Consider adding a comment explaining why this delay is needed, or consolidate if both serve the same purpose.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 347a98f and 40fd11a.

⛔ Files ignored due to path filters (1)
  • src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • pkgs/defang/cli.nix
  • src/cmd/cli/command/compose.go
  • src/go.mod
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/tailAndMonitor.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/tailAndMonitor.go
  • src/pkg/cli/subscribe.go
  • src/cmd/cli/command/compose.go
🧬 Code graph analysis (2)
src/pkg/cli/tailAndMonitor.go (6)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (10)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • SubscribeResponse (4202-4211)
  • SubscribeResponse (4224-4224)
  • SubscribeResponse (4239-4241)
  • Deployment (2822-2838)
  • Deployment (2851-2851)
  • Deployment (2866-2868)
src/pkg/cli/subscribe.go (2)
  • ServiceStates (15-15)
  • WatchServiceState (17-114)
src/pkg/cli/waitForCdTaskExit.go (1)
  • WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
  • SleepWithContext (116-125)
src/pkg/cli/tail.go (1)
  • TailOptions (45-58)
src/pkg/cli/subscribe.go (3)
src/protos/io/defang/v1/fabric.pb.go (15)
  • SubscribeResponse (4202-4211)
  • SubscribeResponse (4224-4224)
  • SubscribeResponse (4239-4241)
  • Status (3248-3253)
  • Status (3266-3266)
  • Status (3281-3283)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • ServiceState_BUILD_FAILED (151-151)
  • ServiceState_DEPLOYMENT_FAILED (150-150)
  • Service (3835-3840)
  • Service (3853-3853)
  • Service (3868-3870)
src/pkg/types/etag.go (1)
  • ETag (9-9)
src/pkg/cli/client/errors.go (2)
  • ErrDeploymentSucceeded (8-8)
  • ErrDeploymentFailed (10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: nix-shell-test
  • GitHub Check: go-playground-test
  • GitHub Check: Analyze (go)
🔇 Additional comments (9)
src/pkg/cli/client/errors.go (1)

3-8: LGTM! Sentinel error for deployment success signaling.

Using ErrDeploymentSucceeded as a sentinel to break out of the streaming loop is a clean pattern. The callback in WatchServiceState correctly converts this to a nil return, so it won't propagate as an actual error to callers.
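The sentinel pattern described here can be sketched in isolation. watchUntilDone is an illustrative stand-in for WatchServiceState; only the ErrDeploymentSucceeded name and the convert-to-nil behavior come from the PR:

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDeploymentSucceeded mirrors the sentinel added in
// src/pkg/cli/client/errors.go; the surrounding harness is illustrative.
var ErrDeploymentSucceeded = errors.New("deployment succeeded")

// watchUntilDone invokes the callback per state and stops when it returns
// the sentinel, converting it to nil so callers see plain success.
func watchUntilDone(states []string, cb func(string) error) error {
	for _, s := range states {
		if err := cb(s); err != nil {
			if errors.Is(err, ErrDeploymentSucceeded) {
				return nil // sentinel means "done", not a failure
			}
			return err
		}
	}
	return errors.New("stream ended before a terminal state")
}

func main() {
	err := watchUntilDone(
		[]string{"BUILD_RUNNING", "DEPLOYMENT_COMPLETED"},
		func(s string) error {
			if s == "DEPLOYMENT_COMPLETED" {
				return ErrDeploymentSucceeded
			}
			return nil
		},
	)
	fmt.Println(err) // <nil>: the sentinel was converted to success
}
```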

pkgs/defang/cli.nix (1)

10-10: LGTM! Vendor hash updated for new dependencies.

The hash change is expected given the new Bubble Tea ecosystem dependencies added to go.mod.

src/pkg/cli/subscribe_test.go (1)

225-227: LGTM! Corrected test assertion message.

The previous message "Unexpected error" was misleading since this branch handles the case where err == nil but an error was expected. "Expected error but got nil" accurately describes the failure condition.

src/cmd/cli/command/compose.go (2)

374-374: Redundant-looking error check after monitorWithUI.

At first glance the err != nil check at line 374 seems disconnected from the monitorWithUI path, but line 372 does assign via serviceStates, err = monitorWithUI(...), and the tail branch returns early at line 368 (return err), so line 374 only ever applies to the monitorWithUI path. The logic is correct, but the early return makes the flow hard to follow; consider restructuring for readability.


362-373: Verify behavior when monitorWithUI returns both states and error.

When monitorWithUI returns an error, the code continues to lines 374+ which handles deployment errors. However, if serviceStates is partially populated on error, the subsequent logic at lines 379-388 (checking for unbuilt services) depends on those states being correct. Ensure this is the intended behavior.

src/pkg/cli/tailAndMonitor.go (1)

65-66: Using context.Background() for tailCtx disconnects it from the parent context.

The tail context is derived from context.Background() rather than ctx, meaning parent cancellation (e.g., user pressing Ctrl+C) won't directly cancel tailing. Cancellation only happens via cancelTail(errMonitoringDone) when monitoring completes. This appears intentional to ensure logs continue until monitoring finishes, but verify this matches the expected UX when a user cancels mid-deployment.

src/pkg/cli/subscribe.go (2)

82-86: Good pattern: non-blocking send with context check.

The select with ctx.Done() prevents the goroutine from blocking forever if the main loop has exited due to context cancellation. This is the correct approach for channel sends in goroutines.


116-139: LGTM! Clean wrapper preserving the original API.

WaitServiceState provides a clean wrapper around WatchServiceState with a callback that implements the original target-state-based completion logic. The use of ErrDeploymentSucceeded as a sentinel to signal success is handled correctly by WatchServiceState (converted to nil return at line 108).

src/go.mod (1)

37-39: LGTM! Bubble Tea ecosystem dependencies for TUI.

The Charm Bracelet libraries (bubbles v0.21.0, bubbletea v1.3.10, lipgloss v1.1.0) are well-established for building terminal UIs in Go. These versions are current stable releases: lipgloss v1.1.0 is the latest available, while bubbles and bubbletea have stable v0.21.0 and v1.3.10 releases respectively (newer pre-release versions exist but are not yet stable).

jordanstephens and others added 12 commits January 7, 2026 16:22
The WatchServiceState function was blocking on serverStream.Receive()
even when the context was cancelled by cancelSvcStatus. This change
moves the stream receiving to a separate goroutine and uses channels
with a select statement to properly handle context cancellation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
The test was flaky in CI due to a race condition in the select statement.
When a stream ended after sending the final message (e.g., BUILD_FAILED),
both msgChan and errChan could have data ready simultaneously:
- msgChan had the failure state message
- errChan had nil (from stream ending without error)

The select statement could non-deterministically choose either channel,
causing the test to sometimes return nil error instead of ErrDeploymentFailed.

Fixed by:
1. Not sending nil errors to errChan when stream ends normally
2. Fixing misleading error message in test (cosmetic)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
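The non-determinism this commit message describes is standard Go select behavior: when multiple cases are ready, one is chosen pseudo-randomly. A small illustrative harness (not Defang code) makes the race visible by tallying which case wins when both channels hold data:

```go
package main

import "fmt"

// pickCounts runs n iterations in which both msgChan and errChan are
// ready, and counts which select case fires — demonstrating that the
// choice is pseudo-random rather than ordered.
func pickCounts(n int) map[string]int {
	counts := map[string]int{}
	for i := 0; i < n; i++ {
		msgChan := make(chan string, 1)
		errChan := make(chan error, 1)
		msgChan <- "BUILD_FAILED" // final state message
		errChan <- nil            // stream ended without error
		select {
		case <-msgChan:
			counts["msg"]++
		case <-errChan:
			counts["err"]++
		}
	}
	return counts
}

func main() {
	c := pickCounts(10000)
	fmt.Println(c["msg"] > 0, c["err"] > 0) // both cases get chosen
}
```

This is why the fix of not sending nil errors on a normal stream end removes the flake: with only one channel ever ready, the select becomes deterministic.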
@jordanstephens jordanstephens force-pushed the jordan/deployment-status-updates branch from 40fd11a to e492b30 Compare January 8, 2026 00:23

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/pkg/cli/subscribe.go (1)

39-42: Critical: Data race on serverStream variable between goroutines.

The cleanup goroutine (lines 39-42) reads serverStream and calls serverStream.Close(), while the receiver goroutine (line 64) reassigns serverStream during reconnection. This creates a data race that could result in:

  • Closing the wrong stream (old vs. new)
  • Double-close panics
  • Accessing a closed stream

The receiver goroutine should manage its own stream lifecycle, or use synchronization (e.g., a mutex) to protect concurrent access.

🔒 Proposed fix using atomic value or local variable

Option 1: Keep stream local to receiver goroutine

-	serverStream, err := provider.Subscribe(ctx, &subscribeRequest)
-	if err != nil {
-		return nil, err
-	}
-
 	ctx, cancel := context.WithCancel(ctx)
 	defer cancel() // to ensure we close the stream and clean-up this context
 
-	go func() {
-		<-ctx.Done()
-		serverStream.Close()
-	}()
-
 	// ... 
 	
 	// Run stream receiving in a separate goroutine
 	go func() {
+		serverStream, err := provider.Subscribe(ctx, &subscribeRequest)
+		if err != nil {
+			errChan <- err
+			return
+		}
+		defer serverStream.Close()
+
 		for {
+			// Check context before blocking on Receive
+			select {
+			case <-ctx.Done():
+				return
+			default:
+			}
+
 			if !serverStream.Receive() {
 				// ... reconnection logic
 			}
 			// ...
 		}
 	}()

This eliminates the race by making the receiver goroutine own the stream lifecycle entirely.

Option 2: Use sync.Mutex to protect access

Add a mutex to protect concurrent access to serverStream if it must be shared.

Also applies to: 64-68

🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 42-46: The services map on deploymentModel is accessed
concurrently; add a sync.RWMutex field to deploymentModel and use it to guard
all accesses to services (use Lock/Unlock when adding/removing/modifying entries
in Update and the monitoring goroutine that calls p.Send, and RLock/RUnlock when
reading/iterating the map in View). Ensure you acquire the mutex before any map
read/write, and when iterating in View prefer to either hold RLock for the
duration of the read or copy the needed keys/values under lock into a local
slice then release the lock before doing heavier UI work.
- Around line 203-215: The function splitManagedAndUnmanagedServices is
duplicated; extract a single implementation into the cli package and have both
call sites use it: move the logic into the existing cli package as an exported
function (e.g., SplitManagedAndUnmanagedServices or keep the same name but
exported if needed), preserving the signature (accepting compose.Services and
returning ([]string, []string)), update the places that currently define their
own copy (the command package and the tail/monitor code) to call
cli.SplitManagedAndUnmanagedServices, and remove the duplicate implementations;
ensure imports and visibility are adjusted so compose.Services is referenced
correctly from the new shared function.

In @src/pkg/cli/subscribe.go:
- Around line 71-74: The receiver goroutine can exit without signaling when
serverStream.Receive() returns false and serverStream.Err() is nil; modify the
goroutine that calls serverStream.Receive() to push a sentinel value (e.g., a
predefined error variable like errStreamClosed) onto errChan before returning
whenever serverStream.Err() == nil, and ensure the main select/loop that reads
from errChan (the loop handling responses and errors) treats errStreamClosed as
a clean/finished condition (stop waiting and exit gracefully) instead of
blocking; reference the receiver that calls serverStream.Receive(), the check
serverStream.Err(), and the errChan/errStreamClosed symbols so the change is
applied in the right places.

In @src/pkg/cli/tailAndMonitor.go:
- Around line 19-21: Replace the panic on invalid deploymentID with a proper
error return: change the check that currently does if deploymentID == "" {
panic(...) } to return an error (e.g., return fmt.Errorf("invalid deploymentID:
empty string")) from the enclosing function, update the function signature to
return (error) if needed, and propagate that error to callers so they can handle
it instead of crashing the process; reference the deploymentID validation block
in tailAndMonitor.go and update any callers to handle the returned error.
- Around line 61-63: The TailAndMonitor function currently calls
panic("tailOptions.Deployment must be a valid deployment ID") when
tailOptions.Deployment is empty; change this to return a proper error instead:
validate tailOptions.Deployment at the start of TailAndMonitor, and if empty
return a descriptive error (e.g., fmt.Errorf or errors.New) rather than
panicking so callers can handle it; update the function signature if necessary
to return an error and propagate that error from callers of TailAndMonitor
(search for TailAndMonitor and tailOptions.Deployment to locate sites to
update).
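The RWMutex guard suggested in the first compose.go fix above can be sketched as follows. The type and field names echo the PR's deploymentModel, but the harness is illustrative, not the actual Bubble Tea model:

```go
package main

import (
	"fmt"
	"sync"
)

// deploymentModel sketches the suggestion: a sync.RWMutex guarding the
// services map that writers (Update / the monitoring goroutine) and
// readers (View) would otherwise race on.
type deploymentModel struct {
	mu       sync.RWMutex
	services map[string]string
}

func (m *deploymentModel) setStatus(name, status string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.services[name] = status
}

// snapshot copies the map under RLock so heavier rendering work can run
// without holding the lock, as the fix instruction recommends.
func (m *deploymentModel) snapshot() map[string]string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make(map[string]string, len(m.services))
	for k, v := range m.services {
		out[k] = v
	}
	return out
}

func main() {
	m := &deploymentModel{services: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			m.setStatus(fmt.Sprintf("svc-%d", i), "DEPLOYMENT_COMPLETED")
		}(i)
	}
	wg.Wait()
	fmt.Println(len(m.snapshot())) // 8
}
```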
🧹 Nitpick comments (5)
src/pkg/cli/client/errors.go (1)

8-8: Document the unconventional use of an error for success.

Using an error to signal successful deployment completion is unconventional and may confuse maintainers. Consider adding a comment explaining that this sentinel error is used to signal completion in callback-based monitoring flows, distinguishing it from actual failure errors.

📝 Suggested documentation
+// ErrDeploymentSucceeded is a sentinel error returned by watch callbacks
+// to signal successful deployment completion (not an actual error condition).
 var ErrDeploymentSucceeded = errors.New("deployment succeeded")
src/cmd/cli/command/compose.go (3)

80-80: Close the update channel to prevent goroutine leaks.

The updateCh is never closed, which could lead to goroutines waiting on this channel to never terminate properly. While the Bubble Tea program's Quit will stop the UI, it's good practice to close channels when done sending.

Consider closing the channel in the monitoring goroutine after all updates are sent:

go func() {
	defer wg.Done()
	serviceStates, monitorErr = cli.Monitor(ctx, project, provider, waitTimeout, deploymentID, func(msg *defangv1.SubscribeResponse, states *cli.ServiceStates) error {
		// Send service status updates to the bubbletea model
		for name, state := range *states {
			p.Send(serviceUpdate{
				name:   name,
				status: state.String(),
			})
		}
		return nil
	})
	close(model.updateCh) // Close channel when done
	// Quit the UI when monitoring is done
	p.Quit()
}()

Note: Verify that no other goroutines are sending to this channel before closing it.


174-179: Handle UI errors instead of silently ignoring them.

The error returned by p.Run() is currently ignored. While UI errors might be rare, they should at least be logged for debugging purposes.

 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debug("Bubble Tea UI error:", err)
 		}
 	}()

474-474: Minor: Grammar improvement.

The message changed from "was previously deployed" to "has already deployed". While both are understandable, the grammar could be improved.

✏️ Suggested grammar fix
-	term.Info("This project has already deployed to the following locations:")
+	term.Info("This project has already been deployed to the following locations:")
src/pkg/cli/subscribe.go (1)

54-88: Consider explicit context check in receiver loop for clarity.

The receiver goroutine doesn't explicitly check ctx.Done() before calling the blocking serverStream.Receive(). While the cleanup goroutine (lines 39-42) closes the stream on context cancellation, adding an explicit check would make the cancellation path more evident:

for {
    select {
    case <-ctx.Done():
        return
    default:
    }
    
    if !serverStream.Receive() {
        // ...
    }
    // ...
}

This is especially helpful given that the stream reassignment race (identified separately) means the cleanup goroutine might not reliably close the current stream.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40fd11a and e492b30.

⛔ Files ignored due to path filters (1)
  • src/go.sum is excluded by !**/*.sum
📒 Files selected for processing (7)
  • pkgs/defang/cli.nix
  • src/cmd/cli/command/compose.go
  • src/go.mod
  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/tailAndMonitor.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkgs/defang/cli.nix
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-07T03:07:48.228Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:48.228Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.

Applied to files:

  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe_test.go
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/client/errors.go
  • src/pkg/cli/subscribe_test.go
  • src/pkg/cli/subscribe.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/tailAndMonitor.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/subscribe_test.go
🧬 Code graph analysis (2)
src/pkg/cli/subscribe.go (4)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (19)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • SubscribeResponse (4286-4295)
  • SubscribeResponse (4308-4308)
  • SubscribeResponse (4323-4325)
  • Status (3332-3337)
  • Status (3350-3350)
  • Status (3365-3367)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • ServiceState_BUILD_FAILED (151-151)
  • ServiceState_DEPLOYMENT_FAILED (150-150)
  • Service (3919-3924)
  • Service (3937-3937)
  • Service (3952-3954)
src/pkg/types/etag.go (1)
  • ETag (9-9)
src/pkg/cli/client/errors.go (2)
  • ErrDeploymentSucceeded (8-8)
  • ErrDeploymentFailed (10-13)
src/pkg/cli/tailAndMonitor.go (5)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/pkg/cli/subscribe.go (2)
  • ServiceStates (15-15)
  • WatchServiceState (17-114)
src/pkg/cli/waitForCdTaskExit.go (1)
  • WaitForCdTaskExit (14-41)
src/pkg/utils.go (1)
  • SleepWithContext (116-125)
src/pkg/cli/tail.go (1)
  • TailOptions (45-58)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Analyze (go)
  • GitHub Check: go-test
🔇 Additional comments (7)
src/pkg/cli/subscribe_test.go (1)

226-226: LGTM! Clearer error message.

The updated message accurately describes the failure condition when the test expects an error but receives nil, improving test readability.

src/cmd/cli/command/compose.go (2)

80-80: Consider whether channel buffer size is sufficient.

The updateCh has a buffer of 100 messages. For deployments with many services or rapid state transitions, this buffer could fill up and cause the monitoring goroutine to block when sending updates via p.Send.

Can you estimate the maximum number of state updates expected during a typical deployment? Each service goes through multiple states (BUILD_QUEUED, BUILD_PROVISIONING, BUILD_RUNNING, etc.), so for N services, you might see 5-10 updates per service. With 100 services, that could exceed the buffer.

Consider either:

  1. Increasing the buffer size significantly (e.g., 1000)
  2. Using an unbuffered channel (relying on Bubble Tea's internal buffering)
  3. Making the channel buffering configurable based on the number of services
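Whichever buffer size is chosen, the blocking risk can also be sidestepped with a non-blocking send, at the cost of dropping updates under pressure. A minimal Go sketch of that pattern — `serviceUpdate` and `trySend` are illustrative stand-ins, not the PR's actual code:

```go
package main

import "fmt"

// serviceUpdate is a stand-in for the PR's update message type.
type serviceUpdate struct {
	name   string
	status string
}

// trySend attempts a non-blocking send: if the channel buffer is full,
// the update is dropped rather than blocking the monitoring goroutine.
func trySend(ch chan serviceUpdate, u serviceUpdate) bool {
	select {
	case ch <- u:
		return true
	default:
		return false
	}
}

func main() {
	ch := make(chan serviceUpdate, 2)
	fmt.Println(trySend(ch, serviceUpdate{"web", "BUILD_RUNNING"}))   // true
	fmt.Println(trySend(ch, serviceUpdate{"db", "BUILD_QUEUED"}))     // true
	fmt.Println(trySend(ch, serviceUpdate{"web", "DEPLOY_QUEUED"}))   // false: buffer full
}
```

Dropping intermediate status updates is usually acceptable for a progress UI, since only the latest state per service matters.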

364-373: LGTM! Clean separation of tail vs. UI monitoring paths.

The conditional logic appropriately preserves the tail behavior when --tail is specified while introducing the new UI-based monitoring as the default. The error handling for both paths is consistent.

src/pkg/cli/tailAndMonitor.go (2)

18-57: LGTM! Well-structured monitoring function.

The Monitor function properly:

  • Validates inputs (though should return error instead of panic)
  • Applies timeout when specified
  • Runs monitoring and CD task checking concurrently
  • Aggregates errors from both sources
  • Provides callback mechanism for state updates

The use of sync.WaitGroup and proper context cancellation ensures clean goroutine lifecycle management.


59-118: Verify the 2-second sleep is necessary.

Lines 54 and 79 include a 2-second sleep before cancellation. While the comment at line 79 explains it's to "make sure we get last status messages," this seems like a brittle approach that relies on timing rather than proper synchronization.

Consider whether there's a more reliable way to ensure all status messages are received before cancellation, such as:

  1. Waiting for a completion signal from the subscription
  2. Using a done channel to signal message processing is complete
  3. Flushing any buffered messages

The hardcoded 2-second delay may be insufficient under heavy load or excessive when messages arrive quickly.
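The done-channel alternative is easy to sketch. This is illustrative only — `watch` and its message slice stand in for the real subscription stream, and the 5-second `time.After` is a fallback timeout rather than a fixed sleep:

```go
package main

import (
	"fmt"
	"time"
)

// watch simulates a subscription loop: it forwards every status message
// and closes done when the stream ends, so the caller can wait for the
// last message instead of sleeping for a hardcoded 2 seconds.
func watch(messages []string, out chan<- string, done chan<- struct{}) {
	defer close(done)
	for _, m := range messages {
		out <- m
	}
}

func main() {
	out := make(chan string, 8)
	done := make(chan struct{})
	go watch([]string{"BUILD_RUNNING", "DEPLOYMENT_COMPLETED"}, out, done)

	select {
	case <-done: // all messages have been sent; safe to drain
	case <-time.After(5 * time.Second): // bounded fallback, not a fixed delay
	}
	close(out)
	for m := range out {
		fmt.Println(m)
	}
}
```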

src/go.mod (1)

37-39: Bubble Tea library versions are valid and secure.

The added dependencies (bubbles v0.21.0, bubbletea v1.3.10, lipgloss v1.1.0, termenv v0.16.0) are the latest stable releases as of January 2026 with no known published vulnerabilities or CVEs.

src/pkg/cli/subscribe.go (1)

77-80: The nil check after Receive() is necessary and justified.

The serverStream.Msg() call can legitimately return nil even after Receive() returns true, as evidenced by the test implementation in subscribe_test.go where MockSubscribeServerStreamForReconnectTest.Msg() explicitly returns nil. This defensive check is appropriate for robust stream handling. Consider adding a comment explaining this edge case for future maintainers.

Comment on lines 203 to 215
func splitManagedAndUnmanagedServices(serviceInfos compose.Services) ([]string, []string) {
var managedServices []string
var unmanagedServices []string
for _, service := range serviceInfos {
if cli.CanMonitorService(&service) {
unmanagedServices = append(unmanagedServices, service.Name)
} else {
managedServices = append(managedServices, service.Name)
}
}

return managedServices, unmanagedServices
}

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "func splitManagedAndUnmanagedServices" --type=go

Repository: DefangLabs/defang

Length of output: 313


🏁 Script executed:

# Read the function from tailAndMonitor.go
echo "=== tailAndMonitor.go ===" && sed -n '134,146p' src/pkg/cli/tailAndMonitor.go

# Read the function from compose.go
echo "" && echo "=== compose.go ===" && sed -n '203,215p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 940


🏁 Script executed:

# Check package declarations
echo "=== tailAndMonitor.go package ===" && head -20 src/pkg/cli/tailAndMonitor.go | grep -E "^package"
echo "=== compose.go package ===" && head -20 src/cmd/cli/command/compose.go | grep -E "^package"

# Check imports to understand CanMonitorService references
echo "" && echo "=== tailAndMonitor.go imports ===" && sed -n '1,30p' src/pkg/cli/tailAndMonitor.go | grep -A 20 "^import"
echo "" && echo "=== compose.go imports ===" && sed -n '1,30p' src/cmd/cli/command/compose.go | grep -A 20 "^import"

Repository: DefangLabs/defang

Length of output: 1448


Extract duplicate function to shared location to avoid code duplication.

The splitManagedAndUnmanagedServices function is duplicated in two locations with identical logic:

  • src/pkg/cli/tailAndMonitor.go (line 134, package cli)
  • src/cmd/cli/command/compose.go (line 203, package command)

Since compose.go already imports the cli package, this function should be moved to src/pkg/cli to eliminate the duplication and follow the DRY principle.

🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 203 - 215, The function
splitManagedAndUnmanagedServices is duplicated; extract a single implementation
into the cli package and have both call sites use it: move the logic into the
existing cli package as an exported function (e.g.,
SplitManagedAndUnmanagedServices or keep the same name but exported if needed),
preserving the signature (accepting compose.Services and returning ([]string,
[]string)), update the places that currently define their own copy (the command
package and the tail/monitor code) to call cli.SplitManagedAndUnmanagedServices,
and remove the duplicate implementations; ensure imports and visibility are
adjusted so compose.Services is referenced correctly from the new shared
function.

Comment on lines +19 to 21
if deploymentID == "" {
panic("deploymentID must be a valid deployment ID")
}

⚠️ Potential issue | 🟠 Major

Replace panic with error return for production code.

Using panic for input validation in production code can crash the entire application. This should return an error instead, allowing callers to handle the validation failure gracefully.

🛡️ Proposed fix
-func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
+func Monitor(ctx context.Context, project *compose.Project, provider client.Provider, waitTimeout time.Duration, deploymentID string, watchCallback func(*defangv1.SubscribeResponse, *ServiceStates) error) (ServiceStates, error) {
 	if deploymentID == "" {
-		panic("deploymentID must be a valid deployment ID")
+		return nil, errors.New("deploymentID must be a valid deployment ID")
 	}
🤖 Prompt for AI Agents
In @src/pkg/cli/tailAndMonitor.go around lines 19 - 21, Replace the panic on
invalid deploymentID with a proper error return: change the check that currently
does if deploymentID == "" { panic(...) } to return an error (e.g., return
fmt.Errorf("invalid deploymentID: empty string")) from the enclosing function,
update the function signature to return (error) if needed, and propagate that
error to callers so they can handle it instead of crashing the process;
reference the deploymentID validation block in tailAndMonitor.go and update any
callers to handle the returned error.

Comment on lines +61 to +63
if tailOptions.Deployment == "" {
panic("tailOptions.Deployment must be a valid deployment ID")
}

⚠️ Potential issue | 🟠 Major

Replace panic with error return for production code.

Similar to the issue in the Monitor function, using panic for input validation in TailAndMonitor can crash the application. Return an error instead.

🛡️ Proposed fix
 	tailOptions.Follow = true
 	if tailOptions.Deployment == "" {
-		panic("tailOptions.Deployment must be a valid deployment ID")
+		return nil, errors.New("tailOptions.Deployment must be a valid deployment ID")
 	}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @src/pkg/cli/tailAndMonitor.go around lines 61 - 63, The TailAndMonitor
function currently calls panic("tailOptions.Deployment must be a valid
deployment ID") when tailOptions.Deployment is empty; change this to return a
proper error instead: validate tailOptions.Deployment at the start of
TailAndMonitor, and if empty return a descriptive error (e.g., fmt.Errorf or
errors.New) rather than panicking so callers can handle it; update the function
signature if necessary to return an error and propagate that error from callers
of TailAndMonitor (search for TailAndMonitor and tailOptions.Deployment to
locate sites to update).


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 42-46: Add a sync.RWMutex field to the deploymentModel struct to
protect concurrent access to the services map, then wrap all writes to the map
(in the Update method) with mutex.Lock() / mutex.Unlock() and wrap all reads of
the map (in the View method) with mutex.RLock() / mutex.RUnlock(); ensure you
reference the deploymentModel.services accesses in Update and View and hold the
lock only for the minimal time needed to modify or read the map to avoid
blocking other goroutines.
- Around line 378-389: The loop that collects unbuiltServices currently checks
serviceStates against defangv1.ServiceState_BUILD_STOPPING and thus misses
services in defangv1.ServiceState_BUILD_FAILED; update the condition in the for
loop that populates unbuiltServices (the variable serviceStates and the check
using defangv1.ServiceState_BUILD_STOPPING) to also include
defangv1.ServiceState_BUILD_FAILED (e.g., change the if to include state ==
defangv1.ServiceState_BUILD_FAILED or an equivalent check) so
tailOptions.Services and tailOptions.LogType = logs.LogTypeBuild correctly
include failed builds.
🧹 Nitpick comments (1)
src/cmd/cli/command/compose.go (1)

157-202: Consider logging UI errors for diagnostics.

The UI goroutine silently discards errors from p.Run() (line 177). While these errors may not be critical to deployment success, logging them could aid debugging.

🔍 Optional improvement to log UI errors
 	// Start the bubbletea UI in a goroutine
 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debugf("UI error: %v", err)
 		}
 	}()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e492b30 and 52e8045.

📒 Files selected for processing (1)
  • src/cmd/cli/command/compose.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/cmd/cli/command/compose.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (8)
src/cmd/cli/command/compose.go (8)

10-12: LGTM: Imports align with new UI requirements.

The added imports (sort, sync, Bubble Tea libraries) are appropriate for the terminal UI and concurrent monitoring features introduced in this PR.

Also applies to: 33-35


64-82: LGTM: Constructor properly initializes UI model.

The initialization logic correctly sets up the services map, spinners, and update channel. The buffer size of 100 for updateCh provides reasonable headroom for status updates.


84-90: LGTM: Standard Bubble Tea initialization pattern.

The Init method correctly batches spinner tick commands for all services.


92-114: LGTM: Correct Bubble Tea update logic.

The Update method properly handles keyboard input, service status updates, and spinner ticks. The existence check on line 100 prevents panics from unknown service names.


116-155: LGTM: Clean UI rendering with stable ordering.

The View method correctly sorts services for consistent display and uses appropriate visual indicators (✓, ✗, spinner) based on service state.


356-377: LGTM: Clean separation between tail and UI monitoring paths.

The branching logic correctly preserves the full log tailing behavior behind the --tail flag while defaulting to the new UI-based monitoring. The info message on line 372 helpfully guides users to access logs when needed.


475-475: LGTM: Text improvement enhances clarity.

The updated message is more concise and natural.


204-216: Extract duplicate function to avoid code duplication.

This function is duplicated in src/pkg/cli/tailAndMonitor.go (line 134). Move it to a shared location in the cli package to follow the DRY principle, as the command package already imports cli.

Likely an incorrect or invalid review comment.

Comment on lines +42 to +46
type deploymentModel struct {
services map[string]*serviceState
quitting bool
updateCh chan serviceUpdate
}

⚠️ Potential issue | 🔴 Critical

Add mutex to protect concurrent map access.

The services map is accessed concurrently by the UI goroutine (in View at lines 124-151) and the monitoring goroutine (via p.Send at line 188). Without synchronization, this causes race conditions and potential panics.

Add a sync.RWMutex field to deploymentModel and protect all map accesses:

  • Use Lock/Unlock when modifying the map in Update (lines 100-101)
  • Use RLock/RUnlock when reading the map in View (lines 124-151)
🔒 Proposed fix to add mutex protection
 type deploymentModel struct {
 	services map[string]*serviceState
+	mu       sync.RWMutex
 	quitting bool
 	updateCh chan serviceUpdate
 }

Then protect access in the Update method around line 100:

 	case serviceUpdate:
+		m.mu.Lock()
 		if svc, exists := m.services[msg.name]; exists {
 			svc.status = msg.status
 		}
+		m.mu.Unlock()
 		return m, nil

And in the View method around line 122:

 func (m *deploymentModel) View() string {
 	if m.quitting {
 		return ""
 	}
 
+	m.mu.RLock()
+	defer m.mu.RUnlock()
+
 	var lines []string
 	// Sort services by name for consistent ordering
 	var serviceNames []string


🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 42 - 46, Add a sync.RWMutex
field to the deploymentModel struct to protect concurrent access to the services
map, then wrap all writes to the map (in the Update method) with mutex.Lock() /
mutex.Unlock() and wrap all reads of the map (in the View method) with
mutex.RLock() / mutex.RUnlock(); ensure you reference the
deploymentModel.services accesses in Update and View and hold the lock only for
the minimal time needed to modify or read the map to avoid blocking other
goroutines.

Comment on lines +378 to +389
// if any services failed to build, only show build logs for those
// services
var unbuiltServices = make([]string, 0, len(project.Services))
for service, state := range serviceStates {
if state <= defangv1.ServiceState_BUILD_STOPPING {
unbuiltServices = append(unbuiltServices, service)
}
}
if len(unbuiltServices) > 0 {
tailOptions.LogType = logs.LogTypeBuild
tailOptions.Services = unbuiltServices
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for ServiceState enum definition in the codebase

# First, find files that might contain the ServiceState enum
fd -e go -e proto | head -20

# Search for ServiceState enum definition
rg "enum ServiceState|type ServiceState" --type=go --type=proto -A 20

# Also search for BUILD_STOPPING specifically
rg "BUILD_STOPPING|ServiceState_" --type=go --type=proto | head -30

Repository: DefangLabs/defang

Length of output: 800


🏁 Script executed:

#!/bin/bash
# Search more broadly for ServiceState and BUILD_STOPPING

# Look for ServiceState definition anywhere in codebase
rg "ServiceState_" -A 1 -B 1 | head -50

# Check for enum definitions or const declarations
rg "BUILD_STOPPING|BUILD_RUNNING|BUILD_SUCCEEDED" -n | head -30

# Search in specific directories that might have API definitions
find . -type f \( -name "*.pb.go" -o -name "*v1*.go" \) | xargs rg "ServiceState" -l 2>/dev/null | head -10

Repository: DefangLabs/defang

Length of output: 6426


🏁 Script executed:

#!/bin/bash
# Examine how BUILD_FAILED and other states are used in error handling contexts

# Check if BUILD_FAILED is referenced elsewhere in compose.go or related files
rg "BUILD_FAILED" src/cmd/cli/command/ -n

# Look at the complete monitoring/error handling section
sed -n '375,410p' src/cmd/cli/command/compose.go

Repository: DefangLabs/defang

Length of output: 1287


Include BUILD_FAILED state when identifying services to show build logs for.

The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) only captures in-progress build states and excludes BUILD_FAILED (value 11). When deployment fails, services that have already failed during the build phase should also be included to properly surface build logs. Update the logic to include BUILD_FAILED state in the unbuiltServices collection.

🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 378 - 389, The loop that
collects unbuiltServices currently checks serviceStates against
defangv1.ServiceState_BUILD_STOPPING and thus misses services in
defangv1.ServiceState_BUILD_FAILED; update the condition in the for loop that
populates unbuiltServices (the variable serviceStates and the check using
defangv1.ServiceState_BUILD_STOPPING) to also include
defangv1.ServiceState_BUILD_FAILED (e.g., change the if to include state ==
defangv1.ServiceState_BUILD_FAILED or an equivalent check) so
tailOptions.Services and tailOptions.LogType = logs.LogTypeBuild correctly
include failed builds.
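The corrected predicate can be isolated and tested on its own. A hedged sketch, using integer constants that mirror the enum values cited above (BUILD_STOPPING = 6, BUILD_FAILED = 11) rather than the generated `defangv1` types:

```go
package main

import "fmt"

// Illustrative enum values matching the review's description.
const (
	buildStopping = 6
	buildFailed   = 11
)

// needsBuildLogs reports whether a service's state means its build logs
// should be surfaced: any in-progress build state (<= BUILD_STOPPING),
// plus the BUILD_FAILED state the original condition missed.
func needsBuildLogs(state int) bool {
	return state <= buildStopping || state == buildFailed
}

func main() {
	fmt.Println(needsBuildLogs(3))  // in-progress build: true
	fmt.Println(needsBuildLogs(11)) // BUILD_FAILED: true
	fmt.Println(needsBuildLogs(9))  // post-build, non-failed: false
}
```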


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @src/cmd/cli/command/compose.go:
- Around line 368-372: The loop that collects unbuiltServices currently checks
if state <= defangv1.ServiceState_BUILD_STOPPING which misses
defangv1.ServiceState_BUILD_FAILED; update the condition in the serviceStates
iteration (the block that appends to unbuiltServices) to also include
BUILD_FAILED (e.g., state <= defangv1.ServiceState_BUILD_STOPPING || state ==
defangv1.ServiceState_BUILD_FAILED) so services that failed during build are
captured for build log output.
🧹 Nitpick comments (1)
src/cmd/cli/command/compose.go (1)

177-182: Consider logging UI errors for debugging.

UI errors are silently ignored. While this may be intentional to not disrupt the user experience, consider logging them at debug level for troubleshooting.

 	go func() {
 		defer wg.Done()
 		if _, err := p.Run(); err != nil {
-			// Handle UI errors if needed
+			term.Debugf("UI error: %v", err)
 		}
 	}()
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52e8045 and c8b40fd.

📒 Files selected for processing (5)
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/deploymentinfo_test.go
  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/subscribe.go
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2026-01-07T00:34:13.131Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1742
File: src/pkg/cli/composeDown.go:14-18
Timestamp: 2026-01-07T00:34:13.131Z
Learning: In Defang's Defang CLI, CdCommandDown performs refresh + destroy, while CdCommandDestroy performs destroy only (no refresh). Update ComposeDown (src/pkg/cli/composeDown.go) to call CdCommandDestroy to perform destruction without refreshing. This ensures the intended semantics are preserved when tearing down compositions; avoid using CdCommandDown in ComposeDown unless a refresh is explicitly desired. Verify that ComposeDown's destroy path does not trigger a refresh side effect from CdCommandDown and that tests cover both pathways if they exist.

Applied to files:

  • src/pkg/cli/deploymentinfo.go
  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/deploymentinfo_test.go
  • src/cmd/cli/command/compose.go
  • src/pkg/cli/subscribe.go
📚 Learning: 2025-12-31T13:47:12.225Z
Learnt from: lionello
Repo: DefangLabs/defang PR: 1740
File: src/pkg/cli/client/byoc/parse_test.go:18-21
Timestamp: 2025-12-31T13:47:12.225Z
Learning: In Go test files (any _test.go under the Defang codebase), it's acceptable for mocks to panic to surface issues quickly during tests. Do not add defensive error handling in mocks within tests, since panics will fail fast and highlight problems. Ensure this behavior is confined to test code and does not affect production code or non-test paths.

Applied to files:

  • src/pkg/cli/getServices_test.go
  • src/pkg/cli/deploymentinfo_test.go
📚 Learning: 2026-01-07T03:07:48.228Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/cli/client/byoc/gcp/byoc.go:448-450
Timestamp: 2026-01-07T03:07:48.228Z
Learning: In src/pkg/cli/client/byoc/gcp/byoc.go, the GetDeploymentStatus method intentionally does not pre-validate b.cdExecution before calling b.driver.GetBuildStatus. If b.cdExecution is empty, it represents an error state that will be surfaced by the GCP API as an "invalid operation name" error, which is the intended behavior.

Applied to files:

  • src/pkg/cli/deploymentinfo_test.go
  • src/cmd/cli/command/compose.go
📚 Learning: 2026-01-07T17:31:18.174Z
Learnt from: edwardrf
Repo: DefangLabs/defang PR: 1747
File: src/pkg/clouds/gcp/cloudbuild.go:185-204
Timestamp: 2026-01-07T17:31:18.174Z
Learning: In src/pkg/clouds/gcp/cloudbuild.go, the GetBuildStatus method is intentionally designed as a non-blocking status check. It uses op.Poll(ctx) rather than op.Wait(ctx), and returns nil when the build is nil (operation still in progress), allowing the caller to control the polling loop. The method should not be changed to use Wait() as the waiting logic is handled by the caller.

Applied to files:

  • src/pkg/cli/subscribe.go
🧬 Code graph analysis (1)
src/pkg/cli/subscribe.go (3)
src/pkg/cli/client/provider.go (1)
  • Provider (54-80)
src/protos/io/defang/v1/fabric.pb.go (20)
  • Provider (28-28)
  • Provider (66-68)
  • Provider (70-72)
  • Provider (79-81)
  • SubscribeResponse (4286-4295)
  • SubscribeResponse (4308-4308)
  • SubscribeResponse (4323-4325)
  • Status (3332-3337)
  • Status (3350-3350)
  • Status (3365-3367)
  • ServiceState_NOT_SPECIFIED (138-138)
  • ServiceState (135-135)
  • ServiceState (199-201)
  • ServiceState (203-205)
  • ServiceState (212-214)
  • ServiceState_BUILD_FAILED (151-151)
  • ServiceState_DEPLOYMENT_FAILED (150-150)
  • Service (3919-3924)
  • Service (3937-3937)
  • Service (3952-3954)
src/pkg/cli/client/errors.go (2)
  • ErrDeploymentSucceeded (8-8)
  • ErrDeploymentFailed (10-13)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: nix-shell-test
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
src/pkg/cli/deploymentinfo.go (1)

61-61: LGTM - STATUS column removal aligns with PR objectives.

This change correctly removes the STATUS column from the service table output, keeping only the canonical "State" as specified in the PR objectives to improve signal-to-noise ratio.

src/cmd/cli/command/compose.go (3)

42-46: Race condition concern addressed by Bubble Tea's design.

The services map is accessed by Update and View methods which are called sequentially on Bubble Tea's main goroutine. The p.Send call from the monitoring goroutine is thread-safe as it uses an internal channel. No mutex is needed here.


64-155: Well-structured Bubble Tea UI implementation.

The model correctly handles:

  • Initialization with spinner ticks for all services
  • State updates via messages
  • Consistent rendering with sorted service names
  • Terminal state indicators (✓/✗) replacing spinners

404-406: Reasonable error handling for PrintServices.

Downgrading the error to a warning here is appropriate since the deployment itself succeeded and this is just a display issue.

src/pkg/cli/subscribe.go (1)

118-141: Clean refactoring to callback-based approach.

The WaitServiceState wrapper correctly delegates to WatchServiceState with a callback that handles terminal states. Using ErrDeploymentSucceeded as a sentinel for successful completion is an idiomatic pattern.
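The sentinel pattern is easy to demonstrate in isolation. A sketch with illustrative names — `errDeploymentSucceeded` and `runCallbacks` are stand-ins, not the package's actual identifiers:

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel error signaling a terminal success state from a callback.
var errDeploymentSucceeded = errors.New("deployment succeeded")

// runCallbacks invokes cb for each state and stops when the callback
// returns an error; the success sentinel is translated into nil.
func runCallbacks(states []string, cb func(string) error) error {
	for _, s := range states {
		if err := cb(s); err != nil {
			if errors.Is(err, errDeploymentSucceeded) {
				return nil // terminal success, not a failure
			}
			return err
		}
	}
	return errors.New("stream ended without reaching a terminal state")
}

func main() {
	err := runCallbacks([]string{"BUILD_RUNNING", "DEPLOYMENT_COMPLETED"}, func(s string) error {
		if s == "DEPLOYMENT_COMPLETED" {
			return errDeploymentSucceeded
		}
		return nil
	})
	fmt.Println(err) // <nil>
}
```

Using `errors.Is` keeps the check correct even if the sentinel is later wrapped with `fmt.Errorf("...: %w", err)`.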

src/pkg/cli/getServices_test.go (1)

89-91: Test expectations correctly updated.

The expected output aligns with the production change that removes the STATUS column from the short form table output while preserving status information in the long form YAML output.

src/pkg/cli/deploymentinfo_test.go (1)

40-41: Test expectations correctly aligned with STATUS column removal.

All three test cases have been updated to reflect the removal of the STATUS column from the table output, matching the production change in deploymentinfo.go.

Also applies to: 61-62, 80-81

Comment on lines +368 to +372
for service, state := range serviceStates {
if state <= defangv1.ServiceState_BUILD_STOPPING {
unbuiltServices = append(unbuiltServices, service)
}
}

⚠️ Potential issue | 🟠 Major

Include BUILD_FAILED state when identifying services for build log output.

The condition state <= defangv1.ServiceState_BUILD_STOPPING (value 6) excludes BUILD_FAILED (value 11). Services that failed during the build phase should also be included to properly surface build logs on deployment failure.

🐛 Proposed fix
 var unbuiltServices = make([]string, 0, len(project.Services))
 for service, state := range serviceStates {
-	if state <= defangv1.ServiceState_BUILD_STOPPING {
+	if state <= defangv1.ServiceState_BUILD_STOPPING || state == defangv1.ServiceState_BUILD_FAILED {
 		unbuiltServices = append(unbuiltServices, service)
 	}
 }
🤖 Prompt for AI Agents
In @src/cmd/cli/command/compose.go around lines 368 - 372, The loop that
collects unbuiltServices currently checks if state <=
defangv1.ServiceState_BUILD_STOPPING which misses
defangv1.ServiceState_BUILD_FAILED; update the condition in the serviceStates
iteration (the block that appends to unbuiltServices) to also include
BUILD_FAILED (e.g., state <= defangv1.ServiceState_BUILD_STOPPING || state ==
defangv1.ServiceState_BUILD_FAILED) so services that failed during build are
captured for build log output.

Comment on lines +71 to +74
if err := serverStream.Err(); err != nil {
errChan <- err
}
return

⚠️ Potential issue | 🔴 Critical

Goroutine can exit without signaling, causing indefinite hang.

When serverStream.Receive() returns false and serverStream.Err() returns nil (clean stream close), the goroutine exits without sending anything to errChan. The main loop at line 90 would block indefinitely waiting for a message that will never arrive.

🔒 Proposed fix
+var errStreamClosed = errors.New("stream closed")
+
 // inside the goroutine:
 if err := serverStream.Err(); err != nil {
 	errChan <- err
+} else {
+	errChan <- errStreamClosed // Signal clean stream close
 }
 return

Then handle errStreamClosed in the main loop, treating it as unexpected closure or retry condition.




Development

Successfully merging this pull request may close these issues.

Hide logs during compose up unless there is a problem

2 participants