Skip to content

Add cluster-host command for provisioning Incus containers with NixOS#37

Open
rubionic wants to merge 12 commits intomainfrom
opencode-issue-252-1767819617
Open

Add cluster-host command for provisioning Incus containers with NixOS#37
rubionic wants to merge 12 commits intomainfrom
opencode-issue-252-1767819617

Conversation

@rubionic
Copy link
Collaborator

@rubionic rubionic commented Jan 7, 2026

Summary

Implements #36

Adds a new deskrun cluster-host command to provision NixOS containers on Incus hosts, pre-configured with Docker, Kind, and deskrun for running GitHub Actions runners on remote infrastructure.

Changes

New Types and Configuration

  • Added ClusterHost type to pkg/types/types.go with fields: Name, Image, DiskSize, CreatedAt
  • Extended Config struct in internal/config/config.go with ClusterHosts map
  • Added config management methods: AddClusterHost, RemoveClusterHost, GetClusterHost

Incus Package (new)

  • internal/incus/incus.go: Incus CLI wrapper with methods for:
    • Container lifecycle: CreateContainer, DeleteContainer, ContainerExists, ListContainers
    • Container operations: PushContent, Exec, WaitForRunning
  • internal/incus/nixos.go: NixOS configuration helper with ConfigureNixOS method
  • internal/incus/templates/deskrun.nix: Embedded NixOS module that installs:
    • Docker daemon with auto-pruning
    • Kind and kubectl
    • deskrun wrapper via Nix flakes
    • Increased kernel inotify limits for Docker/Kind

Commands (new)

  • internal/cmd/cluster_host.go: Cobra command implementation
    • deskrun cluster-host create - Create new cluster host with configurable name, disk, and image
    • deskrun cluster-host delete - Delete cluster host and clean up
    • deskrun cluster-host list - List all cluster hosts with status
    • deskrun cluster-host configure - Re-apply NixOS configuration

Documentation

  • Added comprehensive "Remote Cluster Hosts (Incus)" section to README.md
  • Documented prerequisites, usage examples, and management commands
  • Explained remote selection with Incus

Development Tools

  • Added fmt script to devbox.json for consistent code formatting

Testing

  • ✅ All existing tests pass
  • ✅ Code builds successfully
  • ✅ Manual testing of command help and structure
  • ✅ Code formatted with go fmt

Usage Example

# Create a cluster host
deskrun cluster-host create --name my-host --disk 300GiB

# Access and use it
incus exec my-host -- deskrun add my-runner \
  --repository https://github.com/owner/repo \
  --mode cached-privileged-kubernetes \
  --auth-type pat \
  --auth-value ghp_xxx

# List cluster hosts
deskrun cluster-host list

# Clean up
deskrun cluster-host delete my-host

Implementation Notes

  • Uses Incus CLI (not LXD) for container management
  • Container security.nesting=true enables Docker-in-container
  • NixOS configuration is idempotent and can be re-applied
  • Works with current Incus remote (users switch with incus remote switch)
  • Configuration persisted to ~/.deskrun/config.json

Closes

Closes #36

… NixOS

Implements #36

This adds a new `deskrun cluster-host` command to provision NixOS containers
on Incus hosts, pre-configured with Docker, Kind, and deskrun for running
GitHub Actions runners on remote infrastructure.

Key features:
- Create NixOS containers on Incus with auto-generated or custom names
- Configurable disk size (default 200GiB) and NixOS image version
- Automatic installation of Docker, Kind, kubectl, and deskrun via NixOS modules
- Support for listing, configuring, and deleting cluster hosts
- Configuration persisted to ~/.deskrun/config.json
- Works with current Incus remote (use 'incus remote switch' to change)

Implementation details:
- New ClusterHost type in pkg/types/types.go
- Config management methods in internal/config/config.go
- Incus CLI wrapper in internal/incus/incus.go
- NixOS configuration helper in internal/incus/nixos.go
- Embedded NixOS template in internal/incus/templates/deskrun.nix
- Cobra commands in internal/cmd/cluster_host.go
- Documentation added to README.md
- Added fmt script to devbox.json for code formatting

Commands:
- deskrun cluster-host create [--name] [--disk] [--image]
- deskrun cluster-host delete <name>
- deskrun cluster-host list
- deskrun cluster-host configure <name>
Copy link
Owner

@rkoster rkoster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

devbox.json and devbox.lock need to be reverted.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new deskrun cluster-host command to provision and manage NixOS containers on Incus hosts, enabling remote execution of GitHub Actions runners without impacting local development environments.

Key Changes:

  • New ClusterHost type and config management methods for persistent tracking
  • Incus package with CLI wrapper for container lifecycle and NixOS configuration
  • Complete command suite: create, delete, list, and configure cluster hosts with embedded NixOS configuration

Reviewed changes

Copilot reviewed 8 out of 11 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
pkg/types/types.go Adds ClusterHost type with name, image, disk size, and creation timestamp
internal/config/config.go Extends config with ClusterHosts map and management methods (Add, Remove, Get)
internal/incus/incus.go Implements Incus CLI wrapper for container operations and command execution
internal/incus/nixos.go Provides NixOS configuration management with automatic import injection
internal/incus/templates/deskrun.nix Embeds NixOS module configuring Docker, Kind, kubectl, and deskrun
internal/cmd/cluster_host.go Implements cluster-host command with create, delete, list, and configure subcommands
README.md Documents cluster host feature with prerequisites, usage examples, and specifications
devbox.json Changed to symlink (appears to be environment-specific)
devbox.lock Changed to symlink (appears to be environment-specific)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1 to +273
package cmd

import (
"context"
"crypto/rand"
"encoding/hex"
"fmt"
"time"

"github.com/rkoster/deskrun/internal/config"
"github.com/rkoster/deskrun/internal/incus"
"github.com/rkoster/deskrun/pkg/types"
"github.com/spf13/cobra"
)

var (
clusterHostName string
clusterHostDiskSize string
clusterHostImage string
)

var clusterHostCmd = &cobra.Command{
Use: "cluster-host",
Short: "Manage remote Incus cluster hosts",
Long: `Manage remote Incus cluster hosts for running deskrun on dedicated infrastructure.

Cluster hosts are NixOS containers provisioned on Incus with Docker, Kind, and deskrun pre-installed.`,
}

var clusterHostCreateCmd = &cobra.Command{
Use: "create [--name <name>] [--disk <size>] [--image <image>]",
Short: "Create a new cluster host",
Long: `Create a new Incus container with NixOS pre-configured with Docker, Kind, and deskrun.

The container will be created on the current Incus remote (use 'incus remote switch' to change).

Examples:
# Create with auto-generated name
deskrun cluster-host create

# Create with custom name and disk size
deskrun cluster-host create --name my-host --disk 300GiB

# Create with specific NixOS image
deskrun cluster-host create --image images:nixos/24.11`,
RunE: runClusterHostCreate,
}

var clusterHostDeleteCmd = &cobra.Command{
Use: "delete <name>",
Short: "Delete a cluster host",
Long: `Delete a cluster host container and remove it from configuration.`,
Args: cobra.ExactArgs(1),
RunE: runClusterHostDelete,
}

var clusterHostListCmd = &cobra.Command{
Use: "list",
Short: "List all cluster hosts",
Long: `List all cluster hosts with their status and configuration.`,
RunE: runClusterHostList,
}

var clusterHostConfigureCmd = &cobra.Command{
Use: "configure <name>",
Short: "Re-configure a cluster host",
Long: `Re-apply NixOS configuration to a cluster host.

This is useful after deskrun updates or if the initial configuration failed.`,
Args: cobra.ExactArgs(1),
RunE: runClusterHostConfigure,
}

func init() {
clusterHostCreateCmd.Flags().StringVar(&clusterHostName, "name", "", "Container name (auto-generated if not specified)")
clusterHostCreateCmd.Flags().StringVar(&clusterHostDiskSize, "disk", "200GiB", "Root disk size")
clusterHostCreateCmd.Flags().StringVar(&clusterHostImage, "image", "images:nixos/25.11", "NixOS image to use")

clusterHostCmd.AddCommand(clusterHostCreateCmd)
clusterHostCmd.AddCommand(clusterHostDeleteCmd)
clusterHostCmd.AddCommand(clusterHostListCmd)
clusterHostCmd.AddCommand(clusterHostConfigureCmd)
rootCmd.AddCommand(clusterHostCmd)
}

func runClusterHostCreate(cmd *cobra.Command, args []string) error {
name := clusterHostName
if name == "" {
randomBytes := make([]byte, 3)
if _, err := rand.Read(randomBytes); err != nil {
return fmt.Errorf("failed to generate random name: %w", err)
}
name = fmt.Sprintf("deskrun-%s", hex.EncodeToString(randomBytes))
}

configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

if _, err := configMgr.GetClusterHost(name); err == nil {
return fmt.Errorf("cluster host %s already exists in configuration", name)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer cancel()

exists, err := incusMgr.ContainerExists(ctx, name)
if err != nil {
return fmt.Errorf("failed to check if container exists: %w", err)
}
if exists {
return fmt.Errorf("container %s already exists", name)
}

fmt.Printf("Creating cluster host '%s'...\n", name)
fmt.Println("Launching NixOS container...")

if err := incusMgr.CreateContainer(ctx, name, clusterHostImage, clusterHostDiskSize); err != nil {
return fmt.Errorf("failed to create container: %w", err)
}

fmt.Println("Waiting for container to start...")
if err := incusMgr.WaitForRunning(ctx, name, 2*time.Minute); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)
return fmt.Errorf("container failed to start: %w", err)
}

fmt.Println("Configuring NixOS with Docker, Kind, and deskrun...")
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}

host := &types.ClusterHost{
Name: name,
Image: clusterHostImage,
DiskSize: clusterHostDiskSize,
CreatedAt: time.Now().Format(time.RFC3339),
}

if err := configMgr.AddClusterHost(host); err != nil {
return fmt.Errorf("failed to save cluster host to config: %w", err)
}

fmt.Printf("\nCluster host '%s' created successfully\n\n", name)
fmt.Printf("To access: incus exec %s -- bash\n", name)
fmt.Printf("To run deskrun: incus exec %s -- deskrun --help\n", name)

return nil
}

func runClusterHostDelete(cmd *cobra.Command, args []string) error {
name := args[0]

configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

if _, err := configMgr.GetClusterHost(name); err != nil {
fmt.Printf("Warning: cluster host %s not found in configuration\n", name)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
defer cancel()

exists, err := incusMgr.ContainerExists(ctx, name)
if err != nil {
return fmt.Errorf("failed to check if container exists: %w", err)
}

if !exists {
fmt.Printf("Container %s does not exist\n", name)
} else {
fmt.Printf("Deleting cluster host '%s'...\n", name)
if err := incusMgr.DeleteContainer(ctx, name); err != nil {
return fmt.Errorf("failed to delete container: %w", err)
}
}

if err := configMgr.RemoveClusterHost(name); err != nil {
fmt.Printf("Note: %v\n", err)
}

fmt.Printf("Cluster host '%s' deleted successfully\n", name)
return nil
}

func runClusterHostList(cmd *cobra.Command, args []string) error {
configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

containers, err := incusMgr.ListContainers(ctx, "deskrun-")
if err != nil {
return fmt.Errorf("failed to list containers: %w", err)
}

if len(containers) == 0 && len(configMgr.GetConfig().ClusterHosts) == 0 {
fmt.Println("No cluster hosts found")
return nil
}

fmt.Printf("%-20s %-10s %-20s %-10s %-20s\n", "NAME", "STATUS", "IMAGE", "DISK", "CREATED")
fmt.Println("--------------------------------------------------------------------------------------------")

for _, container := range containers {
host, err := configMgr.GetClusterHost(container.Name)
if err != nil {
fmt.Printf("%-20s %-10s %-20s %-10s %-20s\n",
container.Name,
container.Status,
"N/A",
"N/A",
"N/A")
continue
}

createdAt := host.CreatedAt
if t, err := time.Parse(time.RFC3339, host.CreatedAt); err == nil {
createdAt = t.Format("2006-01-02 15:04:05")
}

fmt.Printf("%-20s %-10s %-20s %-10s %-20s\n",
host.Name,
container.Status,
host.Image,
host.DiskSize,
createdAt)
}

return nil
}

func runClusterHostConfigure(cmd *cobra.Command, args []string) error {
name := args[0]

configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

if _, err := configMgr.GetClusterHost(name); err != nil {
return fmt.Errorf("cluster host %s not found in configuration", name)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer cancel()

exists, err := incusMgr.ContainerExists(ctx, name)
if err != nil {
return fmt.Errorf("failed to check if container exists: %w", err)
}
if !exists {
return fmt.Errorf("container %s does not exist", name)
}

fmt.Println("Applying NixOS configuration...")
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}

fmt.Println("Configuration applied successfully")
return nil
}
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cluster-host command implementation lacks test coverage. The repository has comprehensive tests for other commands (e.g., add_test.go for the add command). Consider adding tests for the cluster-host subcommands to verify proper error handling, argument validation, and integration with the config and incus packages.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +166
package incus

import (
"context"
"fmt"
"os/exec"
"strings"
"time"
)

type Manager struct{}

type ContainerInfo struct {
Name string
Status string
}

func NewManager() *Manager {
return &Manager{}
}

func (m *Manager) CreateContainer(ctx context.Context, name, image, diskSize string) error {
args := []string{
"launch",
image,
name,
"-d", fmt.Sprintf("root,size=%s", diskSize),
"-c", "security.nesting=true",
}

cmd := exec.CommandContext(ctx, "incus", args...)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to create container: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) DeleteContainer(ctx context.Context, name string) error {
running, err := m.isRunning(ctx, name)
if err != nil {
return err
}

if running {
stopCmd := exec.CommandContext(ctx, "incus", "stop", name, "--force")
if output, err := stopCmd.CombinedOutput(); err != nil {
return fmt.Errorf("failed to stop container: %w (output: %s)", err, string(output))
}
}

deleteCmd := exec.CommandContext(ctx, "incus", "delete", name)
output, err := deleteCmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to delete container: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) ContainerExists(ctx context.Context, name string) (bool, error) {
cmd := exec.CommandContext(ctx, "incus", "list", "--format=csv", "-c", "n")
output, err := cmd.CombinedOutput()
if err != nil {
return false, fmt.Errorf("failed to list containers: %w (output: %s)", err, string(output))
}

lines := strings.Split(strings.TrimSpace(string(output)), "\n")
for _, line := range lines {
if strings.TrimSpace(line) == name {
return true, nil
}
}

return false, nil
}

func (m *Manager) ListContainers(ctx context.Context, prefix string) ([]ContainerInfo, error) {
cmd := exec.CommandContext(ctx, "incus", "list", "--format=csv", "-c", "ns")
output, err := cmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("failed to list containers: %w (output: %s)", err, string(output))
}

var containers []ContainerInfo
lines := strings.Split(strings.TrimSpace(string(output)), "\n")
for _, line := range lines {
if line == "" {
continue
}

parts := strings.Split(line, ",")
if len(parts) < 2 {
continue
}

name := strings.TrimSpace(parts[0])
status := strings.TrimSpace(parts[1])

if prefix == "" || strings.HasPrefix(name, prefix) {
containers = append(containers, ContainerInfo{
Name: name,
Status: status,
})
}
}

return containers, nil
}

func (m *Manager) PushContent(ctx context.Context, container, content, remotePath string) error {
cmd := exec.CommandContext(ctx, "incus", "exec", container, "--", "sh", "-c",
fmt.Sprintf("cat > %s <<'EOF'\n%s\nEOF", remotePath, content))
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to push content: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) Exec(ctx context.Context, container string, command ...string) (string, error) {
args := append([]string{"exec", container, "--"}, command...)
cmd := exec.CommandContext(ctx, "incus", args...)
output, err := cmd.CombinedOutput()
if err != nil {
return string(output), fmt.Errorf("failed to execute command: %w (output: %s)", err, string(output))
}

return string(output), nil
}

func (m *Manager) WaitForRunning(ctx context.Context, name string, timeout time.Duration) error {
deadline := time.Now().Add(timeout)

for time.Now().Before(deadline) {
running, err := m.isRunning(ctx, name)
if err != nil {
return err
}

if running {
return nil
}

select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(2 * time.Second):
}
}

return fmt.Errorf("timeout waiting for container to be running")
}

func (m *Manager) isRunning(ctx context.Context, name string) (bool, error) {
cmd := exec.CommandContext(ctx, "incus", "list", name, "--format=csv", "-c", "s")
output, err := cmd.CombinedOutput()
if err != nil {
return false, fmt.Errorf("failed to check container status: %w (output: %s)", err, string(output))
}

status := strings.TrimSpace(string(output))
return status == "RUNNING", nil
}
Copy link

Copilot AI Jan 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire incus package (incus.go, nixos.go) lacks test coverage. The repository follows a pattern of comprehensive testing (as seen in config_test.go, add_test.go). Consider adding tests for the incus Manager methods, particularly for edge cases like container name validation, timeout handling, and error scenarios from the incus CLI.

Copilot uses AI. Check for mistakes.
Address PR review feedback: devbox.json and devbox.lock were accidentally
converted to symlinks during the previous commit. This restores them to
their original content from the main branch.
@rubionic
Copy link
Collaborator Author

rubionic commented Jan 7, 2026

PR Review Feedback Addressed

I've reverted devbox.json and devbox.lock to their original content from the main branch as requested by @rkoster.

What happened: These files were accidentally converted to symlinks pointing to ../_context/repos/rkoster/deskrun/ during the initial commit in the GitHub Actions environment.

What was fixed:

  • Restored devbox.json to the original file content
  • Restored devbox.lock to the original file content

Commit: 60085d4

rubionic and others added 8 commits January 8, 2026 20:01
- Fix NixOS version inconsistency (24.11 -> 25.11) in docs and examples
- Fix PushContent shell injection vulnerability using stdin redirection
- Improve NixOS imports injection to handle multi-line formats robustly
- Fix list command to show all configured cluster hosts (not just deskrun- prefix)
- Improve ContainerExists to explicitly check for empty output
- Add input validation for CreateContainer (name, image, diskSize)
- Add comprehensive test coverage for ClusterHost config methods

All tests passing (88 specs). Addresses security, robustness, and consistency issues identified in PR review.
- Add network configuration (incusbr0 bridge) to cluster hosts
- Enable security.privileged=true for nested container support
- Add /dev/kmsg device mapping for kubelet compatibility in Kind
- Implement network connectivity verification before configuration
- Add nix-channel update with retry logic for reliable NixOS setup
- Set NIX_PATH explicitly for nixos-rebuild to find channels
- Copy deskrun config.json to cluster host containers
- Add WaitForNetwork() to ensure network readiness
- Add PushConfigFile() to sync configuration to containers

This enables Kind clusters to run successfully inside Incus containers
with proper kernel device access and network connectivity.
Implements intelligent storage pool management for cluster-host creation to
prevent 'no space left on device' errors caused by Docker overlay2 on btrfs.

Changes:
- Add ListStoragePools() to enumerate available Incus storage pools
- Add DetectStorageDriver() to identify pool storage drivers
- Add CreateStoragePool() to create new storage pools
- Add EnsureGoodStoragePool() that:
  * Prefers existing ZFS or dir pools (Docker-compatible)
  * Avoids btrfs pools (incompatible with Docker overlay2)
  * Creates deskrun-pool (ZFS preferred, dir fallback) if needed
- Update CreateContainer() to accept storagePool parameter
- Update cluster-host create to automatically select optimal pool

Technical context:
- Btrfs backing + Docker overlay2 = metadata exhaustion issues
- ZFS backing + Docker overlay2 = optimal for nested containers
- Dir backing + Docker overlay2 = works but less efficient than ZFS

The implementation follows a three-phase selection strategy:
1. Look for existing good pools (zfs/dir)
2. Use default pool if not btrfs
3. Create new deskrun-pool (prefer ZFS, fallback to dir)
Switch from ZFS to dir (directory) driver as the preferred storage backend
for cluster-hosts to minimize overhead and maximize performance.

Changes:
- First pass: Look for existing dir pools only (removed zfs check)
- Second pass: Still use default pool if not btrfs (zfs/lvm/etc still work)
- Create new pool: Always use dir driver (removed zfs creation attempt)
- Simplified error message for pool creation failure

Rationale:
- Dir driver has minimal overhead (just directories on host filesystem)
- No need for snapshots or advanced features in this use case
- ZFS/btrfs add complexity and overhead without performance benefit
- Docker overlay2 works well with dir backing storage
- Simpler, more portable, always works on any system

The dir driver provides better raw performance by eliminating the
filesystem abstraction layer that ZFS/btrfs add.
The devbox.json and devbox.lock files were symlinks pointing to a
non-existent path (../_context/repos/rkoster/deskrun/). This restores
them to regular files with their original content from commit 60085d4.
The previous logic only checked the 'default' pool in the second pass,
which caused it to miss existing non-btrfs pools like 'zfs-pool' or 'lvm'.
This resulted in unnecessary attempts to create a new 'deskrun-pool' which
failed with 'no space left on device' errors.

Changes:
- Second pass now checks ALL pools, not just 'default'
- Returns any non-btrfs pool found (zfs, lvm, dir, etc.)
- Only creates new dir pool if no suitable pools exist

This ensures existing working pools are reused before attempting to
create new ones.
Replace automatic storage pool detection/creation with a simple --storage-pool flag that defaults to 'local'. This change recognizes that:

1. IncusOS installations should follow incus-post-install.md bootstrap process
2. Users are responsible for configuring their storage pools correctly
3. The 'local' pool is standard on IncusOS with ZFS properly configured
4. Automatic pool creation is problematic with API-only remote access

Removed:
- ListStoragePools() - enumeration logic
- DetectStorageDriver() - driver detection logic
- CreateStoragePool() - pool creation logic
- EnsureGoodStoragePool() - complex selection/creation logic

Added:
- --storage-pool flag with default value 'local'
- Direct pool specification by users who need non-default pools

This simplifies the codebase and gives users explicit control over storage pool selection while providing sensible defaults for standard IncusOS setups.
Add IPv6 disable configuration to the deskrun.nix template to prevent
TCP connection delays caused by IPv6 Happy Eyeballs when IPv6 routing
is not properly configured.

This resolves connection timeouts where curl would wait for IPv6 SYN
retransmissions before falling back to IPv4, causing delays of 7-36 seconds
for HTTPS connections.

Changes:
- Set net.ipv6.conf.all.disable_ipv6 = 1
- Set net.ipv6.conf.default.disable_ipv6 = 1
- Set networking.enableIPv6 = false

This configuration will be applied to all new cluster-hosts created with
'deskrun cluster-host create' and can be reapplied to existing hosts with
'deskrun cluster-host configure'.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +33 to +39
if i < 4 {
// Wait a bit and try updating again
fmt.Println("Channel not ready yet, retrying...")
if _, err := m.Exec(ctx, containerName, "nix-channel", "--update"); err != nil {
return fmt.Errorf("failed to retry nix channel update: %w", err)
}
}
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry loop for channel verification doesn't include a delay between retries. This can lead to rapid repeated operations that may not give the system enough time to complete the channel update. Consider adding a small delay (e.g., time.Sleep(2 * time.Second)) between retry attempts.

Copilot uses AI. Check for mistakes.
Comment on lines +55 to +107
if !strings.Contains(configContent, "./deskrun.nix") {
lines := strings.Split(configContent, "\n")
var newLines []string
foundImports := false
insideImports := false
importIndent := ""

for i, line := range lines {
newLines = append(newLines, line)

if !foundImports && strings.Contains(line, "imports") {
foundImports = true
if strings.Contains(line, "[") {
insideImports = true
leadingSpaces := len(line) - len(strings.TrimLeft(line, " \t"))
importIndent = strings.Repeat(" ", leadingSpaces+2)

if strings.Contains(line, "];") || (strings.Contains(line, "]") && strings.Contains(line, ";")) {
insideImports = false
} else {
continue
}
}
} else if foundImports && insideImports {
if !strings.HasPrefix(strings.TrimSpace(line), "./") &&
!strings.HasPrefix(strings.TrimSpace(line), "<") &&
!strings.HasPrefix(strings.TrimSpace(line), "#") {
if strings.Contains(line, "]") {
newLines = append(newLines[:len(newLines)-1], importIndent+"./deskrun.nix", line)
insideImports = false
foundImports = true
continue
}
}
continue
}

if foundImports && !insideImports && i+1 < len(lines) {
nextLine := lines[i+1]
if strings.HasPrefix(strings.TrimSpace(nextLine), "[") {
insideImports = true
leadingSpaces := len(line) - len(strings.TrimLeft(line, " \t"))
importIndent = strings.Repeat(" ", leadingSpaces+2)
}
}
}

if !foundImports {
importLine := " imports = [ ./deskrun.nix ];"
configContent = strings.Replace(configContent, "{", "{\n"+importLine, 1)
} else {
configContent = strings.Join(newLines, "\n")
}
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration parsing logic in this function is quite complex and could benefit from refactoring into a separate helper function for better maintainability. The nested conditionals and state tracking (foundImports, insideImports, etc.) make it difficult to follow the logic. Consider extracting the import injection logic into a dedicated function like "injectImportIntoConfig".

Copilot uses AI. Check for mistakes.
name,
"-d", fmt.Sprintf("root,size=%s", diskSize),
"-n", "incusbr0",
"-c", "security.nesting=true",
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting security.privileged=true grants the container extensive system access and should be carefully considered from a security perspective. While necessary for Docker-in-container use cases, consider documenting this security implication in the code comments or ensuring users understand the security trade-offs in the documentation.

Suggested change
"-c", "security.nesting=true",
"-c", "security.nesting=true",
// WARNING: Enabling security.privileged grants the container extensive access to the host system.
// This is typically required only for nested workloads (e.g. Docker-in-container) and should be
// reviewed carefully against your security requirements before use.

Copilot uses AI. Check for mistakes.
}

fmt.Println("Configuring NixOS with Docker, Kind, and deskrun...")
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If NixOS configuration fails at line 139, the container is not cleaned up (unlike the previous error handlers at lines 128 and 134). This leaves an orphaned container without a corresponding config entry. Consider adding cleanup with "_ = incusMgr.DeleteContainer(ctx, name)" before returning the error to maintain consistency with other error paths.

Suggested change
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +122
package incus

import (
"context"
_ "embed"
"fmt"
"strings"
)

//go:embed templates/deskrun.nix
var deskrunNixTemplate string

func (m *Manager) ConfigureNixOS(ctx context.Context, containerName string) error {
// Update nix channels to ensure NIX_PATH is properly set up
fmt.Println("Updating nix channels...")
if _, err := m.Exec(ctx, containerName, "nix-channel", "--update"); err != nil {
return fmt.Errorf("failed to update nix channels: %w", err)
}

// Verify the channel was actually downloaded, with retries
var verifyOutput string
var err error
for i := 0; i < 5; i++ {
verifyOutput, err = m.Exec(ctx, containerName, "ls", "-la", "/nix/var/nix/profiles/per-user/root/channels/")
if err != nil {
return fmt.Errorf("failed to verify channels: %w", err)
}

if strings.Contains(verifyOutput, "nixos ->") {
break
}

if i < 4 {
// Wait a bit and try updating again
fmt.Println("Channel not ready yet, retrying...")
if _, err := m.Exec(ctx, containerName, "nix-channel", "--update"); err != nil {
return fmt.Errorf("failed to retry nix channel update: %w", err)
}
}
}

if !strings.Contains(verifyOutput, "nixos ->") {
return fmt.Errorf("nixos channel symlink not found after update")
}

if err := m.PushContent(ctx, containerName, deskrunNixTemplate, "/etc/nixos/deskrun.nix"); err != nil {
return fmt.Errorf("failed to push deskrun.nix: %w", err)
}

configContent, err := m.Exec(ctx, containerName, "cat", "/etc/nixos/configuration.nix")
if err != nil {
return fmt.Errorf("failed to read configuration.nix: %w", err)
}

if !strings.Contains(configContent, "./deskrun.nix") {
lines := strings.Split(configContent, "\n")
var newLines []string
foundImports := false
insideImports := false
importIndent := ""

for i, line := range lines {
newLines = append(newLines, line)

if !foundImports && strings.Contains(line, "imports") {
foundImports = true
if strings.Contains(line, "[") {
insideImports = true
leadingSpaces := len(line) - len(strings.TrimLeft(line, " \t"))
importIndent = strings.Repeat(" ", leadingSpaces+2)

if strings.Contains(line, "];") || (strings.Contains(line, "]") && strings.Contains(line, ";")) {
insideImports = false
} else {
continue
}
}
} else if foundImports && insideImports {
if !strings.HasPrefix(strings.TrimSpace(line), "./") &&
!strings.HasPrefix(strings.TrimSpace(line), "<") &&
!strings.HasPrefix(strings.TrimSpace(line), "#") {
if strings.Contains(line, "]") {
newLines = append(newLines[:len(newLines)-1], importIndent+"./deskrun.nix", line)
insideImports = false
foundImports = true
continue
}
}
continue
}

if foundImports && !insideImports && i+1 < len(lines) {
nextLine := lines[i+1]
if strings.HasPrefix(strings.TrimSpace(nextLine), "[") {
insideImports = true
leadingSpaces := len(line) - len(strings.TrimLeft(line, " \t"))
importIndent = strings.Repeat(" ", leadingSpaces+2)
}
}
}

if !foundImports {
importLine := " imports = [ ./deskrun.nix ];"
configContent = strings.Replace(configContent, "{", "{\n"+importLine, 1)
} else {
configContent = strings.Join(newLines, "\n")
}

if err := m.PushContent(ctx, containerName, configContent, "/etc/nixos/configuration.nix"); err != nil {
return fmt.Errorf("failed to update configuration.nix: %w", err)
}
}

fmt.Println("Running nixos-rebuild switch (this may take a few minutes)...")
// Run nixos-rebuild with NIX_PATH set to use the channels
nixPathCmd := "export NIX_PATH=\"nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixos:nixos-config=/etc/nixos/configuration.nix\" && nixos-rebuild switch"
if _, err := m.Exec(ctx, containerName, "bash", "-c", nixPathCmd); err != nil {
return fmt.Errorf("failed to run nixos-rebuild switch: %w", err)
}

return nil
}
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The NixOS configuration logic lacks test coverage. Given the complexity of the configuration file parsing and injection logic, consider adding tests to verify correct behavior for various configuration file formats and edge cases.

Copilot uses AI. Check for mistakes.
}

fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If pushing the config file fails at line 144, the container is not cleaned up, leaving an orphaned container. For consistency with earlier error paths, consider adding cleanup with "_ = incusMgr.DeleteContainer(ctx, name)" before returning the error.

Suggested change
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +298
package cmd

import (
"context"
"crypto/rand"
"encoding/hex"
"fmt"
"time"

"github.com/rkoster/deskrun/internal/config"
"github.com/rkoster/deskrun/internal/incus"
"github.com/rkoster/deskrun/pkg/types"
"github.com/spf13/cobra"
)

var (
clusterHostName string
clusterHostDiskSize string
clusterHostImage string
clusterHostStoragePool string
)

var clusterHostCmd = &cobra.Command{
Use: "cluster-host",
Short: "Manage remote Incus cluster hosts",
Long: `Manage remote Incus cluster hosts for running deskrun on dedicated infrastructure.

Cluster hosts are NixOS containers provisioned on Incus with Docker, Kind, and deskrun pre-installed.`,
}

var clusterHostCreateCmd = &cobra.Command{
Use: "create [--name <name>] [--disk <size>] [--image <image>]",
Short: "Create a new cluster host",
Long: `Create a new Incus container with NixOS pre-configured with Docker, Kind, and deskrun.

The container will be created on the current Incus remote (use 'incus remote switch' to change).

Examples:
# Create with auto-generated name
deskrun cluster-host create

# Create with custom name and disk size
deskrun cluster-host create --name my-host --disk 300GiB

# Create with specific NixOS image
deskrun cluster-host create --image images:nixos/25.11`,
RunE: runClusterHostCreate,
}

var clusterHostDeleteCmd = &cobra.Command{
Use: "delete <name>",
Short: "Delete a cluster host",
Long: `Delete a cluster host container and remove it from configuration.`,
Args: cobra.ExactArgs(1),
RunE: runClusterHostDelete,
}

var clusterHostListCmd = &cobra.Command{
Use: "list",
Short: "List all cluster hosts",
Long: `List all cluster hosts with their status and configuration.`,
RunE: runClusterHostList,
}

var clusterHostConfigureCmd = &cobra.Command{
Use: "configure <name>",
Short: "Re-configure a cluster host",
Long: `Re-apply NixOS configuration to a cluster host.

This is useful after deskrun updates or if the initial configuration failed.`,
Args: cobra.ExactArgs(1),
RunE: runClusterHostConfigure,
}

func init() {
clusterHostCreateCmd.Flags().StringVar(&clusterHostName, "name", "", "Container name (auto-generated if not specified)")
clusterHostCreateCmd.Flags().StringVar(&clusterHostDiskSize, "disk", "200GiB", "Root disk size")
clusterHostCreateCmd.Flags().StringVar(&clusterHostImage, "image", "images:nixos/25.11", "NixOS image to use")
clusterHostCreateCmd.Flags().StringVar(&clusterHostStoragePool, "storage-pool", "local", "Incus storage pool to use")

clusterHostCmd.AddCommand(clusterHostCreateCmd)
clusterHostCmd.AddCommand(clusterHostDeleteCmd)
clusterHostCmd.AddCommand(clusterHostListCmd)
clusterHostCmd.AddCommand(clusterHostConfigureCmd)
rootCmd.AddCommand(clusterHostCmd)
}

func runClusterHostCreate(cmd *cobra.Command, args []string) error {
name := clusterHostName
if name == "" {
randomBytes := make([]byte, 3)
if _, err := rand.Read(randomBytes); err != nil {
return fmt.Errorf("failed to generate random name: %w", err)
}
name = fmt.Sprintf("deskrun-%s", hex.EncodeToString(randomBytes))
}

configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

if _, err := configMgr.GetClusterHost(name); err == nil {
return fmt.Errorf("cluster host %s already exists in configuration", name)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer cancel()

exists, err := incusMgr.ContainerExists(ctx, name)
if err != nil {
return fmt.Errorf("failed to check if container exists: %w", err)
}
if exists {
return fmt.Errorf("container %s already exists", name)
}

fmt.Printf("Creating cluster host '%s'...\n", name)

fmt.Println("Launching NixOS container...")
if err := incusMgr.CreateContainer(ctx, name, clusterHostImage, clusterHostDiskSize, clusterHostStoragePool); err != nil {
return fmt.Errorf("failed to create container: %w", err)
}

fmt.Println("Waiting for container to start...")
if err := incusMgr.WaitForRunning(ctx, name, 2*time.Minute); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)
return fmt.Errorf("container failed to start: %w", err)
}

fmt.Println("Waiting for network connectivity...")
if err := incusMgr.WaitForNetwork(ctx, name, 2*time.Minute); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)
return fmt.Errorf("network failed to initialize: %w", err)
}

fmt.Println("Configuring NixOS with Docker, Kind, and deskrun...")
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}

fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
return fmt.Errorf("failed to push config file: %w", err)
}

host := &types.ClusterHost{
Name: name,
Image: clusterHostImage,
DiskSize: clusterHostDiskSize,
CreatedAt: time.Now().Format(time.RFC3339),
}

if err := configMgr.AddClusterHost(host); err != nil {
return fmt.Errorf("failed to save cluster host to config: %w", err)
}

fmt.Printf("\nCluster host '%s' created successfully\n\n", name)
fmt.Printf("To access: incus exec %s -- bash\n", name)
fmt.Printf("To run deskrun: incus exec %s -- deskrun --help\n", name)

return nil
}

func runClusterHostDelete(cmd *cobra.Command, args []string) error {
name := args[0]

configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

if _, err := configMgr.GetClusterHost(name); err != nil {
fmt.Printf("Warning: cluster host %s not found in configuration\n", name)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
defer cancel()

exists, err := incusMgr.ContainerExists(ctx, name)
if err != nil {
return fmt.Errorf("failed to check if container exists: %w", err)
}

if !exists {
fmt.Printf("Container %s does not exist\n", name)
} else {
fmt.Printf("Deleting cluster host '%s'...\n", name)
if err := incusMgr.DeleteContainer(ctx, name); err != nil {
return fmt.Errorf("failed to delete container: %w", err)
}
}

if err := configMgr.RemoveClusterHost(name); err != nil {
fmt.Printf("Note: %v\n", err)
}

fmt.Printf("Cluster host '%s' deleted successfully\n", name)
return nil
}

func runClusterHostList(cmd *cobra.Command, args []string) error {
configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

allContainers, err := incusMgr.ListContainers(ctx, "")
if err != nil {
return fmt.Errorf("failed to list containers: %w", err)
}

configuredHosts := make(map[string]bool)
for name := range configMgr.GetConfig().ClusterHosts {
configuredHosts[name] = true
}

var containers []incus.ContainerInfo
for _, container := range allContainers {
if configuredHosts[container.Name] {
containers = append(containers, container)
}
}

if len(containers) == 0 && len(configMgr.GetConfig().ClusterHosts) == 0 {
fmt.Println("No cluster hosts found")
return nil
}

fmt.Printf("%-20s %-10s %-20s %-10s %-20s\n", "NAME", "STATUS", "IMAGE", "DISK", "CREATED")
fmt.Println("--------------------------------------------------------------------------------------------")

for _, container := range containers {
host, err := configMgr.GetClusterHost(container.Name)
if err != nil {
fmt.Printf("%-20s %-10s %-20s %-10s %-20s\n",
container.Name,
container.Status,
"N/A",
"N/A",
"N/A")
continue
}

createdAt := host.CreatedAt
if t, err := time.Parse(time.RFC3339, host.CreatedAt); err == nil {
createdAt = t.Format("2006-01-02 15:04:05")
}

fmt.Printf("%-20s %-10s %-20s %-10s %-20s\n",
host.Name,
container.Status,
host.Image,
host.DiskSize,
createdAt)
}

return nil
}

func runClusterHostConfigure(cmd *cobra.Command, args []string) error {
name := args[0]

configMgr, err := config.NewManager()
if err != nil {
return fmt.Errorf("failed to load config: %w", err)
}

if _, err := configMgr.GetClusterHost(name); err != nil {
return fmt.Errorf("cluster host %s not found in configuration", name)
}

incusMgr := incus.NewManager()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer cancel()

exists, err := incusMgr.ContainerExists(ctx, name)
if err != nil {
return fmt.Errorf("failed to check if container exists: %w", err)
}
if !exists {
return fmt.Errorf("container %s does not exist", name)
}

fmt.Println("Applying NixOS configuration...")
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}

fmt.Println("Configuration applied successfully")
return nil
}
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new cluster-host command implementation lacks test coverage. Other commands in this package (e.g., add_test.go, status_test.go) have comprehensive tests. Consider adding tests for the cluster-host commands to maintain consistency with the existing test coverage pattern in the codebase.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +263
package incus

import (
"context"
"fmt"
"os/exec"
"strings"
"time"
)

type Manager struct{}

type ContainerInfo struct {
Name string
Status string
}

func NewManager() *Manager {
return &Manager{}
}

func (m *Manager) CreateContainer(ctx context.Context, name, image, diskSize, storagePool string) error {
if name == "" {
return fmt.Errorf("container name cannot be empty")
}
if strings.ContainsAny(name, " /\\:@#$%^&*()[]{}!?'\"<>,;|`~+=") {
return fmt.Errorf("container name contains invalid characters: %s", name)
}
if image == "" {
return fmt.Errorf("image cannot be empty")
}
if diskSize == "" {
return fmt.Errorf("disk size cannot be empty")
}
if !strings.HasSuffix(diskSize, "GiB") && !strings.HasSuffix(diskSize, "GB") &&
!strings.HasSuffix(diskSize, "MiB") && !strings.HasSuffix(diskSize, "MB") {
return fmt.Errorf("disk size must end with GiB, GB, MiB, or MB: %s", diskSize)
}

// Ensure the default bridge network exists
if err := m.ensureNetwork(ctx); err != nil {
return fmt.Errorf("failed to ensure network: %w", err)
}

args := []string{
"launch",
image,
name,
"-d", fmt.Sprintf("root,size=%s", diskSize),
"-n", "incusbr0",
"-c", "security.nesting=true",
"-c", "security.privileged=true",
}

// Add storage pool if specified
if storagePool != "" {
args = append(args, "-s", storagePool)
}

cmd := exec.CommandContext(ctx, "incus", args...)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to create container: %w (output: %s)", err, string(output))
}

// Add /dev/kmsg device for kubelet
addDeviceCmd := exec.CommandContext(ctx, "incus", "config", "device", "add", name, "kmsg", "unix-char", "source=/dev/kmsg", "path=/dev/kmsg")
output, err = addDeviceCmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to add kmsg device: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) ensureNetwork(ctx context.Context) error {
// Check if incusbr0 network exists
cmd := exec.CommandContext(ctx, "incus", "network", "show", "incusbr0")
if err := cmd.Run(); err == nil {
return nil // Network already exists
}

// Create the bridge network
cmd = exec.CommandContext(ctx, "incus", "network", "create", "incusbr0", "--type=bridge")
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to create network: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) DeleteContainer(ctx context.Context, name string) error {
running, err := m.isRunning(ctx, name)
if err != nil {
return err
}

if running {
stopCmd := exec.CommandContext(ctx, "incus", "stop", name, "--force")
if output, err := stopCmd.CombinedOutput(); err != nil {
return fmt.Errorf("failed to stop container: %w (output: %s)", err, string(output))
}
}

deleteCmd := exec.CommandContext(ctx, "incus", "delete", name)
output, err := deleteCmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to delete container: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) ContainerExists(ctx context.Context, name string) (bool, error) {
cmd := exec.CommandContext(ctx, "incus", "list", "--format=csv", "-c", "n")
output, err := cmd.CombinedOutput()
if err != nil {
return false, fmt.Errorf("failed to list containers: %w (output: %s)", err, string(output))
}

outputStr := strings.TrimSpace(string(output))
if outputStr == "" {
return false, nil
}

lines := strings.Split(outputStr, "\n")
for _, line := range lines {
if strings.TrimSpace(line) == name {
return true, nil
}
}

return false, nil
}

func (m *Manager) ListContainers(ctx context.Context, prefix string) ([]ContainerInfo, error) {
cmd := exec.CommandContext(ctx, "incus", "list", "--format=csv", "-c", "ns")
output, err := cmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("failed to list containers: %w (output: %s)", err, string(output))
}

var containers []ContainerInfo
lines := strings.Split(strings.TrimSpace(string(output)), "\n")
for _, line := range lines {
if line == "" {
continue
}

parts := strings.Split(line, ",")
if len(parts) < 2 {
continue
}

name := strings.TrimSpace(parts[0])
status := strings.TrimSpace(parts[1])

if prefix == "" || strings.HasPrefix(name, prefix) {
containers = append(containers, ContainerInfo{
Name: name,
Status: status,
})
}
}

return containers, nil
}

func (m *Manager) PushContent(ctx context.Context, container, content, remotePath string) error {
// Use stdin redirection to avoid shell injection vulnerabilities
cmd := exec.CommandContext(ctx, "incus", "exec", container, "--", "sh", "-c", `cat > "$1"`, "sh", remotePath)
cmd.Stdin = strings.NewReader(content)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to push content: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) Exec(ctx context.Context, container string, command ...string) (string, error) {
args := append([]string{"exec", container, "--"}, command...)
cmd := exec.CommandContext(ctx, "incus", args...)
output, err := cmd.CombinedOutput()
if err != nil {
return string(output), fmt.Errorf("failed to execute command: %w (output: %s)", err, string(output))
}

return string(output), nil
}

func (m *Manager) WaitForRunning(ctx context.Context, name string, timeout time.Duration) error {
deadline := time.Now().Add(timeout)

for time.Now().Before(deadline) {
running, err := m.isRunning(ctx, name)
if err != nil {
return err
}

if running {
return nil
}

select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(2 * time.Second):
}
}

return fmt.Errorf("timeout waiting for container to be running")
}

func (m *Manager) WaitForNetwork(ctx context.Context, name string, timeout time.Duration) error {
deadline := time.Now().Add(timeout)

for time.Now().Before(deadline) {
// Try to ping a well-known DNS server to check network connectivity
_, err := m.Exec(ctx, name, "timeout", "2", "ping", "-c", "1", "1.1.1.1")
if err == nil {
return nil
}

select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(2 * time.Second):
}
}

return fmt.Errorf("timeout waiting for network connectivity")
}

func (m *Manager) isRunning(ctx context.Context, name string) (bool, error) {
cmd := exec.CommandContext(ctx, "incus", "list", name, "--format=csv", "-c", "s")
output, err := cmd.CombinedOutput()
if err != nil {
return false, fmt.Errorf("failed to check container status: %w (output: %s)", err, string(output))
}

status := strings.TrimSpace(string(output))
return status == "RUNNING", nil
}

func (m *Manager) PushConfigFile(ctx context.Context, containerName, configPath string) error {
// Create .deskrun directory in container
if _, err := m.Exec(ctx, containerName, "mkdir", "-p", "/root/.deskrun"); err != nil {
return fmt.Errorf("failed to create config directory: %w", err)
}

// Use incus file push to copy the config file
cmd := exec.CommandContext(ctx, "incus", "file", "push", configPath, fmt.Sprintf("%s/root/.deskrun/config.json", containerName))
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to push config file: %w (output: %s)", err, string(output))
}

return nil
}


Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new incus package lacks test coverage. Consider adding unit tests for the Incus manager methods, particularly for container lifecycle operations, validation logic, and error handling scenarios.

Copilot uses AI. Check for mistakes.
Use: "cluster-host",
Short: "Manage remote Incus cluster hosts",
Long: `Manage remote Incus cluster hosts for running deskrun on dedicated infrastructure.

Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Long field contains inconsistent indentation with a leading tab followed by spaces. The first line starts with a tab character followed by spaces, which creates irregular indentation. Remove the leading tab on line 27 to maintain consistent formatting with other commands in the codebase.

Suggested change

Copilot uses AI. Check for mistakes.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +55 to +112
if !strings.Contains(configContent, "./deskrun.nix") {
lines := strings.Split(configContent, "\n")
var newLines []string
foundImports := false
insideImports := false
importIndent := ""

for i, line := range lines {
newLines = append(newLines, line)

if !foundImports && strings.Contains(line, "imports") {
foundImports = true
if strings.Contains(line, "[") {
insideImports = true
leadingSpaces := len(line) - len(strings.TrimLeft(line, " \t"))
importIndent = strings.Repeat(" ", leadingSpaces+2)

if strings.Contains(line, "];") || (strings.Contains(line, "]") && strings.Contains(line, ";")) {
insideImports = false
} else {
continue
}
}
} else if foundImports && insideImports {
if !strings.HasPrefix(strings.TrimSpace(line), "./") &&
!strings.HasPrefix(strings.TrimSpace(line), "<") &&
!strings.HasPrefix(strings.TrimSpace(line), "#") {
if strings.Contains(line, "]") {
newLines = append(newLines[:len(newLines)-1], importIndent+"./deskrun.nix", line)
insideImports = false
foundImports = true
continue
}
}
continue
}

if foundImports && !insideImports && i+1 < len(lines) {
nextLine := lines[i+1]
if strings.HasPrefix(strings.TrimSpace(nextLine), "[") {
insideImports = true
leadingSpaces := len(line) - len(strings.TrimLeft(line, " \t"))
importIndent = strings.Repeat(" ", leadingSpaces+2)
}
}
}

if !foundImports {
importLine := " imports = [ ./deskrun.nix ];"
configContent = strings.Replace(configContent, "{", "{\n"+importLine, 1)
} else {
configContent = strings.Join(newLines, "\n")
}

if err := m.PushContent(ctx, containerName, configContent, "/etc/nixos/configuration.nix"); err != nil {
return fmt.Errorf("failed to update configuration.nix: %w", err)
}
}
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import manipulation logic is complex and fragile, relying on string matching for "imports", "[", "]", "./", "<", and "#". This approach could fail with various NixOS configuration.nix file formats (e.g., multi-line imports, different whitespace, comments in imports). Consider using a more robust approach such as parsing the Nix configuration with a proper parser, or at minimum add comments explaining the expected format and edge cases this logic handles.

Copilot uses AI. Check for mistakes.
Comment on lines +23 to +40
for i := 0; i < 5; i++ {
verifyOutput, err = m.Exec(ctx, containerName, "ls", "-la", "/nix/var/nix/profiles/per-user/root/channels/")
if err != nil {
return fmt.Errorf("failed to verify channels: %w", err)
}

if strings.Contains(verifyOutput, "nixos ->") {
break
}

if i < 4 {
// Wait a bit and try updating again
fmt.Println("Channel not ready yet, retrying...")
if _, err := m.Exec(ctx, containerName, "nix-channel", "--update"); err != nil {
return fmt.Errorf("failed to retry nix channel update: %w", err)
}
}
}
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The retry logic does not include a delay between retries. Without a delay, all 5 retries could execute in rapid succession, potentially not giving the channel update enough time to complete. Consider adding a time.Sleep between retries to allow for proper channel propagation.

Copilot uses AI. Check for mistakes.
Comment on lines +262 to +263


Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two consecutive blank lines at the end of the file. Go style typically uses a single blank line at the end of files.

Suggested change

Copilot uses AI. Check for mistakes.
Comment on lines +139 to +144
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}

fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ConfigureNixOS fails after line 141, the container is not cleaned up, leaving behind a partially configured container. Consider wrapping ConfigureNixOS and PushConfigFile in a cleanup block similar to lines 128 and 134, so that if configuration fails, the container is deleted to avoid resource leaks.

Suggested change
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}
fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)
return fmt.Errorf("failed to configure NixOS: %w", err)
}
fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)

Copilot uses AI. Check for mistakes.
Comment on lines +139 to +144
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}

fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If PushConfigFile fails after line 146, the container is not cleaned up, leaving behind a partially configured container. Consider wrapping ConfigureNixOS and PushConfigFile in a cleanup block similar to lines 128 and 134, so that if configuration fails, the container is deleted to avoid resource leaks.

Suggested change
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
return fmt.Errorf("failed to configure NixOS: %w", err)
}
fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
if err := incusMgr.ConfigureNixOS(ctx, name); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)
return fmt.Errorf("failed to configure NixOS: %w", err)
}
fmt.Println("Copying deskrun configuration to cluster host...")
if err := incusMgr.PushConfigFile(ctx, name, configMgr.GetConfigPath()); err != nil {
_ = incusMgr.DeleteContainer(ctx, name)

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +263
package incus

import (
"context"
"fmt"
"os/exec"
"strings"
"time"
)

type Manager struct{}

type ContainerInfo struct {
Name string
Status string
}

func NewManager() *Manager {
return &Manager{}
}

func (m *Manager) CreateContainer(ctx context.Context, name, image, diskSize, storagePool string) error {
if name == "" {
return fmt.Errorf("container name cannot be empty")
}
if strings.ContainsAny(name, " /\\:@#$%^&*()[]{}!?'\"<>,;|`~+=") {
return fmt.Errorf("container name contains invalid characters: %s", name)
}
if image == "" {
return fmt.Errorf("image cannot be empty")
}
if diskSize == "" {
return fmt.Errorf("disk size cannot be empty")
}
if !strings.HasSuffix(diskSize, "GiB") && !strings.HasSuffix(diskSize, "GB") &&
!strings.HasSuffix(diskSize, "MiB") && !strings.HasSuffix(diskSize, "MB") {
return fmt.Errorf("disk size must end with GiB, GB, MiB, or MB: %s", diskSize)
}

// Ensure the default bridge network exists
if err := m.ensureNetwork(ctx); err != nil {
return fmt.Errorf("failed to ensure network: %w", err)
}

args := []string{
"launch",
image,
name,
"-d", fmt.Sprintf("root,size=%s", diskSize),
"-n", "incusbr0",
"-c", "security.nesting=true",
"-c", "security.privileged=true",
}

// Add storage pool if specified
if storagePool != "" {
args = append(args, "-s", storagePool)
}

cmd := exec.CommandContext(ctx, "incus", args...)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to create container: %w (output: %s)", err, string(output))
}

// Add /dev/kmsg device for kubelet
addDeviceCmd := exec.CommandContext(ctx, "incus", "config", "device", "add", name, "kmsg", "unix-char", "source=/dev/kmsg", "path=/dev/kmsg")
output, err = addDeviceCmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to add kmsg device: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) ensureNetwork(ctx context.Context) error {
// Check if incusbr0 network exists
cmd := exec.CommandContext(ctx, "incus", "network", "show", "incusbr0")
if err := cmd.Run(); err == nil {
return nil // Network already exists
}

// Create the bridge network
cmd = exec.CommandContext(ctx, "incus", "network", "create", "incusbr0", "--type=bridge")
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to create network: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) DeleteContainer(ctx context.Context, name string) error {
running, err := m.isRunning(ctx, name)
if err != nil {
return err
}

if running {
stopCmd := exec.CommandContext(ctx, "incus", "stop", name, "--force")
if output, err := stopCmd.CombinedOutput(); err != nil {
return fmt.Errorf("failed to stop container: %w (output: %s)", err, string(output))
}
}

deleteCmd := exec.CommandContext(ctx, "incus", "delete", name)
output, err := deleteCmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to delete container: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) ContainerExists(ctx context.Context, name string) (bool, error) {
cmd := exec.CommandContext(ctx, "incus", "list", "--format=csv", "-c", "n")
output, err := cmd.CombinedOutput()
if err != nil {
return false, fmt.Errorf("failed to list containers: %w (output: %s)", err, string(output))
}

outputStr := strings.TrimSpace(string(output))
if outputStr == "" {
return false, nil
}

lines := strings.Split(outputStr, "\n")
for _, line := range lines {
if strings.TrimSpace(line) == name {
return true, nil
}
}

return false, nil
}

func (m *Manager) ListContainers(ctx context.Context, prefix string) ([]ContainerInfo, error) {
cmd := exec.CommandContext(ctx, "incus", "list", "--format=csv", "-c", "ns")
output, err := cmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("failed to list containers: %w (output: %s)", err, string(output))
}

var containers []ContainerInfo
lines := strings.Split(strings.TrimSpace(string(output)), "\n")
for _, line := range lines {
if line == "" {
continue
}

parts := strings.Split(line, ",")
if len(parts) < 2 {
continue
}

name := strings.TrimSpace(parts[0])
status := strings.TrimSpace(parts[1])

if prefix == "" || strings.HasPrefix(name, prefix) {
containers = append(containers, ContainerInfo{
Name: name,
Status: status,
})
}
}

return containers, nil
}

func (m *Manager) PushContent(ctx context.Context, container, content, remotePath string) error {
// Use stdin redirection to avoid shell injection vulnerabilities
cmd := exec.CommandContext(ctx, "incus", "exec", container, "--", "sh", "-c", `cat > "$1"`, "sh", remotePath)
cmd.Stdin = strings.NewReader(content)
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to push content: %w (output: %s)", err, string(output))
}

return nil
}

func (m *Manager) Exec(ctx context.Context, container string, command ...string) (string, error) {
args := append([]string{"exec", container, "--"}, command...)
cmd := exec.CommandContext(ctx, "incus", args...)
output, err := cmd.CombinedOutput()
if err != nil {
return string(output), fmt.Errorf("failed to execute command: %w (output: %s)", err, string(output))
}

return string(output), nil
}

func (m *Manager) WaitForRunning(ctx context.Context, name string, timeout time.Duration) error {
deadline := time.Now().Add(timeout)

for time.Now().Before(deadline) {
running, err := m.isRunning(ctx, name)
if err != nil {
return err
}

if running {
return nil
}

select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(2 * time.Second):
}
}

return fmt.Errorf("timeout waiting for container to be running")
}

func (m *Manager) WaitForNetwork(ctx context.Context, name string, timeout time.Duration) error {
deadline := time.Now().Add(timeout)

for time.Now().Before(deadline) {
// Try to ping a well-known DNS server to check network connectivity
_, err := m.Exec(ctx, name, "timeout", "2", "ping", "-c", "1", "1.1.1.1")
if err == nil {
return nil
}

select {
case <-ctx.Done():
return ctx.Err()
case <-time.After(2 * time.Second):
}
}

return fmt.Errorf("timeout waiting for network connectivity")
}

func (m *Manager) isRunning(ctx context.Context, name string) (bool, error) {
cmd := exec.CommandContext(ctx, "incus", "list", name, "--format=csv", "-c", "s")
output, err := cmd.CombinedOutput()
if err != nil {
return false, fmt.Errorf("failed to check container status: %w (output: %s)", err, string(output))
}

status := strings.TrimSpace(string(output))
return status == "RUNNING", nil
}

func (m *Manager) PushConfigFile(ctx context.Context, containerName, configPath string) error {
// Create .deskrun directory in container
if _, err := m.Exec(ctx, containerName, "mkdir", "-p", "/root/.deskrun"); err != nil {
return fmt.Errorf("failed to create config directory: %w", err)
}

// Use incus file push to copy the config file
cmd := exec.CommandContext(ctx, "incus", "file", "push", configPath, fmt.Sprintf("%s/root/.deskrun/config.json", containerName))
output, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("failed to push config file: %w (output: %s)", err, string(output))
}

return nil
}


Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incus package lacks test coverage. Given that other packages like internal/runner and internal/config have comprehensive test coverage, this package should also include unit tests for its functions, especially for the complex NixOS configuration manipulation logic, container lifecycle operations, and error handling paths.

Copilot uses AI. Check for mistakes.
OpenCode and others added 2 commits January 9, 2026 15:33
- Add 3-second delay between NixOS channel verification retries
- Add container cleanup on ConfigureNixOS failure
- Add container cleanup on PushConfigFile failure
- Rebase branch against main

Addresses all critical review comments from Copilot to improve
robustness and resource cleanup on failures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add deskrun cluster-host command for provisioning Incus containers with NixOS

3 participants