This repository was archived by the owner on Jun 11, 2025. It is now read-only.

nxtcoder17 (Member) commented Jun 4, 2025

Summary by Sourcery

Implement Workspace and WorkMachine custom resources and controllers to enable dynamic provisioning of virtual workspaces and machines, refactor existing operators for improved consistency, and update CRD schemas and infrastructure utilities.

New Features:

  • Add WorkMachine CRD and controller with Terraform-backed lifecycle jobs for cloud VM provisioning.
  • Add Workspace CRD and controller to deploy user workspaces with SSH, TTYD, Jupyter, code-server, and VSCode support.
  • Register workspace and workmachine operators in the platform and agent operator entry points.

Bug Fixes:

  • Ignore deployment revision annotation in reconcile filter to reduce unnecessary reconciliations.
  • Improve YAMLClient ApplyYAML error handling to correctly distinguish not-found and other errors.

Enhancements:

  • Refactor router controller checklist, inline ingress creation, and simplify HTTPS/basic-auth flows.
  • Rename and streamline app controller router handling to require explicit routes.
  • Migrate YAML client to use go.pkgs/log, improve metadata preservation, and enhance error handling.
  • Update reconciler framework to use default logger, rename CheckWrapper, and remove redundant default-patching logic.
  • Convert Router CRD from domains list to Routes with host and service, and add nginxIngressAnnotations support.
  • Bump key dependencies: json-patch, k8s API to v0.32.1, controller-runtime to v0.20.2, operator toolkit and plugin versions.
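
The reconcile-filter fix listed under Bug Fixes can be illustrated with a small, standard-library-only sketch. It assumes the filter compares old and new annotations and treats `deployment.kubernetes.io/revision` (the key the Deployment controller rewrites on each rollout) as noise. The helper names `withoutIgnored` and `annotationsChanged` are hypothetical and do not come from the PR.

```go
package main

import (
	"fmt"
	"reflect"
)

// revisionAnnotation is the key the Deployment controller updates on every rollout.
const revisionAnnotation = "deployment.kubernetes.io/revision"

// withoutIgnored returns a copy of ann with every key in ignored removed.
// (Hypothetical helper, not taken from the PR.)
func withoutIgnored(ann map[string]string, ignored ...string) map[string]string {
	out := make(map[string]string, len(ann))
	for k, v := range ann {
		out[k] = v
	}
	for _, k := range ignored {
		delete(out, k)
	}
	return out
}

// annotationsChanged reports whether the two annotation sets differ
// once the revision annotation has been dropped from both.
func annotationsChanged(oldAnn, newAnn map[string]string) bool {
	return !reflect.DeepEqual(
		withoutIgnored(oldAnn, revisionAnnotation),
		withoutIgnored(newAnn, revisionAnnotation),
	)
}

func main() {
	oldAnn := map[string]string{revisionAnnotation: "1", "team": "infra"}
	newAnn := map[string]string{revisionAnnotation: "2", "team": "infra"}
	fmt.Println(annotationsChanged(oldAnn, newAnn)) // only the revision moved

	newAnn["team"] = "platform"
	fmt.Println(annotationsChanged(oldAnn, newAnn)) // a meaningful key moved
}
```

A filter built this way would skip reconciles triggered purely by a rollout bumping the revision, while still reacting to any other annotation change.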

@nxtcoder17 nxtcoder17 requested a review from karthik1729 as a code owner June 4, 2025 17:56
sourcery-ai bot commented Jun 4, 2025

Reviewer's Guide

This PR implements two new CRDs, Workspace and WorkMachine, complete with controllers, reconciler logic, templates and RBAC, while also refactoring the existing router controller and request/YAML client infrastructure to simplify checklist handling, modularize domain and ingress logic, standardize logging, and bump dependencies and CRD schemas.

Sequence Diagram for Updated Router Controller Reconciliation

sequenceDiagram
    participant RouterReconciler
    participant Request
    participant K8sAPI

    RouterReconciler->>Request: EnsureCheckList(["EnsuringHttpsCertsIfEnabled", "SettingUpBasicAuthIfEnabled", "CreateIngressResource"])
    Request-->>RouterReconciler: Proceed

    RouterReconciler->>RouterReconciler: reconBasicAuth(req)
    alt Basic Auth Enabled and SecretName not set
        RouterReconciler->>K8sAPI: Update Router CR (set .Spec.BasicAuth.SecretName)
        K8sAPI-->>RouterReconciler: Updated Router CR
    end
    alt Basic Auth Enabled
        RouterReconciler->>K8sAPI: CreateOrUpdate Secret (basic-auth)
        K8sAPI-->>RouterReconciler: Secret
        RouterReconciler->>Request: AddToOwnedResources(Secret)
    end

    RouterReconciler->>RouterReconciler: ensureIngresses(req)
    alt IngressClass not set in Spec
        RouterReconciler->>RouterReconciler: findIngressClass(req)
        RouterReconciler->>K8sAPI: List IngressClassList
        K8sAPI-->>RouterReconciler: IngressClassList (or error)
        RouterReconciler->>K8sAPI: Update Router CR (set .Spec.IngressClass)
        K8sAPI-->>RouterReconciler: Updated Router CR
    end
    alt HTTPS enabled and ClusterIssuer not set in Spec
        RouterReconciler->>RouterReconciler: findClusterIssuer(req)
        RouterReconciler->>K8sAPI: List ClusterIssuerList / Get ClusterIssuer
        K8sAPI-->>RouterReconciler: ClusterIssuer (or error)
        RouterReconciler->>K8sAPI: Update Router CR (set .Spec.Https.ClusterIssuer)
        K8sAPI-->>RouterReconciler: Updated Router CR
    end
    RouterReconciler->>RouterReconciler: groupHostsByKind(issuer, obj) # new helper
    RouterReconciler->>RouterReconciler: templates.ParseBytes(templateIngress, ...)
    RouterReconciler->>K8sAPI: ApplyYAML (Ingress)
    K8sAPI-->>RouterReconciler: Ingress ResourceRefs
    RouterReconciler->>Request: AddToOwnedResources(Ingress)

Sequence Diagram for WorkMachine Creation Flow

sequenceDiagram
    participant User/System
    participant WMR as WorkMachineReconciler
    participant K8sAPI
    participant LFCRD as LifecycleCRD
    participant IACJobPod
    participant CloudProviderAPI

    User/System->>K8sAPI: Create WorkMachine CR
    K8sAPI->>WMR: Reconcile WorkMachine
    WMR->>WMR: createWorkMachineCreationJob(req)
    WMR->>WMR: parseSpecIntoTFValues()
    WMR->>K8sAPI: CreateOrUpdate Lifecycle CR (IAC job spec)
    K8sAPI-->>WMR: Lifecycle CR
    note right of K8sAPI: Lifecycle Controller reconciles LFCRD and creates a Job that runs IACJobPod.
    IACJobPod->>IACJobPod: Run Terraform (apply)
    IACJobPod->>CloudProviderAPI: Provision Resources (e.g., EC2 instance)
    CloudProviderAPI-->>IACJobPod: Resources Provisioned
    IACJobPod->>K8sAPI: Update Lifecycle CR Status (Completed)
    K8sAPI->>WMR: Reconcile WorkMachine (Lifecycle CR completed)
    WMR->>WMR: createTargetNamespace(req)
    WMR->>K8sAPI: CreateOrUpdate Namespace
    WMR->>WMR: createSSHPublicKeysSecret(req)
    WMR->>K8sAPI: CreateOrUpdate Secret (with user + machine SSH keys)
    K8sAPI-->>WMR: Secret
    WMR->>K8sAPI: Update WorkMachine Status (IsReady=true)

Class Diagram for Updated RouterSpec and Route Types

classDiagram
    class RouterSpec {
        -IngressClass string
        -BackendProtocol *string
        -Https *Https
        -RateLimit *RateLimit
        -MaxBodySizeInMB *int
        -BasicAuth *BasicAuth
        -Cors *Cors
        +NginxIngressAnnotations map~string,string~
        +Routes []Route
    }
    class Route {
        +Host string
        +Service string
        +Path string
        +Port uint16
        +Rewrite bool
    }
    RouterSpec *-- "1..*" Route : contains
    note for RouterSpec "Removed 'Domains []string' field, added 'NginxIngressAnnotations'"
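
As a rough Go rendering of the diagram above, the sketch below declares `Route` with the listed fields and a trimmed `RouterSpec` holding only the two fields the diagram marks as added. Struct tags and the remaining fields are omitted because the diagram does not show them. `uniqueHosts` is a hypothetical helper, included only to show how hosts can be derived from `Routes` now that the separate `Domains` list is gone.

```go
package main

import "fmt"

// Route mirrors the fields listed in the class diagram.
type Route struct {
	Host    string
	Service string
	Path    string
	Port    uint16
	Rewrite bool
}

// RouterSpec is trimmed to the two fields the diagram marks as new;
// the real type carries more (IngressClass, Https, BasicAuth, ...).
type RouterSpec struct {
	NginxIngressAnnotations map[string]string
	Routes                  []Route
}

// uniqueHosts returns each distinct Route.Host in first-seen order.
// (Hypothetical helper, not taken from the PR.)
func uniqueHosts(spec RouterSpec) []string {
	seen := make(map[string]bool)
	var hosts []string
	for _, r := range spec.Routes {
		if !seen[r.Host] {
			seen[r.Host] = true
			hosts = append(hosts, r.Host)
		}
	}
	return hosts
}

func main() {
	spec := RouterSpec{
		NginxIngressAnnotations: map[string]string{
			"nginx.ingress.kubernetes.io/proxy-body-size": "10m",
		},
		Routes: []Route{
			{Host: "app.example.com", Service: "web", Path: "/", Port: 80},
			{Host: "app.example.com", Service: "api", Path: "/api", Port: 8080, Rewrite: true},
		},
	}
	fmt.Println(uniqueHosts(spec))
}
```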

Class Diagram for New WorkMachine CRD

classDiagram
    class WorkMachine {
        +Spec WorkMachineSpec
        +Status common_types.Status
    }
    class WorkMachineSpec {
        +SSHPublicKeys []string
        +JobParams WorkMachineJobParams
        +AWSMachineConfig *AWSMachineConfig
        +State crdsv1.WorkMachineState
        +TargetNamespace string
        GetCloudProvider() common_types.CloudProvider
    }
    class AWSMachineConfig {
        +AMI string
        +InstanceType string
        +RootVolumeSize int
        +RootVolumeType string
        +ExternalVolumeSize int
        +ExternalVolumeType string
        +IAMInstanceProfileRole *string
    }
    class WorkMachineJobParams {
        +NodeSelector map~string,string~
        +Tolerations []corev1.Toleration
    }
    WorkMachine *-- WorkMachineSpec
    WorkMachineSpec *-- WorkMachineJobParams
    WorkMachineSpec *-- AWSMachineConfig
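
The diagram lists a `GetCloudProvider()` method on `WorkMachineSpec` but not its body. One plausible reading, shown here strictly as an assumption, is that the provider is inferred from whichever provider-specific config pointer is set. The `CloudProvider` type, its constant values, and the trimmed structs below are local stand-ins, not the PR's `common_types` definitions.

```go
package main

import "fmt"

// CloudProvider and its values are stand-ins for common_types.CloudProvider;
// the actual constant names and strings are not shown in the PR conversation.
type CloudProvider string

const (
	CloudProviderAWS     CloudProvider = "aws"
	CloudProviderUnknown CloudProvider = "unknown"
)

// AWSMachineConfig is trimmed to a few of the fields from the diagram.
type AWSMachineConfig struct {
	AMI            string
	InstanceType   string
	RootVolumeSize int
}

// WorkMachineSpec is trimmed to the fields this sketch needs.
type WorkMachineSpec struct {
	SSHPublicKeys    []string
	AWSMachineConfig *AWSMachineConfig
	TargetNamespace  string
}

// GetCloudProvider infers the provider from the config block that is set.
// This body is an assumption about how such a method could work.
func (s *WorkMachineSpec) GetCloudProvider() CloudProvider {
	if s.AWSMachineConfig != nil {
		return CloudProviderAWS
	}
	return CloudProviderUnknown
}

func main() {
	spec := WorkMachineSpec{
		SSHPublicKeys:    []string{"ssh-ed25519 placeholder-key user@example"},
		AWSMachineConfig: &AWSMachineConfig{AMI: "ami-placeholder", InstanceType: "t3.medium", RootVolumeSize: 50},
		TargetNamespace:  "wm-example",
	}
	fmt.Println(spec.GetCloudProvider())
}
```

Keeping each provider config behind its own optional pointer means a second provider could be added later as another field without changing existing manifests.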

Class Diagram for New Workspace CRD

classDiagram
    class Workspace {
        +Spec WorkspaceSpec
        +Status common_types.Status
    }
    class WorkspaceSpec {
        +WorkMachine string
        +ServiceAccountName string
        +State crdsv1.WorkspaceState
        +EnableTTYD bool
        +EnableJupyterNotebook bool
        +EnableCodeServer bool
        +EnableVSCodeServer bool
        +ImagePullPolicy string
    }
    Workspace *-- WorkspaceSpec

Class Diagram for Modified RouterReconciler

classDiagram
  class RouterReconciler {
    +Reconcile(ctx context.Context, request ctrl.Request) (ctrl.Result, error)
    +findClusterIssuer(req *reconciler.Request~*crdsv1.Router~) (*certmanagerv1.ClusterIssuer, error)
    +findIngressClass(req *reconciler.Request~*crdsv1.Router~) (string, error)
    +groupHostsByKind(issuer *certmanagerv1.ClusterIssuer, obj *crdsv1.Router) (wildcardHosts []string, nonWildcardHosts []string)
    +reconBasicAuth(req *reconciler.Request~*crdsv1.Router~) stepResult.Result
    +ensureIngresses(req *reconciler.Request~*crdsv1.Router~) stepResult.Result
    - Yanked patchDefaults()
    - Commented EnsuringHttpsCerts()
  }
  note for RouterReconciler "Reconcile method logic changed, new checklist: CreateIngressResource"

Class Diagram for New WorkMachineReconciler

classDiagram
  class WorkMachineReconciler {
    +Client client.Client
    +Scheme *runtime.Scheme
    +Env *env.Env
    +YAMLClient kubectl.YAMLClient
    +Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)
    -finalize(req *rApi.Request~crdsv1.WorkMachine~) step_result.Result
    -parseSpecIntoTFValues(ctx context.Context, obj *crdsv1.WorkMachine) ([]byte, error)
    -createWorkMachineCreationJob(req *rApi.Request~crdsv1.WorkMachine~) step_result.Result
    -createTargetNamespace(req *rApi.Request~crdsv1.WorkMachine~) step_result.Result
    -createSSHPublicKeysSecret(req *rApi.Request~crdsv1.WorkMachine~) step_result.Result
    +SetupWithManager(mgr ctrl.Manager) error
  }

Class Diagram for New WorkspaceReconciler

classDiagram
  class WorkspaceReconciler {
    +Client client.Client
    +Scheme *runtime.Scheme
    +Env *env.Env
    +YAMLClient kubectl.YAMLClient
    +Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error)
    -createDeployment(req *rApi.Request~crdsv1.Workspace~) stepResult.Result
    -finalize(req *rApi.Request~crdsv1.Workspace~) stepResult.Result
    +SetupWithManager(mgr ctrl.Manager) error
  }

File-Level Changes

Change: Introduce Workspace and WorkMachine features
Details:
  • Add new CRDs, types and deepcopy methods
  • Implement controllers with reconcile steps, finalizers, owned resource cleanup
  • Provide embedded templates for deployments, services, ingress, and lifecycle jobs
  • Define environment config and registration into the operator
  • Scaffold RBAC roles, samples, kustomize manifests
Files:
  • operators/workspace/internal/...
  • operators/workmachine/internal/...
  • apis/crds/v1/{workspace,workmachine}_types.go
  • config/crd/bases/crds.kloudlite.io_{workspaces,workmachines}.yaml
  • config/samples/*
  • config/rbac/crds_*

Change: Refactor Router controller and checklist
Details:
  • Inline and simplify reconciliation checklist
  • Remove old patchDefaults and Https cert steps, add findClusterIssuer/findIngressClass utilities
  • Group hosts by wildcard vs non-wildcard in one helper
  • Merge reconBasicAuth and ensureIngresses around new CreateIngressResource step
  • Update template invocation and drop unused code
Files:
  • operators/routers/internal/router-controller/controller.go
  • operators/routers/internal/router-controller/helpers.go
  • operators/routers/internal/templates/*
  • config/crd/bases/crds.kloudlite.io_routers.yaml

Change: Migrate logging and request handling
Details:
  • Switch from slog to go.pkgs/log in request and YAML client
  • Rename checkWrapper to CheckWrapper and log caller frames
  • Use log.DefaultLogger().SkipFrames for contextual logging
  • Update ReconcileFilter to ignore extra annotations
  • Unify NewRunningCheck signature
Files:
  • toolkit/reconciler/request.go
  • toolkit/reconciler/checks.go
  • toolkit/kubectl/yaml-client.go
  • toolkit/reconciler/event-predicate.go

Change: Enhance YAML client and templating support
Details:
  • Improve ApplyYAML error paths and annotation/label merge logic
  • Change template loader from ReadIngressTemplate to generic Read with embed.FS
  • Remove deprecated slog logger fields
  • Adjust controller-runtime ApplyYAML usage arguments
Files:
  • toolkit/kubectl/yaml-client.go
  • operators/routers/internal/templates/embed.go
  • operators/routers/internal/templates/ingress-resource.yml.tpl

Change: Bump dependencies and update CRD schemas
Details:
  • Upgrade controller-runtime, k8s/api libs and json-patch
  • Remove obsolete fields (domains) and add nginxIngressAnnotations
  • Change Route schema keys from app to host/service
  • Regenerate deepcopy code for AWSMachineConfig and new types
Files:
  • go.mod
  • apis/crds/v1/zz_generated.deepcopy.go
  • config/crd/bases/*
  • config/samples/*


@sourcery-ai sourcery-ai bot left a comment

Hey @nxtcoder17 - I've reviewed your changes - here's some feedback:

  • There’s a lot of commented-out and duplicated logic in the router controller (e.g. commented EnsureHttpsCerts and inline patchDefaults removal) – please clean up or remove unused code blocks to keep the reconciler focused and maintainable.
  • Several EnsureCheckList calls and constants like CreateIngressResource are defined inline in Reconcile; extracting those lists and check names into top‐level variables or constants would reduce noise and improve readability.
  • In operators/workspace/main.go you never call your controller’s RegisterInto – the workspace controller isn’t actually registered with the manager, so reconciliation won’t run; please add the registration there.
Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good
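
The second piece of feedback above (lifting check names out of Reconcile) could look roughly like the sketch below. The three string values are the ones shown in the router sequence diagram; the identifier names `checkHTTPSCerts`, `checkBasicAuth`, `checkCreateIngress` and `routerCheckList` are hypothetical.

```go
package main

import "fmt"

// Check names for the router reconciler, declared once at package level.
// Values come from the sequence diagram; identifier names are illustrative.
const (
	checkHTTPSCerts    = "EnsuringHttpsCertsIfEnabled"
	checkBasicAuth     = "SettingUpBasicAuthIfEnabled"
	checkCreateIngress = "CreateIngressResource"
)

// routerCheckList is the ordered list a reconciler would hand to EnsureCheckList.
var routerCheckList = []string{
	checkHTTPSCerts,
	checkBasicAuth,
	checkCreateIngress,
}

func main() {
	for i, name := range routerCheckList {
		fmt.Printf("%d. %s\n", i+1, name)
	}
}
```

With the names declared once, each reconcile step can refer to the same constant when reporting its status, so a typo becomes a compile error rather than a silently unmatched check.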

}

if *oldRes.Status.IsReady != *newRes.Status.IsReady {
fireEvent(newObj, ReasonStatusIsReadyChanged, fmt.Sprintf("resource isReady changed from (%v) to (%v)", newRes.Status.IsReady, oldRes.Status.IsReady))
nitpick: Event predicate now logs the new value as old and vice versa.

Swap the order of newRes.Status.IsReady and oldRes.Status.IsReady in the log message to accurately reflect the value change.

Comment on lines 14 to 16
WorkspaceImageCodeServer string `env:"WORKSPACE_IMAGE_CODE_SERVER" default:"ghcr.io/kloudlite/iac/code-server:latest"`
WorkspcaeImageVscodeServer string `env:"WORKSPCAE_IMAGE_VSCODE_SERVER" default:"ghcr.io/kloudlite/iac/vscode-server:latest"`
}
issue (typo): Typo in field name and env tag for VSCode server image

Rename the field to 'WorkspaceImageVscodeServer' and update the env tag to 'WORKSPACE_IMAGE_VSCODE_SERVER'.

Suggested change
- WorkspaceImageCodeServer string `env:"WORKSPACE_IMAGE_CODE_SERVER" default:"ghcr.io/kloudlite/iac/code-server:latest"`
- WorkspcaeImageVscodeServer string `env:"WORKSPCAE_IMAGE_VSCODE_SERVER" default:"ghcr.io/kloudlite/iac/vscode-server:latest"`
- }
+ WorkspaceImageCodeServer string `env:"WORKSPACE_IMAGE_CODE_SERVER" default:"ghcr.io/kloudlite/iac/code-server:latest"`
+ WorkspaceImageVscodeServer string `env:"WORKSPACE_IMAGE_VSCODE_SERVER" default:"ghcr.io/kloudlite/iac/vscode-server:latest"`
+ }

number: {{$route.Port}}

path: {{ if not hasPrefix "/" $route.Path }}/{{end}}{{$route.Path}}
({{if hasPrefix "/" $route.Path }}{{substr 1 $x $route.Path}}{{else}}{{$route.Path}}{{end}}.*)
issue (bug_risk): Invalid YAML line in ingress template

The parentheses line is not valid YAML. Please remove it or incorporate its logic into a valid field such as path or regex.

Member Author

The ingress resource v2 path is not actually used.

Thanks for the context. Since that “(…).*” line isn’t actually used and, as Sourcery flagged, isn’t valid YAML, the simplest fix is to drop it entirely. If we later decide we need regex-style matching:

  1. Add the appropriate nginx ingress annotation (e.g. nginx.ingress.kubernetes.io/use-regex: "true").
  2. Move the regex into the path field itself (K8s’ native Prefix pathType won’t accept the .* syntax).

But if we don’t need regex today, let’s just remove the parentheses line and keep the template strictly valid YAML. Let me know if you’d rather implement the regex route now or defer it.
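
For the path line that stays, here is a small standard-library sketch of the "add a leading slash if missing" logic. It assumes the template is rendered with Go's text/template and that `hasPrefix` takes the prefix first, as sprig's function of that name does; the `hasPrefix` defined here is a local stand-in. Note that text/template needs parentheses around a nested call, so the condition is written as `not (hasPrefix "/" .Path)`.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// pathTpl prepends "/" only when the route path does not already start with one.
const pathTpl = `{{ if not (hasPrefix "/" .Path) }}/{{ end }}{{ .Path }}`

// Route carries just the field this sketch needs.
type Route struct{ Path string }

// renderPath executes pathTpl for a single route.
func renderPath(r Route) (string, error) {
	funcs := template.FuncMap{
		// Stand-in for sprig's hasPrefix: the prefix is the first argument.
		"hasPrefix": func(prefix, s string) bool { return strings.HasPrefix(s, prefix) },
	}
	t, err := template.New("path").Funcs(funcs).Parse(pathTpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, r); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	for _, p := range []string{"api", "/api"} {
		out, err := renderPath(Route{Path: p})
		fmt.Println(out, err)
	}
}
```

Rendering the template in a unit test like this is one way to catch malformed output before it reaches ApplyYAML.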


func (jr *JobTracker) HasJobFinished() bool {
for _, v := range jr.job.Status.Conditions {
if v.Type == batchv1.JobComplete && v.Status == "True" {
suggestion: Use constant for condition status

Use corev1.ConditionTrue (or metav1.ConditionTrue) instead of the string literal to improve clarity and reduce the risk of typos.
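
A dependency-free sketch of that suggestion follows. The types below are local stand-ins that mirror `corev1.ConditionStatus`, `corev1.ConditionTrue` ("True"), `batchv1.JobConditionType` and `batchv1.JobComplete` ("Complete") so the example runs without the Kubernetes modules; in the real code you would import them instead. As far as I can tell, `batchv1.JobCondition.Status` is typed as `corev1.ConditionStatus`, so `corev1.ConditionTrue` is the constant that matches the field's type.

```go
package main

import "fmt"

// Stand-ins mirroring the k8s.io/api types; swap for the real imports in practice.
type ConditionStatus string

const ConditionTrue ConditionStatus = "True" // mirrors corev1.ConditionTrue

type JobConditionType string

const JobComplete JobConditionType = "Complete" // mirrors batchv1.JobComplete

// JobCondition keeps only the two fields the check reads.
type JobCondition struct {
	Type   JobConditionType
	Status ConditionStatus
}

// hasJobFinished reports whether any condition marks the job as complete,
// comparing against typed constants rather than a bare "True" literal.
func hasJobFinished(conds []JobCondition) bool {
	for _, c := range conds {
		if c.Type == JobComplete && c.Status == ConditionTrue {
			return true
		}
	}
	return false
}

func main() {
	conds := []JobCondition{{Type: JobComplete, Status: ConditionTrue}}
	fmt.Println(hasJobFinished(conds))
}
```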

ann[rApi.LastAppliedKey] = string(b)

// Check if the resource exists
cobj, err := resourceClient.Get(ctx, obj.GetName(), metav1.GetOptions{})
issue (complexity): Consider refactoring the resource apply logic to use early exits for errors and a helper function for metadata merging.

Here’s one way to collapse that tangled if err logic into a straightforward “get → create or update” flow and pull your annotation/label merging out into a small helper. You’ll end up with:

  1. A single early‐exit for IsNotFound
  2. A single early‐exit for any other err
  3. A clear “update” block that just calls your helper and then Update
for _, obj := range objs {
    // … marshal JSON and set LastApplied in ann …

    cobj, err := resourceClient.Get(ctx, obj.GetName(), metav1.GetOptions{})
    if apiErrors.IsNotFound(err) {
        // create
        obj.SetAnnotations(ann)
        obj.SetLabels(labels)
        if _, err := resourceClient.Create(ctx, &obj, metav1.CreateOptions{}); err != nil {
            return resources, errors.NewEf(err, "resource: %s/%s", obj.GetNamespace(), obj.GetName())
        }
        logger.Info("created resource")
        continue
    }
    if err != nil {
        return nil, err
    }

    // update
    mergeMeta(&obj, cobj, ann, labels)
    obj.SetAnnotations(ann)
    obj.SetLabels(labels)
    if _, err := resourceClient.Update(ctx, &obj, metav1.UpdateOptions{}); err != nil {
        return resources, errors.NewEf(err, "resource: %s/%s", obj.GetNamespace(), obj.GetName())
    }
    logger.Info("updated resource")
}

And the helper could live just above:

// mergeMeta munges annotations+labels from the existing object and the last-applied snapshot
func mergeMeta(obj, existing *unstructured.Unstructured, ann, labels map[string]string) {
    prev, ok := existing.GetAnnotations()[rApi.LastAppliedKey]
    if ok && prev == ann[rApi.LastAppliedKey] {
        return // nothing to do
    }

    // unmarshal the previous-applied to pick up its ann/labels
    var prevObj unstructured.Unstructured
    if err := json.Unmarshal([]byte(prev), &prevObj); err != nil {
        return // or log if you prefer
    }

    // preserve any keys on existing that neither the new nor the prev-applied annotations had
    for k, v := range existing.GetAnnotations() {
        if !fn.MapHasKey(ann, k) && !fn.MapHasKey(prevObj.GetAnnotations(), k) {
            ann[k] = v
        }
    }
    // same for labels
    for k, v := range existing.GetLabels() {
        if !fn.MapHasKey(labels, k) && !fn.MapHasKey(prevObj.GetLabels(), k) {
            labels[k] = v
        }
    }

    // preserve kubernetes metadata fields (uid, resourceVersion…)
    obj.Object["metadata"] = existing.Object["metadata"]
}

This:

  • Flattens your error‐handling to two simple branches
  • Removes the dual err+cobj tracking
  • Pushes the verbose merge logic into one short, focused function
  • Leaves behavior exactly the same.
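
The preservation rule inside the suggested helper can be checked in isolation on plain maps. The sketch below is not the PR's code: `mergePreserved` is a hypothetical name, and it works on `map[string]string` rather than unstructured objects. A key found on the live object is kept only when neither the new manifest nor the previously applied one mentions it, which is how labels or annotations added by other controllers survive an update while keys the user deleted are dropped.

```go
package main

import "fmt"

// mergePreserved returns desired plus every key from existing that appears
// in neither desired nor prevApplied (that is, keys added by someone else).
func mergePreserved(desired, prevApplied, existing map[string]string) map[string]string {
	out := make(map[string]string, len(desired))
	for k, v := range desired {
		out[k] = v
	}
	for k, v := range existing {
		_, inDesired := desired[k]
		_, inPrev := prevApplied[k]
		if !inDesired && !inPrev {
			out[k] = v
		}
	}
	return out
}

func main() {
	desired := map[string]string{"app": "router-v2"}
	prevApplied := map[string]string{"app": "router-v1", "tier": "edge"}
	existing := map[string]string{"app": "router-v1", "tier": "edge", "injected-by": "other-controller"}

	// "tier" was applied before and is now absent from desired, so it is dropped;
	// "injected-by" was never ours, so it is preserved.
	fmt.Println(mergePreserved(desired, prevApplied, existing))
}
```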

@nxtcoder17 nxtcoder17 merged commit 51f2913 into release-v1.1.6 Jun 4, 2025
14 checks passed
@nxtcoder17 nxtcoder17 deleted the feat/workspace-and-workmachine branch June 4, 2025 19:01