feat: kube_scheduler_args #694

@leblancmeneses

Issue: kube_scheduler_args with --config causes K3s API server crash

Summary

Using kube_scheduler_args with --config=/path/to/scheduler-config.yaml causes the K3s API server to crash immediately after installation, resulting in "connection refused" errors.

Environment

  • hetzner-k3s version: 2.4.3
  • K3s version: v1.31.4+k3s1
  • Instance type: cpx11 (Hetzner Cloud)
  • OS: Debian 12
  • CNI: Cilium

Configuration

additional_pre_k3s_commands:
  - apt update && apt upgrade -y
  - |
    mkdir -p /etc/rancher/k3s
    cat > /etc/rancher/k3s/scheduler-config.yaml << 'SCHED_EOF'
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    profiles:
      - schedulerName: default-scheduler
        plugins:
          score:
            enabled:
              - name: NodeResourcesFit
                weight: 1
        pluginConfig:
          - name: NodeResourcesFit
            args:
              scoringStrategy:
                type: MostAllocated
                resources:
                  - name: cpu
                    weight: 1
                  - name: memory
                    weight: 1
    SCHED_EOF

kube_scheduler_args:
  - --config=/etc/rancher/k3s/scheduler-config.yaml

Expected Behavior

K3s should start with the custom scheduler configuration, enabling bin packing (MostAllocated scoring strategy) for workload placement.

Actual Behavior

  1. K3s installs successfully
  2. Master validation passes: "Master validation successful"
  3. K3s API server becomes unreachable immediately after
  4. Cilium installation fails with connection refused errors

Error Output

[Instance binpack-test-master1] k3s installation completed successfully
[Instance binpack-test-master1] Waiting for the control plane to be ready...
[Control plane] Generating the kubeconfig file to ~/.kube/binpack-test-config...
[Control plane] Switched to context "binpack-test-master1".
[Control plane] ...kubeconfig file generated as ~/.kube/binpack-test-config.
[Instance binpack-test-master1] Validating master setup...
[Master Validation] ✅ Master validation successful
[CNI] Installing Cilium...
[CNI] E1214 23:30:57.039949   29893 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://lb-ip:6443/api?timeout=32s\": dial tcp lb-ip:6443: connect: connection refused - error from a previous attempt: read tcp internal-ip:52105->lb-ip:6443: read: connection reset by peer" logger="UnhandledError"
[CNI] Error: could not get apiVersions from Kubernetes: could not get apiVersions from Kubernetes: Get "https://lb-ip:6443/api?timeout=32s": dial tcp lb-ip:6443: connect: connection refused

Root Cause Analysis

K3s runs all control plane components (API server, scheduler, controller-manager) in a single process. When the scheduler fails to start with the custom configuration, it appears to take the entire K3s process down with it, which would explain why the API server stops answering on port 6443.

This could be because:

  1. K3s's embedded scheduler doesn't support the --config flag the same way as standalone kube-scheduler
  2. The scheduler configuration file path isn't accessible at the time K3s tries to read it
  3. There's a version mismatch between the KubeSchedulerConfiguration API version and what K3s expects
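
The hypotheses above could be narrowed down on the master node with a few checks. The following is a minimal sketch, not part of hetzner-k3s: the function names are hypothetical, and it assumes K3s runs as a systemd unit named k3s. The clientConnection check rests on an additional assumption that the report does not raise: upstream kube-scheduler documents that its --kubeconfig flag is ignored when --config is given, so a config file without a clientConnection section may leave the embedded scheduler without API credentials.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a scheduler config on a K3s master.

# check_scheduler_config PATH
# Prints one line per finding; returns non-zero if the file is not readable.
check_scheduler_config() {
    cfg="$1"

    # Hypothesis 2: the file may not exist or be readable when K3s starts.
    if [ ! -r "$cfg" ]; then
        echo "unreadable: $cfg"
        return 1
    fi
    echo "readable: $cfg"

    # Hypothesis 3: show which API version the file declares.
    grep -E '^[[:space:]]*apiVersion:' "$cfg" || echo "apiVersion: missing"

    # Assumption (not from the report): without clientConnection the
    # scheduler may have no kubeconfig to reach the API server.
    if grep -q '^[[:space:]]*clientConnection:' "$cfg"; then
        echo "clientConnection: present"
    else
        echo "clientConnection: missing"
    fi
}

# show_scheduler_errors
# Assumes the systemd unit is named "k3s"; prints recent log lines
# that mention the scheduler.
show_scheduler_errors() {
    journalctl -u k3s --no-pager -n 500 | grep -i 'scheduler'
}
```

On the master, running check_scheduler_config /etc/rancher/k3s/scheduler-config.yaml followed by show_scheduler_errors should indicate whether the file was in place and what the scheduler logged before the process exited.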

Workaround

Currently none. Users needing bin packing must implement it at the application level or use external tools.

Possible Solutions

  1. Documentation: Document that kube_scheduler_args with --config is not supported for K3s
  2. Validation: Add validation to warn users when they try to use --config with kube_scheduler_args
  3. Native support: If K3s supports scheduler profiles via a different mechanism, expose that through hetzner-k3s configuration
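
The validation idea in option 2 can be illustrated with a small pre-flight check. This is only a sketch in shell for illustration, not hetzner-k3s's actual validation code; the function name and the warning text are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not hetzner-k3s's actual validation code.

# warn_on_scheduler_config ARG...
# Prints a warning to stderr and returns 1 if any argument is
# "--config" or starts with "--config=". Returns 0 otherwise.
warn_on_scheduler_config() {
    for arg in "$@"; do
        case "$arg" in
            --config|--config=*)
                echo "warning: kube_scheduler_args contains '$arg'; this has been reported to make the K3s API server unreachable" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```

Called with the values from kube_scheduler_args, for example warn_on_scheduler_config --config=/etc/rancher/k3s/scheduler-config.yaml, it would flag the configuration shown in this report before any instance is created.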

Reproduction Steps

  1. Create a minimal cluster config with the kube_scheduler_args and additional_pre_k3s_commands shown above
  2. Run hetzner-k3s create --config <config-file>
  3. Observe that cluster creation fails during CNI installation with connection refused errors

Additional Context

The goal was to enable a bin packing strategy that consolidates workloads onto fewer nodes, reducing cost on Hetzner Cloud.
