Description
Issue: kube_scheduler_args with --config causes K3s API server crash
Summary
Using kube_scheduler_args with --config=/path/to/scheduler-config.yaml causes the K3s API server to crash immediately after installation, resulting in "connection refused" errors.
Environment
- hetzner-k3s version: 2.4.3
- K3s version: v1.31.4+k3s1
- Instance type: cpx11 (Hetzner Cloud)
- OS: Debian 12
- CNI: Cilium
Configuration
additional_pre_k3s_commands:
  - apt update && apt upgrade -y
  - |
    mkdir -p /etc/rancher/k3s
    cat > /etc/rancher/k3s/scheduler-config.yaml << 'SCHED_EOF'
    apiVersion: kubescheduler.config.k8s.io/v1
    kind: KubeSchedulerConfiguration
    profiles:
      - schedulerName: default-scheduler
        plugins:
          score:
            enabled:
              - name: NodeResourcesFit
                weight: 1
        pluginConfig:
          - name: NodeResourcesFit
            args:
              scoringStrategy:
                type: MostAllocated
                resources:
                  - name: cpu
                    weight: 1
                  - name: memory
                    weight: 1
    SCHED_EOF
kube_scheduler_args:
  - --config=/etc/rancher/k3s/scheduler-config.yaml
Expected Behavior
K3s should start with the custom scheduler configuration, enabling bin packing (MostAllocated scoring strategy) for workload placement.
Actual Behavior
- K3s installs successfully
- Master validation passes: "Master validation successful"
- K3s API server becomes unreachable immediately after
- Cilium installation fails with connection refused errors
Error Output
[Instance binpack-test-master1] k3s installation completed successfully
[Instance binpack-test-master1] Waiting for the control plane to be ready...
[Control plane] Generating the kubeconfig file to ~/.kube/binpack-test-config...
[Control plane] Switched to context "binpack-test-master1".
[Control plane] ...kubeconfig file generated as ~/.kube/binpack-test-config.
[Instance binpack-test-master1] Validating master setup...
[Master Validation] ✅ Master validation successful
[CNI] Installing Cilium...
[CNI] E1214 23:30:57.039949 29893 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://lb-ip:6443/api?timeout=32s\": dial tcp lb-ip:6443: connect: connection refused - error from a previous attempt: read tcp internal-ip:52105->lb-ip:6443: read: connection reset by peer" logger="UnhandledError"
[CNI] Error: could not get apiVersions from Kubernetes: could not get apiVersions from Kubernetes: Get "https://lb-ip:6443/api?timeout=32s": dial tcp lb-ip:6443: connect: connection refused
Root Cause Analysis
K3s runs all control plane components (API server, scheduler, controller-manager) in a single process. When the scheduler fails to start due to the custom configuration, it appears to crash the entire K3s process.
This could be because:
- K3s's embedded scheduler doesn't support the --config flag the same way as standalone kube-scheduler
- The scheduler configuration file path isn't accessible at the time K3s tries to read it
- There's a version mismatch between the KubeSchedulerConfiguration API version and what K3s expects
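The second and third hypotheses can be partly checked on the master node after the failure. The following is a minimal sketch, not part of hetzner-k3s or K3s: the function name inspect_scheduler_config is hypothetical, and the check only shows the state of the file after installation, not at the moment K3s first reads it.

```python
import os
import re


def inspect_scheduler_config(path):
    """Report whether the config file exists and which apiVersion it declares."""
    if not os.path.isfile(path):
        return {"exists": False, "api_version": None}
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    # Look for a top-level "apiVersion: <value>" line without needing a YAML parser.
    match = re.search(r"^apiVersion:\s*(\S+)", text, flags=re.MULTILINE)
    return {"exists": True, "api_version": match.group(1) if match else None}


if __name__ == "__main__":
    # Path taken from the configuration in this report.
    print(inspect_scheduler_config("/etc/rancher/k3s/scheduler-config.yaml"))
```

If the file is missing, the heredoc in additional_pre_k3s_commands did not run as intended. If it exists with the expected apiVersion, the first hypothesis becomes more likely.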
Workaround
Currently none. Users needing bin packing must implement it at the application level or use external tools.
Possible Solutions
- Documentation: Document that kube_scheduler_args with --config is not supported for K3s
- Validation: Add validation to warn users when they try to use --config with kube_scheduler_args
- Native support: If K3s supports scheduler profiles via a different mechanism, expose that through hetzner-k3s configuration
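The validation idea could look roughly like the sketch below. It is written in Python purely for illustration; the function name scheduler_arg_warnings and the warning text are hypothetical and are not existing hetzner-k3s code.

```python
def scheduler_arg_warnings(kube_scheduler_args):
    """Return one warning per scheduler argument that sets the config flag."""
    warnings = []
    for arg in kube_scheduler_args:
        # Normalise "--config=/path" and "config=/path" to the bare flag name.
        flag = arg.strip().lstrip("-").split("=", 1)[0]
        if flag == "config":
            warnings.append(
                f"kube_scheduler_args contains '{arg}': passing a scheduler "
                "config file has been reported to make the K3s API server "
                "unreachable"
            )
    return warnings


if __name__ == "__main__":
    args = ["--config=/etc/rancher/k3s/scheduler-config.yaml"]
    for warning in scheduler_arg_warnings(args):
        print(warning)
```

Returning warnings instead of raising an error keeps the check advisory, which matches the "warn users" wording of the proposal.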
Reproduction Steps
- Create a minimal cluster config with the kube_scheduler_args and additional_pre_k3s_commands shown above
- Run hetzner-k3s create --config <config-file>
- Observe that cluster creation fails during CNI installation with connection refused errors
Additional Context
The goal was to enable bin packing strategy to consolidate workloads onto fewer nodes for cost optimization on Hetzner Cloud.