Description
What happened?
EFS mounts fail on EKS 1.34 with Bottlerocket nodes. The efs-proxy component starts but immediately panics when trying to bind to localhost, causing mount failures with DeadlineExceeded errors.
When manually running efs-proxy inside the CSI node container, it crashes with:
thread 'main' (127) panicked at src/controller.rs:89:13:
Failed to bind 127.0.0.1:20381
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
The mount.log shows:
2026-01-14 16:12:01 UTC - INFO - Starting efs-proxy: "/sbin/efs-proxy /var/run/efs/stunnel-config... --tls"
2026-01-14 16:12:01 UTC - INFO - Started efs-proxy, pid: 26
2026-01-14 16:12:01 UTC - WARNING - Error connecting to 127.0.0.1:20241, [Errno 111] Connection refused
2026-01-14 16:12:16 UTC - ERROR - Mounting ... failed due to timeout after 15 sec
What you expected to happen?
EFS should mount successfully. The efs-proxy should be able to bind to localhost and proxy NFS traffic over TLS.
How to reproduce it (as minimally and precisely as possible)?
- Deploy EKS 1.34 cluster with Bottlerocket nodes (tested with Bottlerocket OS 1.52.0)
- Install EFS CSI driver v2.2.0 as EKS add-on
- Create EFS with mount targets in node subnets
- Create StorageClass, PVC, and Pod:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-XXXXXXXXX
  directoryPerms: "755"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: efs-test
spec:
  containers:
    - name: app
      image: amazonlinux:2023
      command: ["/bin/sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: efs
          mountPath: /data
  volumes:
    - name: efs
      persistentVolumeClaim:
        claimName: efs-pvc
- Pod stays in ContainerCreating with FailedMount events
Anything else we need to know?
- Plain NFS mount works (without TLS/efs-proxy):
  mount -t nfs4 -o nfsvers=4.1 fs-xxx.efs.region.amazonaws.com:/ /mnt/efs
- TLS connection to EFS works (tested with openssl s_client)
- The CSI node container is properly configured:
  - privileged: true
  - hostNetwork: true
  - Loopback interface exists and has traffic
- The generated stunnel config includes socket = a:SO_BINDTODEVICE=lo, which may have issues in containerized environments
- Tested with both encrypted (KMS) and unencrypted EFS - same failure
- Also reproduced on Amazon Linux 2023 EC2 instance with efs-utils v2.4.1
- Warning in logs:
  Could not start amazon-efs-mount-watchdog, unrecognized init system "aws-efs-csi-dri"
Environment
- Kubernetes version (use kubectl version): v1.34 (EKS platform eks.9)
- Driver version: v2.2.0-eksbuild.1 (EKS Add-on)
- Node OS: Bottlerocket OS 1.52.0 (aws-k8s-1.34)
- Kernel: 6.12.58
- Container Runtime: containerd://2.1.5+bottlerocket
- Region: eu-central-1
Please also attach debug logs to help us better diagnose
EFS CSI Node Pod logs:
I0114 16:09:14.552835 1 config_dir.go:88] Creating symlink from '/etc/amazon/efs' to '/var/amazon/efs'
I0114 16:09:14.568524 1 driver.go:131] Registering Node Server
I0114 16:09:14.568583 1 driver.go:133] Registering Controller Server
I0114 16:09:14.568623 1 driver.go:136] Starting efs-utils watchdog
I0114 16:09:14.569367 1 driver.go:151] Listening for connections on address: &net.UnixAddr{Name:"/csi/csi.sock", Net:"unix"}
I0114 16:12:00.975226 1 mount_linux.go:285] Detected OS without systemd
Mount log from inside container (/var/log/amazon/efs/mount.log):
2026-01-14 16:12:01 UTC - INFO - version=2.4.1 options={'rw': None, 'accesspoint': 'fsap-xxx', 'tls': None}
2026-01-14 16:12:01 UTC - INFO - binding 20241
2026-01-14 16:12:01 UTC - WARNING - Could not start amazon-efs-mount-watchdog, unrecognized init system "aws-efs-csi-dri"
2026-01-14 16:12:01 UTC - INFO - Starting efs-proxy: "/sbin/efs-proxy /var/run/efs/stunnel-config.fs-xxx... --tls"
2026-01-14 16:12:01 UTC - INFO - Started efs-proxy, pid: 26
2026-01-14 16:12:01 UTC - WARNING - Error connecting to 127.0.0.1:20241, [Errno 111] Connection refused
2026-01-14 16:12:01 UTC - INFO - Executing: "/sbin/mount.nfs4 127.0.0.1:/ /var/lib/kubelet/pods/.../mount -o rw,nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport,port=20241" with 15 sec time limit.
2026-01-14 16:12:16 UTC - ERROR - Mounting fs-xxx.efs.eu-central-1.amazonaws.com to ... failed due to timeout after 15 sec, mount attempt 1/3
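For anyone triaging a longer mount.log, the WARNING and ERROR entries can be pulled out with a small script. This is only a sketch: the line layout (`<timestamp> UTC - <LEVEL> - <message>`) is inferred from the excerpt above, and `parse_mount_log` and `problems` are hypothetical helper names, not part of efs-utils.

```python
import re
from typing import List, NamedTuple

# Layout inferred from the excerpt above: "<date> <time> UTC - <LEVEL> - <message>".
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC - (\w+) - (.*)$")


class Entry(NamedTuple):
    timestamp: str
    level: str
    message: str


def parse_mount_log(text: str) -> List[Entry]:
    """Return one Entry per line that matches the assumed layout; skip the rest."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append(Entry(*match.groups()))
    return entries


def problems(entries: List[Entry]) -> List[Entry]:
    """Keep only the WARNING and ERROR entries."""
    return [e for e in entries if e.level in ("WARNING", "ERROR")]
```

Usage would be along the lines of `problems(parse_mount_log(open("/var/log/amazon/efs/mount.log").read()))` from inside the CSI node container.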
Manual efs-proxy execution (from inside CSI node container):
$ /sbin/efs-proxy "/var/run/efs/stunnel-config.fs-xxx..." --tls
thread 'main' (127) panicked at src/controller.rs:89:13:
Failed to bind 127.0.0.1:20381
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
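To check whether the bind failure is specific to efs-proxy or affects any process in the container, a plain socket probe could be run against the same address. The sketch below is an assumption about what is useful to test, not a reproduction of what efs-proxy does internally; `try_bind` is a hypothetical helper, and the optional `bind_device` argument mirrors the `SO_BINDTODEVICE=lo` line in the stunnel config (Linux-only, and it may need elevated privileges).

```python
import errno
import socket
from typing import Optional, Tuple

# Linux value of SO_BINDTODEVICE, used only if this Python build lacks the constant.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)


def try_bind(host: str, port: int, bind_device: Optional[str] = None) -> Tuple[bool, str]:
    """Try to bind and listen on host:port; return (ok, detail) instead of raising."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Mirrors "socket = l:SO_REUSEADDR=yes" from the generated config.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if bind_device is not None:
            # Assumption: restrict the socket to one interface, as the config requests.
            sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, bind_device.encode())
        sock.bind((host, port))
        sock.listen(1)
        return True, "bound {}:{}".format(host, sock.getsockname()[1])
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "unknown")
        return False, "{}: {}".format(name, exc.strerror)
    finally:
        sock.close()
```

Running `try_bind("127.0.0.1", 20381)` and then `try_bind("127.0.0.1", 20381, bind_device="lo")` would show whether the port itself or the device binding is what gets rejected, and with which errno.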
Stunnel config content:
fips = no
foreground = yes
socket = l:SO_REUSEADDR=yes
socket = a:SO_BINDTODEVICE=lo
pid = /var/run/efs/.../stunnel.pid
[efs]
client = yes
accept = 127.0.0.1:20381
connect = fs-xxx.efs.eu-central-1.amazonaws.com:2049
sslVersion = TLSv1.2
renegotiation = no
TIMEOUTbusy = 20
TIMEOUTclose = 0
TIMEOUTidle = 70
delay = yes
verify = 2
CAfile = /etc/amazon/efs/efs-utils.crt
cert = /var/run/efs/.../certificate.pem
key = /etc/amazon/efs/privateKey.pem
checkHost = fs-xxx.efs.eu-central-1.amazonaws.com
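One thing worth cross-checking is the `accept` port in this config against the port used in mount.log: the excerpts above show 20241 in mount.log and 20381 in the config and the manual run, and it is not stated whether they come from the same mount attempt. A small parser makes that comparison easy. This is a sketch only; `parse_stunnel_config` is a hypothetical helper, and skipping lines that start with `;` or `#` is an assumption about the comment syntax.

```python
from typing import Dict, List, Optional, Tuple


def parse_stunnel_config(text: str) -> Tuple[Dict[str, List[str]], Dict[str, Dict[str, str]]]:
    """Split a stunnel-style config into global options and per-service options.

    Global keys map to a list because options such as "socket" may repeat.
    """
    global_opts: Dict[str, List[str]] = {}
    services: Dict[str, Dict[str, str]] = {}
    current: Optional[str] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith((";", "#")):
            continue  # blank line or (assumed) comment
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # start of a service section such as [efs]
            services[current] = {}
            continue
        # Split on the first "=" only, so "l:SO_REUSEADDR=yes" stays intact.
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if current is None:
            global_opts.setdefault(key, []).append(value)
        else:
            services[current][key] = value
    return global_opts, services
```

With the config above, `services["efs"]["accept"]` gives the address efs-proxy is asked to bind, and `global_opts["socket"]` lists the socket options, including the `SO_BINDTODEVICE` entry.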