@goldberl goldberl commented Feb 10, 2026

Proposed Commit Message

fix (GCE): wait for NIC before fetching metadata

On C3-metal GCE instances, cloud-init runs before the network
interface is ready, causing datasource detection to fail and
preventing SSH access.

This change adds a polling loop to the metadata fetch that waits
for a usable NIC and a reachable metadata service. This fixes boot
issues on GCE instances without breaking existing unit tests.

Fixes GH-6737

Additional Context
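
For readers who want the shape of the approach without opening the diff, here is a minimal sketch of a NIC polling loop. It is not the code in this patch: `list_candidate_nics`, `wait_for_nics`, the 30 second timeout and the 0.5 second interval are all hypothetical names and values chosen for illustration, and the sketch assumes that "usable NIC" can be approximated by any non-loopback entry under `/sys/class/net`.

```python
import os
import time


def list_candidate_nics(sys_class_net="/sys/class/net"):
    """Return sorted interface names, excluding loopback (illustrative helper)."""
    try:
        names = os.listdir(sys_class_net)
    except FileNotFoundError:
        # Directory not present yet (or not Linux): treat as "no NICs".
        return []
    return sorted(name for name in names if name != "lo")


def wait_for_nics(find_nics=list_candidate_nics, timeout=30.0, interval=0.5,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll find_nics() until it returns a non-empty list or timeout elapses.

    Returns (nics, elapsed_ms). nics is an empty list if the deadline passed.
    The clock and sleep parameters exist only so the loop can be unit tested.
    """
    start = clock()
    while True:
        nics = find_nics()
        elapsed = clock() - start
        if nics:
            return nics, int(elapsed * 1000)
        if elapsed >= timeout:
            return [], int(elapsed * 1000)
        sleep(interval)
```

Injecting the clock and sleep functions keeps the loop testable without real delays, which is one way such a change can avoid slowing down an existing unit test suite.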

Test Steps

  1. Launch a C3 instance on Google Cloud
gcloud compute instances create goldberl-c3-test-01 \
  --project=ubuntu-os-support \
  --zone=europe-west1-c \
  --machine-type=c3-standard-192-metal \
  --maintenance-policy=TERMINATE \
  --provisioning-model=STANDARD \
  --image=projects/ubuntu-os-support/global/images/melissa-questing-serial-eu \
  --boot-disk-size=10GB \
  --boot-disk-auto-delete
  2. Wait several minutes until the machine is ready, then run cloud-init status --long
    You will see something like this:
ubuntu@goldberl-c3-test-01:~$ cloud-init status --long
status: error
extended_status: error - done
boot_status_code: enabled-by-generator
last_update: Thu, 01 Jan 1970 00:00:28 +0000
detail: Cloud-init enabled by systemd cloud-init-generator
errors:
	- No instance datasource found.
	- Can not apply stage config, no datasource found! Likely bad things to come!
	- Can not apply stage final, no datasource found! Likely bad things to come!
recoverable_errors:
WARNING:
	- Getting data from <class 'cloudinit.sources.DataSourceGCE.DataSourceGCELocal'> failed
	- No instance datasource found! Likely bad things to come!
	- required key instance-id returned nothing. not GCE
	- Getting data from <class 'cloudinit.sources.DataSourceGCE.DataSourceGCELocal'> failed
	- Can not apply stage config, no datasource found! Likely bad things to come!
	- Can not apply stage final, no datasource found! Likely bad things to come!
  3. Apply this patch to /usr/lib/python3/dist-packages/cloudinit/sources/DataSourceGCE.py
  4. Clean cloud-init logs with sudo cloud-init clean --logs --config all, then reboot the machine
  5. SSH into the machine after the reboot and check the status again
ubuntu@goldberl-c3-test-01:~$ cloud-init status --long
status: done
extended_status: degraded done
boot_status_code: enabled-by-generator
last_update: Thu, 01 Jan 1970 00:00:42 +0000
detail: DataSourceGCELocal
errors: []
recoverable_errors:
WARNING:
	- Publishing host keys failed!
	- Publishing host keys failed!

You can confirm that the patch ran with the following commands:

sudo grep "No NICs yet" /var/log/cloud-init.log
sudo grep "Eligible NICs found" /var/log/cloud-init.log

You will see something like this:

2026-02-10 21:22:46,296 - DataSourceGCE.py[DEBUG]: No NICs yet, waiting for udev/network...
2026-02-10 21:22:53,098 - DataSourceGCE.py[DEBUG]: Eligible NICs found after 6802 ms: ['enp5s0f0']

DataSourceGCELocal is properly detected this time.

We still see warnings about publishing host keys failing, but this may be a separate issue.
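
If you would rather extract the wait time programmatically than read the grep output, a small parser along these lines may help. It is only a sketch: `parse_nic_wait` is a hypothetical helper, and the regular expression assumes the debug message keeps the exact wording shown in the sample log lines above.

```python
import re

# Matches e.g. "Eligible NICs found after 6802 ms: ['enp5s0f0']"
NIC_LINE = re.compile(r"Eligible NICs found after (\d+) ms: \[(.*)\]")


def parse_nic_wait(log_text):
    """Return (wait_ms, nic_names) from the first matching line, else None."""
    for line in log_text.splitlines():
        match = NIC_LINE.search(line)
        if match:
            names = [
                part.strip().strip("'\"")
                for part in match.group(2).split(",")
                if part.strip()
            ]
            return int(match.group(1)), names
    return None
```

For example, passing the contents of /var/log/cloud-init.log from the run above should yield a wait of 6802 ms and the interface name enp5s0f0.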

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)
