Skip to content

weller setup friction log. #263

@patflynn

Description

@patflynn

This was a rough install, effectively hitting every classic NixOS "chicken-and-egg" problem plus a critical hardware failure.

Here is a Friction Log of the entire process, broken down by failure point, followed by a Proposed "Golden Path" Workflow to fix this for future machines.

The Friction Log: "Weller" Install (Feb 2026)

Phase Failure Root Cause The Workaround Used
1. Disk Formatting Drive Hang / Corruption Hardware: Seagate FireCuda 510 firmware crashes when Linux uses APST power saving. Manual Kernel Patch: Forced nvme_core.default_ps_max_latency_us=0 via sysfs before install.
2. Installation Secret Decryption Failed Bootstrap Paradox: Agenix tried to decrypt secrets for weller using a host SSH key that didn't exist yet. Ignored Errors: Installation completed with broken secrets (red text warnings).
3. First Boot Locked Out (Root & User) Config Trap: mutableUsers = false reset the root password to "null" because no secret was available. Recovery Mode: Booted installer, mounted disk, edited users.nix to mutableUsers = true, reset password manually.
4. User Login Authentication Failed Rekeying Gap: The repo didn't know weller's new SSH key, so user-password.age couldn't be decrypted. The "Dance": Manually copied key to laptop, updated secrets.nix, rekeyed, pushed, pulled on Weller.
5. Password File Hash Mismatch Format Error: hashedPasswordFile received a plaintext password instead of a SHA-512 hash. Manual Override: Ran passwd patrick as root to bypass the immutable file system temporarily.
6. GPU Drivers Black Screen / No Signal Config Gap: Nvidia proprietary drivers require modesetting.enable = true on modern setups, which was missing. TTY Rescue: Switched to TTY, edited config blindly to add modesetting.

The Fix: A Better Bootstrap Protocol

To avoid this pain next time, I recommend restructuring your repo and documentation to support a "Two-Stage Install."

1. The "Bootstrap" Configuration

Add a special bootstrap module to your flake. This config is designed only to get the machine online and accessible.

  • modules/bootstrap.nix:
    • Sets users.mutableUsers = true (allows passwd to work).
    • Disables agenix / secrets (prevents decryption errors).
    • Enables sshd with a known public key (your laptop's).
    • Includes the generic hardware fix: boot.kernelParams = [ "nvme_core.default_ps_max_latency_us=0" ]; (Safety net for NVMe drives).

2. The New Workflow (Draft for Docs)

Step 1: The "Key Gen" (Pre-Install)

Instead of letting the installer generate the key effectively "too late," generate it first in the live environment.

Bash

# In the Live ISO/Netboot terminal:
sudo ssh-keygen -t ed25519 -f /tmp/ssh_host_ed25519_key -N ""
cat /tmp/ssh_host_ed25519_key.pub

  • Action: Copy this key to secrets/keys.nix on your laptop, rekey, and push BEFORE you even start the install.

Step 2: The Install (With Keys)

Copy that generated key to the target mount so the new OS uses it immediately.

Bash

# Mount partitions
sudo mount /dev/disk/by-label/nixos /mnt

# Manually place the key
sudo mkdir -p /mnt/etc/ssh
sudo cp /tmp/ssh_host_ed25519_key /mnt/etc/ssh/
sudo cp /tmp/ssh_host_ed25519_key.pub /mnt/etc/ssh/

# Install
nixos-install --flake .#weller

Step 3: First Boot (Success)

Because you placed the key before the install, Agenix will successfully decrypt user-password.age on the very first boot. No lockouts, no manual passwd, no rekeying dance required later.

Next Step

Would you like me to draft the modules/bootstrap.nix file for your repo so you have it ready for the next machine?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions