diff --git a/DOCS.md b/DOCS.md
index b6ed833..ecf4c6c 100644
--- a/DOCS.md
+++ b/DOCS.md
@@ -7,7 +7,7 @@ Complete guide to this NixOS configuration. Start here for navigation.
 **New to this config?** Start here:
 1. Read [README.md](README.md) - Project overview and quick start
 2. Check [FLEET-MANAGEMENT.md](FLEET-MANAGEMENT.md) - How to deploy systems
-3. Review [docs/SECURITY.md](docs/SECURITY.md) - Security baseline
+3. Review [docs/security/SECURITY.md](docs/security/SECURITY.md) - Security baseline
 
 ## πŸ“š Core Documentation
 
@@ -19,7 +19,7 @@ Complete guide to this NixOS configuration. Start here for navigation.
 | [FLEET-MANAGEMENT.md](FLEET-MANAGEMENT.md) | Deploy and manage multiple NixOS systems | Initial deployment, routine updates |
 | [docs/BOOTSTRAP.md](docs/BOOTSTRAP.md) | Bootstrap new NixOS systems from scratch | Installing NixOS on new hardware |
 | [systems/cortex/AI-SERVICES.md](systems/cortex/AI-SERVICES.md) | AI/LLM infrastructure on Cortex (Ollama, NVIDIA) | Using AI services, GPU troubleshooting |
-| [docs/SECURITY.md](docs/SECURITY.md) | Security configuration (fail2ban, auditd, SSH) | Hardening systems, security audit |
+| [docs/security/SECURITY.md](docs/security/SECURITY.md) | Security configuration (fail2ban, auditd, SSH) | Hardening systems, security audit |
 | [SECRETS.md](SECRETS.md) | Complete secrets management guide with sops-nix | Managing passwords, API keys, certificates |
 
 ## πŸ“‹ Planning & Implementation
@@ -29,7 +29,7 @@ Complete guide to this NixOS configuration. Start here for navigation.
 | [docs/PROJECT-OVERVIEW.md](docs/PROJECT-OVERVIEW.md) | Architecture, philosophy, system details | Understanding design decisions |
 | [docs/ROADMAP.md](docs/ROADMAP.md) | Implementation roadmap and timeline | Planning next features |
 | [docs/IMPLEMENTATION-GUIDE.md](docs/IMPLEMENTATION-GUIDE.md) | Step-by-step implementation guides | Adding new features |
-| [docs/TODO-CHECKLIST.md](docs/TODO-CHECKLIST.md) | Progress tracking and status | Tracking what's done |
+| [docs/planning/TODO-CHECKLIST.md](docs/planning/TODO-CHECKLIST.md) | Progress tracking and status | Tracking what's done |
 
 ## πŸ› οΈ Troubleshooting & Reference
 
@@ -95,11 +95,18 @@ When updating documentation:
 β”‚   β”œβ”€β”€ PROJECT-OVERVIEW.md       # Architecture & philosophy
 β”‚   β”œβ”€β”€ ROADMAP.md                # Implementation timeline
 β”‚   β”œβ”€β”€ IMPLEMENTATION-GUIDE.md   # Step-by-step guides
-β”‚   β”œβ”€β”€ TODO-CHECKLIST.md         # Progress tracking
 β”‚   β”œβ”€β”€ ARCHITECTURE.md           # Module system docs
-β”‚   β”œβ”€β”€ SECURITY.md               # Security configuration
 β”‚   β”œβ”€β”€ BOOTSTRAP.md              # Bootstrap new systems
 β”‚   β”œβ”€β”€ FLEET-FUTURE.md           # Future fleet tools (Colmena)
+β”‚   β”œβ”€β”€ planning/                 # Planning & tracking docs
+β”‚   β”‚   β”œβ”€β”€ TODO-CHECKLIST.md     # Progress tracking
+β”‚   β”‚   β”œβ”€β”€ CLEANUP-PRD.md        # Cleanup project details
+β”‚   β”‚   └── CLEANUP-CHECKLIST.md  # Cleanup tasks
+β”‚   β”œβ”€β”€ security/                 # Security documentation
+β”‚   β”‚   β”œβ”€β”€ SECURITY.md           # Security configuration
+β”‚   β”‚   β”œβ”€β”€ SECURITY-ROADMAP.md   # Security roadmap
+β”‚   β”‚   β”œβ”€β”€ SECURITY-SCANNING.md  # Secret scanning
+β”‚   β”‚   └── CORTEX-SECURITY.md    # Cortex hardening
 β”‚   β”œβ”€β”€ troubleshooting/          # Specific issue guides
 β”‚   └── blog/                     # Learning journey posts
 └── systems/
diff --git a/ISSUES.md b/ISSUES.md
index 31aed45..b97d942 100644
--- a/ISSUES.md
+++ b/ISSUES.md
@@ -143,7 +143,7 @@
   - Options: `nixos-rebuild build-vm`, `machinectl`, Proxmox VMs
   - Related: `machines` NixOS feature, `systemd-nspawn`
   - Tags: `testing`, `vms`, `dev-environment`
-  - See: [TODO-CHECKLIST.md](./docs/TODO-CHECKLIST.md) - Testing & Validation section
+  - See: [TODO-CHECKLIST.md](./docs/planning/TODO-CHECKLIST.md) - Testing & Validation section
 
 ### Security Tooling
 
@@ -162,7 +162,7 @@
   - Impact: Deep historical secret detection in git history
   - Solution: Available in devenv (v3.90.9), convenience script created
   - Usage: `./scripts/security-scan.sh [quick|full|history]`
-  - Documentation: `docs/SECURITY-SCANNING.md`
+  - Documentation: `docs/security/SECURITY-SCANNING.md`
   - Location: `devenv.nix`, `scripts/security-scan.sh`
   - Tags: `security`, `git`, `secrets`
 
@@ -255,7 +255,7 @@
   - Solution: Pre-commit hooks active, convenience script created, comprehensive documentation
   - Tools: git-secrets (pre-commit) + TruffleHog v3.90.9 (manual/CI)
   - Scripts: `scripts/security-scan.sh`, `scripts/git-hooks/pre-commit`
-  - Documentation: `docs/SECURITY-SCANNING.md`
+  - Documentation: `docs/security/SECURITY-SCANNING.md`
   - Impact: Prevents accidental secret commits and enables deep historical scanning
 
 ---
 
@@ -283,10 +283,10 @@
 ### Related Documentation
 
-- [TODO-CHECKLIST.md](./docs/TODO-CHECKLIST.md) - Implementation roadmap
+- [TODO-CHECKLIST.md](./docs/planning/TODO-CHECKLIST.md) - Implementation roadmap
 - [ARCHITECTURE.md](./docs/ARCHITECTURE.md) - System architecture
 - [FLEET-MANAGEMENT.md](./FLEET-MANAGEMENT.md) - Deployment workflows
-- [SECURITY.md](./docs/SECURITY.md) - Security configuration
+- [SECURITY.md](./docs/security/SECURITY.md) - Security configuration
 
 ---
 
@@ -294,4 +294,4 @@
 - Review and update this file weekly
 - Move stale items to backlog or archive
 - Keep critical issues visible at the top
-- Cross-reference with TODO-CHECKLIST.md for implementation tasks
+- Cross-reference with docs/planning/TODO-CHECKLIST.md for implementation tasks
diff --git a/README.md b/README.md
index 6004932..9836a8c 100644
--- a/README.md
+++ b/README.md
@@ -16,7 +16,7 @@ Personal NixOS configuration with **unified feature modules** based on the dendr
 - **[docs/BOOTSTRAP.md](docs/BOOTSTRAP.md)** - Bootstrap new NixOS systems
 
 **Security & Secrets:**
-- **[docs/SECURITY.md](docs/SECURITY.md)** - Security configuration
+- **[docs/security/SECURITY.md](docs/security/SECURITY.md)** - Security configuration
 - **[SECRETS.md](SECRETS.md)** - Secrets management (sops-nix + age)
 
 ## πŸš€ Quick Start
@@ -257,5 +257,5 @@ Some extensions are already configured but commented out. To enable them:
 ## πŸ“‹ Notes
 
 - Dotfiles are managed using Home Manager with live-updating symlinks
-- Monitor configuration is stored in `monitors.json`
+- Monitor configuration is stored in `systems/orion/monitors.json`
 - Additional notes in `notes.txt`
diff --git a/build.log b/build.log
deleted file mode 100644
index e8bc64e..0000000
--- a/build.log
+++ /dev/null
@@ -1,25 +0,0 @@
-building the system configuration...
-Using saved setting for 'extra-substituters = https://hyprland.cachix.org https://nix-community.cachix.org' from ~/.local/share/nix/trusted-settings.json.
-Using saved setting for 'extra-trusted-public-keys = hyprland.cachix.org-1:a7pgxzMz7+chwVL3/pzj6jIBMioiJM7ypFP8PwtkuGc= nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs=' from ~/.local/share/nix/trusted-settings.json.
-these 10 derivations will be built:
-  /nix/store/gfhh61hnr2yajiny6nxrbh19nri9h00n-home-manager-path.drv
-  /nix/store/y55al34a1wwra9348v308pxa3x4jg8ic-hm_fontconfigconf.d10hmfonts.conf.drv
-  /nix/store/mxz5cmr606lchxdvfy58b9d1g2y559hw-home-manager-files.drv
-  /nix/store/fkwid9qcwi43q62025bbld529h9rmhv1-home-manager-generation.drv
-  /nix/store/j9afvaasplwbhw784jl4hsimc9fqmika-unit-home-manager-syg.service.drv
-  /nix/store/jrafwrykf7b6g3bklm8yay5qbfgk762d-system-units.drv
-  /nix/store/sfhg2qrfv7xjlxv0c7l120638km8s0vq-user-environment.drv
-  /nix/store/kljcjj82jqsgzcr9a3wg4qgg1a0g52hk-etc.drv
-  /nix/store/kj56mxfg6ldx6ynd2qrii0kf3kgin0kz-activate.drv
-  /nix/store/s9ykajjx13xq7awjxab1sx113b6rvhsz-nixos-system-orion-26.05.20260116.e4bae1b.drv
-building '/nix/store/gfhh61hnr2yajiny6nxrbh19nri9h00n-home-manager-path.drv'...
-building '/nix/store/y55al34a1wwra9348v308pxa3x4jg8ic-hm_fontconfigconf.d10hmfonts.conf.drv'...
-building '/nix/store/sfhg2qrfv7xjlxv0c7l120638km8s0vq-user-environment.drv'...
-building '/nix/store/mxz5cmr606lchxdvfy58b9d1g2y559hw-home-manager-files.drv'...
-building '/nix/store/fkwid9qcwi43q62025bbld529h9rmhv1-home-manager-generation.drv'...
-building '/nix/store/j9afvaasplwbhw784jl4hsimc9fqmika-unit-home-manager-syg.service.drv'...
-building '/nix/store/jrafwrykf7b6g3bklm8yay5qbfgk762d-system-units.drv'...
-building '/nix/store/kljcjj82jqsgzcr9a3wg4qgg1a0g52hk-etc.drv'...
-building '/nix/store/kj56mxfg6ldx6ynd2qrii0kf3kgin0kz-activate.drv'...
-building '/nix/store/s9ykajjx13xq7awjxab1sx113b6rvhsz-nixos-system-orion-26.05.20260116.e4bae1b.drv'...
-Done. The new configuration is /nix/store/pxqcbvj3a6k55la4fmrnpdfbkkq3ij4l-nixos-system-orion-26.05.20260116.e4bae1b
diff --git a/docs/BOOTSTRAP.md b/docs/BOOTSTRAP.md
index b8a1bf5..72c4578 100644
--- a/docs/BOOTSTRAP.md
+++ b/docs/BOOTSTRAP.md
@@ -71,10 +71,10 @@ OPTIONAL:
 ### Centralized Configuration Pattern
 
-All host configurations use a centralized network topology defined in `network-config.nix`:
+All host configurations use a centralized network topology defined in `fleet-config.nix`:
 
 ```nix
-# network-config.nix - Central source of truth for all hosts
+# fleet-config.nix - Central source of truth for all hosts
 {
   hosts = {
     orion = {
@@ -102,16 +102,16 @@ All host configurations use a centralized network topology defined in `network-c
 ### System Variables Pattern
 
 Each host has a `systems//variables.nix` file that:
-- Imports from centralized `network-config.nix`
+- Imports from centralized `fleet-config.nix`
 - Defines machine-specific settings (user preferences, applications, etc.)
-- Re-exports network config for convenient access
+- Re-exports fleet config for convenient access
 
 **Example:**
 ```nix
 let
-  networkConfig = import ../../network-config.nix;
-  thisHost = networkConfig.hosts.orion;
+  fleetConfig = import ../../fleet-config.nix;
+  thisHost = fleetConfig.hosts.orion;
 in
 {
   system = {
@@ -389,7 +389,7 @@ The `fleet.sh` script auto-loads host configuration from your Nix config:
 
 ### Wake-on-LAN
 
-If configured in `network-config.nix`:
+If configured in `fleet-config.nix`:
 
 ```bash
 # Wake a sleeping host
@@ -576,7 +576,7 @@ ssh-add -l
    - Note the IP address
 
 2. **Configure the host**
-   - Add entry to `network-config.nix`
+   - Add entry to `fleet-config.nix`
    - Create `systems//` directory with configuration
    - Create `systems//variables.nix`
    - Create `systems//disk-config.nix`
@@ -680,6 +680,6 @@ done
 
 ---
 
-**Last Updated:** October 29, 2025
+**Last Updated:** January 22, 2026
 **Pattern Source:** EmergentMind's nix-config
 **Status:** Active deployment method for all fleet systems
diff --git a/docs/CORTEX-SECURITY.md b/docs/CORTEX-SECURITY.md
deleted file mode 100644
index 9f3c556..0000000
--- a/docs/CORTEX-SECURITY.md
+++ /dev/null
@@ -1,291 +0,0 @@
-# Cortex Security Implementation Guide
-
-**System:** Cortex AI Server (192.168.1.7)
-**User:** jarvis (admin)
-**Status:** βœ… Implemented and Active
-
-This document describes the actual security implementation for the Cortex AI server as configured in `systems/cortex/default.nix`.
-
-## Security Features Implemented
-
-### 1. Fail2ban - Automatic IP Blocking
-- **Service**: `services.fail2ban.enable = true`
-- **Purpose**: Automatically blocks IP addresses that show malicious intent (brute force attacks)
-- **Configuration**:
-  - Default SSH jail enabled
-  - Whitelisted networks: Localhost (127.0.0.0/8), Private ranges (192.168.0.0/16, 10.0.0.0/8, 172.16.0.0/12)
-
-### 2. Audit Logging - Security Event Monitoring
-- **Services**:
-  - `security.auditd.enable = true` - Audit daemon
-  - `security.audit.enable = true` - Audit subsystem
-- **Purpose**: Logs security-relevant events for monitoring and compliance
-- **Monitored Events**:
-  - Authentication events (/var/log/auth.log)
-  - Sudo usage (/etc/sudoers)
-  - SSH configuration changes (/etc/ssh/sshd_config)
-  - User/group modifications (/etc/passwd, /etc/group)
-  - Login/logout events (/var/log/wtmp, /var/log/btmp)
-
-### 3. SSH Hardening
-- **Key-only authentication**: Password authentication completely disabled
-- **Root access**: Completely prohibited (PermitRootLogin = "no")
-- **User whitelist**: Only `jarvis` user allowed
-- **Connection limits**: Maximum 3 authentication attempts, 2 max sessions
-- **Session timeouts**: 5-minute idle timeout with 2 keepalive probes
-- **Protocol restrictions**: SSH protocol 2 only, no X11 forwarding, no TCP forwarding
-- **Additional hardening**: Login grace time 30s, connection rate limiting (3:50:10)
-
-### 4. Kernel Security Parameters
-Network security hardening through sysctl parameters:
-- IP forwarding disabled
-- ICMP redirects disabled
-- Source routing disabled
-- SYN flood protection enabled
-- TCP SYN/ACK retry limits configured
-
-### 5. System Monitoring
-- **Journald**: Configured with log rotation (1GB max, 10 files, 1-month retention)
-- **Time synchronization**: Chronyd enabled for accurate timestamps
-- **System hardening**: Various kernel parameters for network security
-
-## Deployment & Management
-
-### Deploy Configuration Changes
-```bash
-# From Orion, rebuild Cortex
-cd ~/.config/nixos
-
-# Option 1: Using deploy-rs (if configured)
-nix run github:serokell/deploy-rs -- .#cortex
-
-# Option 2: Manual deployment
-nix build .#nixosConfigurations.cortex.config.system.build.toplevel
-# Then copy and activate on Cortex
-
-# Option 3: On Cortex directly
-ssh jarvis@192.168.1.7
-cd /etc/nixos # or wherever config is
-sudo nixos-rebuild switch --flake .#cortex
-```
-
-### Verify Security Configuration
-```bash
-# Connect to cortex
-ssh jarvis@192.168.1.7
-
-# Check security services
-sudo systemctl status fail2ban auditd
-
-# Check fail2ban jails
-sudo fail2ban-client status
-
-# Check audit rules
-sudo auditctl -l
-
-# Check firewall rules
-sudo iptables -L -n -v
-```
-
-## Post-Deployment Verification
-
-### 1. Service Status Check
-```bash
-ssh jarvis@192.168.1.7 'sudo systemctl status fail2ban auditd'
-```
-
-Expected output:
-- fail2ban.service: active (running)
-- auditd.service: active (running)
-
-### 2. Fail2ban Status
-```bash
-ssh jarvis@192.168.1.7 'sudo fail2ban-client status'
-```
-
-Expected output should show active jails (at minimum `sshd`).
-
-### 3. Audit Rules
-```bash
-ssh jarvis@192.168.1.7 'sudo auditctl -l'
-```
-
-Expected output should show audit rules monitoring:
-- Authentication logs (/var/log/auth.log)
-- Sudo configuration (/etc/sudoers)
-- SSH config (/etc/ssh/sshd_config)
-- User/group files (/etc/passwd, /etc/group, /etc/shadow)
-- Login logs (/var/log/wtmp, /var/log/btmp)
-- Service user directories (/var/lib/friday)
-- Systemd changes (/etc/systemd/system)
-
-### 4. SSH Security Test
-```bash
-# This should fail (password auth disabled)
-ssh -o PreferredAuthentications=password jarvis@192.168.1.7
-
-# This should work (key auth from authorized workstation)
-ssh jarvis@192.168.1.7
-
-# Root login should fail completely
-ssh root@192.168.1.7
-```
-
-### 5. Firewall Verification
-```bash
-# Check firewall status
-ssh jarvis@192.168.1.7 'sudo iptables -L -n -v'
-
-# Should show:
-# - ACCEPT for localhost
-# - ACCEPT for established connections
-# - ACCEPT for SSH from local networks only (192.168.x.x, 10.x.x.x, 172.16.x.x)
-# - LOG and DROP for everything else
-```
-
-## Troubleshooting
-
-### Fail2ban Not Starting
-1. Check service status: `sudo systemctl status fail2ban`
-2. Check logs: `sudo journalctl -u fail2ban -n 50`
-3. Verify configuration: `sudo fail2ban-client -t`
-
-### Audit Service Issues
-1. Check service status: `sudo systemctl status auditd`
-2. Check logs: `sudo journalctl -u auditd -n 50`
-3. Verify rules: `sudo auditctl -l`
-
-### SSH Connection Issues
-1. Verify SSH service: `sudo systemctl status sshd`
-2. Check SSH configuration: `sudo sshd -T`
-3. Monitor auth logs: `sudo tail -f /var/log/auth.log`
-
-### Service Missing After Deployment
-If services are not running after deployment, this could indicate:
-1. Configuration syntax error (check build logs)
-2. Service dependency issues (check systemd logs)
-3. Package availability issues (services should auto-provide packages)
-
-The configuration has been updated to address these common issues by:
-- Using the standard NixOS service enablement patterns
-- Removing redundant package declarations
-- Adding explicit jail configuration for fail2ban
-- Including comprehensive audit rules
-
-## Security Monitoring
-
-### Log Locations
-- **Fail2ban logs**: `journalctl -u fail2ban`
-- **Audit logs**: `journalctl -u auditd` and `/var/log/audit/audit.log`
-- **SSH logs**: `/var/log/auth.log` and `journalctl -u sshd`
-- **System logs**: `journalctl -f`
-
-### Regular Maintenance
-
-```bash
-# Review blocked IPs
-sudo fail2ban-client status sshd
-sudo fail2ban-client status sshd --details
-
-# Check recent audit events
-sudo ausearch -ts today
-sudo ausearch -k auth # Authentication events
-sudo ausearch -k sudoers # Sudo usage
-
-# Monitor system logs
-sudo journalctl -f
-sudo journalctl -u fail2ban -n 50
-sudo journalctl -u auditd -n 50
-
-# Check for failed login attempts
-sudo journalctl _SYSTEMD_UNIT=sshd.service | grep -i failed
-
-# Update system (from Orion workstation)
-cd ~/.config/nixos
-git pull
-nix run github:serokell/deploy-rs -- .#cortex
-```
-
-## Advanced Configuration
-
-### Custom Fail2ban Jails
-Add additional jails to the configuration in `systems/cortex/default.nix`:
-```nix
-services.fail2ban.jails = {
-  nginx-noscript = {
-    settings = {
-      enabled = true;
-      filter = "nginx-noscript";
-      logpath = "/var/log/nginx/access.log";
-      maxretry = 6;
-    };
-  };
-};
-```
-
-### Additional Audit Rules
-Add more audit rules to monitor specific files or system calls in `systems/cortex/default.nix`:
-```nix
-security.audit.rules = [
-  # Existing rules...
-  "-w /etc/crontab -p wa -k cron"
-  "-w /var/lib/friday -p rwxa -k ai-service-access"
-];
-```
-
-## Current User Configuration
-
-### Admin User: jarvis
-- **Purpose**: System administrator for Cortex
-- **SSH Access**: βœ… Key-only authentication
-- **Sudo**: βœ… Full administrative access
-- **Home**: /home/jarvis
-- **Authorized Key**: syg@orion's SSH key
-
-### Service User: friday
-- **Purpose**: Run AI services (Ollama, Open WebUI)
-- **Type**: System user (not login user)
-- **SSH Access**: ❌ No SSH access
-- **Sudo**: Limited (service management only)
-- **Home**: /var/lib/friday
-- **Audit**: All access logged
-
-## Network Security
-
-### Firewall Configuration
-- **Default Policy**: DROP all incoming
-- **SSH Access**: Only from local networks (192.168.x.x, 10.x.x.x, 172.16.x.x)
-- **ICMP**: Limited to local networks only
-- **Logging**: All dropped packets logged with "CORTEX-FIREWALL-DROP" prefix
-
-### Network Hardening (sysctl)
-```nix
-# IP forwarding disabled
-net.ipv4.ip_forward = 0
-
-# ICMP/route manipulation disabled
-net.ipv4.conf.all.send_redirects = 0
-net.ipv4.conf.all.accept_redirects = 0
-net.ipv4.conf.all.accept_source_route = 0
-
-# SYN flood protection
-net.ipv4.tcp_syncookies = 1
-net.ipv4.tcp_max_syn_backlog = 2048
-
-# Anti-spoofing
-net.ipv4.conf.all.rp_filter = 1
-```
-
-## Summary
-
-Cortex implements a defense-in-depth security posture suitable for a home AI server:
-
-βœ… **Minimal attack surface** - Only SSH open, only from LAN
-βœ… **Strong authentication** - SSH keys only, no passwords
-βœ… **User isolation** - Separate service user (friday) for AI workloads
-βœ… **Comprehensive logging** - Auditd + journald with retention
-βœ… **Automated blocking** - Fail2ban for brute force protection
-βœ… **Network hardening** - Kernel-level security parameters
-βœ… **Declarative config** - All security settings in version control
-
-This provides enterprise-grade security while maintaining ease of management through NixOS's declarative configuration approach.
diff --git a/docs/FLEET-FUTURE.md b/docs/FLEET-FUTURE.md
index 1ee8198..816b185 100644
--- a/docs/FLEET-FUTURE.md
+++ b/docs/FLEET-FUTURE.md
@@ -1,9 +1,12 @@
 # Future Fleet Management with Colmena
 
-**Reference guide for future Colmena integration and advanced fleet patterns.**
+> **⚑ Update 2026-01-22**: We built [nixos-fleet](https://github.com/sygint/nixos-fleet) to solve this!
+> This tool wraps Colmena with a better CLI. See [TODO-NIXOS-FLEET-MIGRATION.md](planning/TODO-NIXOS-FLEET-MIGRATION.md) for migration plan.
 
-**Status:** πŸ“‹ Planning / Reference Only
-**Target:** Month 3-4 (When Colmena supports newer flake syntax)
+**Reference guide for Colmena concepts and advanced fleet patterns.**
+
+**Status:** πŸ“‹ Reference / Historical
+
+**Original Target:** Month 3-4 (When Colmena supports newer flake syntax)
 
 ---
diff --git a/docs/FLEET-SECRETS-INTEGRATION.md b/docs/FLEET-SECRETS-INTEGRATION.md
deleted file mode 100644
index 4585a63..0000000
--- a/docs/FLEET-SECRETS-INTEGRATION.md
+++ /dev/null
@@ -1,234 +0,0 @@
-# Fleet & Secrets Management Integration
-
-**Unified workflow for deploying NixOS systems with encrypted secrets**
-
-## Overview
-
-The fleet management system now includes integrated secrets management, combining deployment automation with encrypted secrets handling through sops-nix.
-
-## Architecture
-
-```
-fleet.sh (Deployment)
-    β”‚
-    β”œβ”€β†’ secrets-manager.sh (Secrets)
-    β”‚       β”‚
-    β”‚       └─→ sops + age (Encryption)
-    β”‚               β”‚
-    β”‚               └─→ nixos-secrets/ (Private Repo)
-    β”‚
-    └─→ deploy-rs / nixos-anywhere
-            β”‚
-            └─→ Target Systems (with sops-nix)
-```
-
-## Quick Commands
-
-### Secrets Management via Fleet
-
-```bash
-# View configuration
-./scripts/deployment/fleet.sh secrets config
-
-# Edit secrets
-./scripts/deployment/fleet.sh secrets edit
-
-# Validate encryption
-./scripts/deployment/fleet.sh secrets validate
-
-# Add system password
-./scripts/deployment/fleet.sh secrets add-host nexus
-
-# Rotate password
-./scripts/deployment/fleet.sh secrets rotate nexus
-
-# Check status
-./scripts/deployment/fleet.sh secrets status
-```
-
-### Deployment with Secrets
-
-```bash
-# Deploy new system with secrets validation
-./scripts/deployment/fleet.sh deploy nexus --validate-secrets
-
-# Update existing system with secrets validation
-./scripts/deployment/fleet.sh update nexus --validate-secrets
-```
-
-## Typical Workflow
-
-### 1. Add New System with Secrets
-
-```bash
-# Create system configuration
-mkdir -p systems/newsystem
-
-# Add secrets for the system
-./scripts/deployment/fleet.sh secrets add-host newsystem
-# Enter password or generate random
-
-# Commit secrets
-cd ~/.config/nixos-secrets
-git add secrets.yaml
-git commit -m "add: newsystem secrets"
-git push
-```
-
-### 2. Configure System to Use Secrets
-
-Edit `systems/newsystem/default.nix`:
-
-```nix
-{ config, pkgs, ... }:
-{
-  imports = [
-    # Import sops-nix module
-    ../../nixos-secrets
-  ];
-
-  # Configure sops
-  sops.secrets.newsystem-maintenance-password = {
-    sopsFile = ../../nixos-secrets/secrets.yaml;
-  };
-
-  # Use secret in user configuration
-  users.users.maintenance = {
-    isNormalUser = true;
-    hashedPasswordFile = config.sops.secrets.newsystem-maintenance-password.path;
-    extraGroups = [ "wheel" ];
-  };
-}
-```
-
-### 3. Deploy with Validation
-
-```bash
-# Build to validate configuration
-./scripts/deployment/fleet.sh build newsystem
-
-# Deploy with secrets validation
-./scripts/deployment/fleet.sh deploy newsystem --validate-secrets
-```
-
-### 4. Rotate Secrets
-
-```bash
-# Rotate password
-./scripts/deployment/fleet.sh secrets rotate newsystem
-
-# Commit updated secrets
-cd ~/.config/nixos-secrets
-git add secrets.yaml
-git commit -m "rotate: newsystem password"
-git push
-
-# Deploy updated configuration
-cd ~/.config/nixos
-./scripts/deployment/fleet.sh update newsystem --validate-secrets
-```
-
-## How Secrets Validation Works
-
-When you use `--validate-secrets`:
-
-1. **Checks encryption/decryption** - Ensures secrets file is valid
-2. **Verifies system has secrets** - Warns if system not in secrets.yaml
-3. **Blocks deployment on failure** - Prevents deploying with broken secrets
-
-Example:
-```bash
-./scripts/deployment/fleet.sh deploy nexus --validate-secrets
-# Output:
-# β„Ή Validating secrets before deployment...
-# βœ“ βœ“ Decryption works
-# βœ“ βœ“ Encryption works
-# βœ“ Secrets validated for nexus
-# β„Ή Building nexus configuration...
-```
-
-## Integration Points
-
-### In fleet.sh
-
-- `secrets_command()` - Passes through to secrets-manager.sh
-- `validate_secrets()` - Validates before deployment
-- Updated `deploy_system()` - Accepts `--validate-secrets` flag
-- Updated `update_system()` - Accepts `--validate-secrets` flag
-
-### In secrets-manager.sh
-
-- All commands work independently or through fleet.sh
-- Auto-detects secrets repository location
-- Auto-detects age keys by hostname
-- Configurable via environment variables
-
-## Security Model
-
-```
-Control Machine (Orion)
-β”œβ”€β”€ Private age key (orion.txt)
-β”œβ”€β”€ Can decrypt all secrets
-└── Can edit secrets
-
-Target Machines (Cortex, Nexus)
-β”œβ”€β”€ Private age key (cortex.txt, nexus.txt)
-β”œβ”€β”€ Can decrypt only at boot time
-β”œβ”€β”€ sops-nix handles decryption
-└── Secrets available as read-only files
-```
-
-## Benefits
-
-1. **Unified Interface** - Single tool for deployment + secrets
-2. **Validation Built-in** - Catch secrets issues before deployment
-3. **Clear Workflow** - Documented path from secrets to deployment
-4. **Security First** - Never commit unencrypted secrets
-5. **Easy Rotation** - Simple commands for password rotation
-6. **Fleet-wide Management** - Manage all system secrets in one place
-
-## Related Documentation
-
-- **Fleet Management**: `FLEET-MANAGEMENT.md`
-- **Secrets Deep Dive**: `nixos-secrets/SECRETS-MANAGEMENT.md`
-- **Security**: `docs/SECURITY.md`
-- **Project Overview**: `docs/PROJECT-OVERVIEW.md`
-
-## Troubleshooting
-
-### Secrets validation fails
-
-```bash
-# Check secrets directly
-./scripts/deployment/fleet.sh secrets validate
-
-# View configuration
-./scripts/deployment/fleet.sh secrets config
-
-# Check recipients
-./scripts/deployment/fleet.sh secrets recipients
-```
-
-### System missing from secrets
-
-```bash
-# Add the system
-./scripts/deployment/fleet.sh secrets add-host
-
-# Or check if intentional (some systems may not need secrets)
-./scripts/deployment/fleet.sh secrets cat | grep
-```
-
-### Can't decrypt on target
-
-1. Check age key exists: `/etc/age/keys.txt` (or configured path)
-2. Verify key in `.sops.yaml` recipients
-3. Re-encrypt secrets: `./scripts/deployment/fleet.sh secrets edit`
-4. Check sops-nix configuration in system config
-
-## Examples
-
-See real examples in:
-- `systems/nexus/default.nix` - Maintenance user with encrypted password
-- `systems/cortex/default.nix` - Service credentials
-- `nixos-secrets/secrets.yaml` - Encrypted secrets structure
diff --git a/docs/IMPLEMENTATION-GUIDE.md b/docs/IMPLEMENTATION-GUIDE.md
deleted file mode 100644
index 864becb..0000000
--- a/docs/IMPLEMENTATION-GUIDE.md
+++ /dev/null
@@ -1,937 +0,0 @@
-# QUICK-WINS Implementation Guide
-
-**Status:** Partially implemented - Core/Optional architecture and automated backups remain
-
-**Time Investment:** ~4-5 hours remaining (Core/Optional migration + backups)
-**Impact:** Scalable architecture for 10+ systems, automated data protection
-
----
-
-## βœ… Completed Tasks
-
-- βœ… **Day 1-2: Deployment Safety** - Scripts created (pre-flight.sh, validate.sh, safe-deploy.sh)
-- βœ… **Day 2: Just Automation** - justfile exists with task automation
-- βœ… **Day 4: Documentation** - docs/PROJECT-OVERVIEW.md and ARCHITECTURE.md comprehensive
-
-## ⚠️ Remaining High-Value Tasks
-
-### Priority 1: Automated Backups (2 hours)
-- ❌ **Day 6-7: Backup Setup** - Synology available, Borg not configured
-- **Impact:** No data protection - CRITICAL
-- **Next:** Follow Day 6-7 guide below to create backup.nix module
-
-### Priority 2: Core/Optional Architecture (4 hours)
-- ❌ **Day 3-5: Module Reorganization** - All modules currently flat/optional
-- **Impact:** Doesn't scale beyond 2-3 systems, repetitive configuration
-- **Next:** Follow Day 3-5 guide below to create core/optional structure
-
-### Priority 3: Deployment Integration
-- ⚠️ **Scripts exist but not primary workflow** - safe-deploy.sh not default
-- **Impact:** Still possible to deploy without safety checks
-- **Next:** Make safe-deploy.sh the default deployment method
-
----
-
-## βœ… Day 1: Deployment Safety (COMPLETED)
-
-**Status:** Scripts exist in `scripts/` directory
-
-**What was implemented:**
-- βœ… `scripts/pre-flight.sh` - Pre-deployment validation
-- βœ… `scripts/validate.sh` - Post-deployment checks
-- βœ… `scripts/safe-deploy.sh` - Orchestration wrapper
-
-**Remaining work:** Integration into main deployment workflow (use safe-deploy.sh instead of direct deploy-rs)
-
-### 1. Pre-flight Validation Script (Reference)
-
-```bash
-#!/usr/bin/env bash
-# scripts/pre-flight.sh
-# Usage: ./scripts/pre-flight.sh cortex 192.168.1.7 jarvis
-
-set -euo pipefail
-
-HOST=$1
-IP=$2
-USER=$3
-
-echo "πŸ” Pre-flight checks for $HOST ($IP)..."
-echo ""
-
-# Check 1: Network reachability
-echo -n "  [1/6] Network reachability... "
-if ping -c 3 -W 2 $IP > /dev/null 2>&1; then
-  echo "βœ…"
-else
-  echo "❌ FAIL: Host unreachable"
-  exit 1
-fi
-
-# Check 2: SSH connectivity
-echo -n "  [2/6] SSH connectivity... "
-if timeout 5 ssh -o ConnectTimeout=5 -o StrictHostKeyChecking=no $USER@$IP "echo 'OK'" > /dev/null 2>&1; then
-  echo "βœ…"
-else
-  echo "❌ FAIL: SSH connection failed"
-  echo "     Check: SSH daemon running? Correct user/key?"
-  exit 1
-fi
-
-# Check 3: NixOS system
-echo -n "  [3/6] NixOS system check... "
-if ssh $USER@$IP "[ -f /etc/NIXOS ]" 2>/dev/null; then
-  echo "βœ…"
-else
-  echo "❌ FAIL: Not a NixOS system"
-  exit 1
-fi
-
-# Check 4: Disk space
-echo -n "  [4/6] Disk space... "
-DISK_USAGE=$(ssh $USER@$IP "df -h / | tail -1 | awk '{print \$5}' | sed 's/%//'" 2>/dev/null)
-if [ $DISK_USAGE -lt 90 ]; then
-  echo "βœ… (${DISK_USAGE}% used)"
-else
-  echo "⚠️ WARN: Disk usage at ${DISK_USAGE}%"
-  echo "     Consider cleaning up before deploy"
-fi
-
-# Check 5: Critical services
-echo -n "  [5/6] Critical services... "
-FAILED_SERVICES=0
-for svc in sshd NetworkManager; do
-  if ! ssh $USER@$IP "systemctl is-active $svc" > /dev/null 2>&1; then
-    echo ""
-    echo "  ❌ $svc is not active"
-    FAILED_SERVICES=1
-  fi
-done
-if [ $FAILED_SERVICES -eq 0 ]; then
-  echo "βœ…"
-else
-  exit 1
-fi
-
-# Check 6: System load
-echo -n "  [6/6] System load... "
-LOAD=$(ssh $USER@$IP "uptime | awk -F'load average:' '{print \$2}' | awk '{print \$1}' | sed 's/,//'" 2>/dev/null)
-echo "βœ… (load: $LOAD)"
-
-echo ""
-echo "βœ… All pre-flight checks passed!"
-echo "   Ready to deploy to $HOST"
-```
-
-### 2. Create Post-deployment Validation Script
-
-```bash
-#!/usr/bin/env bash
-# scripts/validate.sh
-# Usage: ./scripts/validate.sh cortex 192.168.1.7 jarvis
-
-set -euo pipefail
-
-HOST=$1
-IP=$2
-USER=$3
-
-echo "πŸ” Validating deployment to $HOST..."
-echo ""
-
-# Wait a moment for services to settle
-sleep 5
-
-# Check 1: SSH still works
-echo -n "  [1/5] SSH connectivity... "
-if timeout 10 ssh -o ConnectTimeout=10 $USER@$IP "echo 'OK'" > /dev/null 2>&1; then
-  echo "βœ…"
-else
-  echo "❌ CRITICAL: Lost SSH access!"
-  echo "     Manual intervention required"
-  exit 1
-fi
-
-# Check 2: System is running
-echo -n "  [2/5] System state... "
-SYS_STATE=$(ssh $USER@$IP "systemctl is-system-running" 2>/dev/null || echo "unknown")
-if echo "$SYS_STATE" | grep -qE "running|degraded"; then
-  echo "βœ… ($SYS_STATE)"
-else
-  echo "⚠️ WARN: System state is $SYS_STATE"
-fi
-
-# Check 3: Critical services
-echo -n "  [3/5] Critical services... "
-FAILED=0
-for svc in sshd NetworkManager; do
-  if ! ssh $USER@$IP "systemctl is-active $svc" > /dev/null 2>&1; then
-    echo ""
-    echo "  ❌ $svc is not active"
-    FAILED=1
-  fi
-done
-if [ $FAILED -eq 0 ]; then
-  echo "βœ…"
-else
-  echo "   Some services failed - check systemctl status"
-fi
-
-# Check 4: Boot generation changed
-echo -n "  [4/5] Boot generation... "
-CURRENT_GEN=$(ssh $USER@$IP "readlink /run/current-system | grep -oP 'system-\K[0-9]+'" 2>/dev/null)
-BOOTED_GEN=$(ssh $USER@$IP "readlink /run/booted-system | grep -oP 'system-\K[0-9]+'" 2>/dev/null)
-if [ "$CURRENT_GEN" = "$BOOTED_GEN" ]; then
-  echo "βœ… (generation $CURRENT_GEN)"
-else
-  echo "⚠️ Current: $CURRENT_GEN, Booted: $BOOTED_GEN (reboot pending)"
-fi
-
-# Check 5: No failed units
-echo -n "  [5/5] Failed units... "
-FAILED_COUNT=$(ssh $USER@$IP "systemctl list-units --state=failed --no-legend | wc -l" 2>/dev/null)
-if [ "$FAILED_COUNT" -eq 0 ]; then
-  echo "βœ…"
-else
-  echo "⚠️ $FAILED_COUNT failed units"
-  ssh $USER@$IP "systemctl list-units --state=failed"
-fi
-
-echo ""
-if [ $FAILED -eq 0 ]; then
-  echo "βœ… Validation passed!"
-else
-  echo "⚠️ Validation completed with warnings"
-  exit 1
-fi
-```
-
-### 3. Update Deployment Workflow
-
-**Before:** (risky)
-```bash
-deploy-rs .#cortex
-```
-
-**After:** (safe)
-```bash
-#!/usr/bin/env bash
-# scripts/safe-deploy.sh
-# Usage: ./scripts/safe-deploy.sh cortex 192.168.1.7 jarvis
-
-set -euo pipefail
-
-HOST=$1
-IP=$2
-USER=$3
-
-echo "πŸš€ Safe deployment to $HOST"
-echo ""
-
-# Step 1: Pre-flight checks
-./scripts/pre-flight.sh $HOST $IP $USER || {
-  echo "❌ Pre-flight checks failed. Aborting."
-  exit 1
-}
-
-echo ""
-echo "πŸ“¦ Starting deployment..."
-
-# Step 2: Deploy with rollback on error
-if deploy-rs --skip-checks false --rollback-on-error .#$HOST; then
-  echo "βœ… Deploy completed successfully"
-else
-  echo "❌ Deploy failed"
-  exit 1
-fi
-
-echo ""
-
-# Step 3: Validate deployment
-./scripts/validate.sh $HOST $IP $USER || {
-  echo "❌ Validation failed!"
-  echo ""
-  echo "⚠️ ROLLBACK RECOMMENDED"
-  echo "   Run: deploy-rs --rollback .#$HOST"
-  exit 1
-}
-
-echo ""
-echo "πŸŽ‰ Deployment successful and validated!"
-```
-
-### 4. Make Scripts Executable
-
-```bash
-chmod +x scripts/{pre-flight,validate,safe-deploy}.sh
-```
-
-### 5. Test the Workflow
-
-```bash
-# Test pre-flight checks
-./scripts/pre-flight.sh cortex 192.168.1.7 jarvis
-
-# If that passes, do a safe deploy
-./scripts/safe-deploy.sh cortex 192.168.1.7 jarvis
-```
-
----
-
-## βœ… Day 2: Just Automation (COMPLETED)
-
-**Status:** `justfile` exists with task automation
-
-**What was implemented:**
-- βœ… justfile created with common commands
-- βœ… Just installed in system packages
-
-**Usage:** See `just --list` for available commands
-
-### Goal (Reference)
-
-**Why:** Consistent commands, automatic secrets sync, fewer mistakes
-
-### 1. Install Just
-
-```nix
-# Add to modules/system/core/default.nix or orion/default.nix
-environment.systemPackages = with pkgs; [
-  just
-];
-```
-
-### 2. Create Enhanced Justfile with Secrets Sync
-
-**Key Addition:** Automatic secrets sync via `rebuild-pre` hook (EmergentMind's pattern)
-
-```justfile
-# justfile - Task automation for NixOS config
-# Run `just` to see all commands
-
-# Default: show available commands
-default:
-  @just --list
-
-# ====== PRE/POST HOOKS (EmergentMind Pattern) ======
-
-# Run BEFORE every rebuild/deploy - syncs secrets automatically
-rebuild-pre: update-secrets
-  @git add --intent-to-add .
-
-# Run AFTER rebuild - validate sops is working
-rebuild-post:
-  @echo "βœ… Rebuild complete"
-  @systemctl --user is-active sops-nix.service > /dev/null && echo "βœ… sops-nix active" || echo "⚠️ sops-nix check manually"
-
-# Sync secrets from separate repo (HYBRID APPROACH)
-update-secrets:
-  @echo "πŸ”„ Syncing secrets..."
- @(cd ../nixos-secrets && git pull) || true - @nix flake update nixos-secrets --timeout 5 - @echo "βœ… Secrets synced" - -# ====== LOCAL OPERATIONS ====== - -# Rebuild Orion (laptop) with pre/post hooks -rebuild-orion: rebuild-pre && rebuild-post - sudo nixos-rebuild --flake .#orion switch - -# Rebuild Cortex (AI rig) - if running locally on Cortex -rebuild-cortex: rebuild-pre && rebuild-post - sudo nixos-rebuild --flake .#cortex switch - -# Rebuild with trace for debugging -rebuild-trace HOST: rebuild-pre && rebuild-post - sudo nixos-rebuild --flake .#{{HOST}} --show-trace switch - -# ====== UPDATE COMMANDS ====== - -# Update all flake inputs -update: - nix flake update - -# Update specific input -update-input INPUT: - nix flake update {{INPUT}} - -# Update + rebuild Orion -update-orion: update - just rebuild-orion - -# Update + deploy to Cortex -update-cortex: update rebuild-pre - just deploy-cortex - -# ====== REMOTE OPERATIONS ====== - -# Deploy to Cortex (with safety checks + secrets sync) -deploy-cortex: rebuild-pre - ./scripts/safe-deploy.sh cortex 192.168.1.7 jarvis - -# Pre-flight checks only (no deploy) -check-cortex: - ./scripts/pre-flight.sh cortex 192.168.1.7 jarvis - -# Validate Cortex (post-deploy check) -validate-cortex: - ./scripts/validate.sh cortex 192.168.1.7 jarvis - -# SSH into Cortex -ssh-cortex: - ssh jarvis@192.168.1.7 - -# Sync configs to remote host (without building) -sync-cortex: - rsync -av --exclude='.git' --exclude='result' --exclude='*.md' \ - . 
jarvis@192.168.1.7:~/.config/nixos - -# ====== SECRETS MANAGEMENT ====== - -# Edit secrets -edit-secrets: - sops ../nixos-secrets/secrets.yaml - -# Rekey all secrets (after adding new host/user keys) -# Note: just passes `$` through to the shell, so plain $file is correct here -rekey: - cd ../nixos-secrets && \ - for file in *.yaml; do sops updatekeys -y "$file"; done - -# ====== FLEET MANAGEMENT ====== - -# Fleet status (your custom script) -fleet-status: - ./scripts/fleet.sh status - -# Fleet deploy (when you have multiple systems) -fleet-deploy HOSTS: - ./scripts/fleet.sh deploy {{HOSTS}} - -# ====== UTILITIES ====== - -# Check flake (validate all configs) -check: - nix flake check --show-trace - -# Format all Nix files -fmt: - nix fmt - -# Git status with context -status: - @git status - @echo "" - @echo "πŸ“¦ Current Generation:" - @readlink /run/current-system | grep -oP 'system-\K[0-9]+' - @echo "" - @echo "πŸ“¦ Flake Inputs:" - @nix flake metadata | grep -A 10 "Inputs:" - -# Show disk usage -disk: - @df -h / /home | grep -v tmpfs - -# Show recent generations -generations: - sudo nix-env --list-generations --profile /nix/var/nix/profiles/system | tail -10 - -# Clean old generations (older than 30 days) -clean: - sudo nix-collect-garbage --delete-older-than 30d - sudo nixos-rebuild boot --flake .#$(hostname) -``` - -### 3. Test Just Commands - -```bash -# See all commands -just - -# Rebuild Orion (secrets auto-sync before build) -just rebuild-orion - -# Update flake + deploy to Cortex (secrets auto-sync) -just update-cortex - -# Manual secrets sync (usually automatic) -just update-secrets - -# Check Cortex before deploying -just check-cortex -``` - -### 4. 
Why This Approach Works - -**Hybrid Secrets Strategy (Current - Recommended for 2-3 hosts):** -- βœ… Keeps your `git+file:../nixos-secrets` (local repo, simple) -- βœ… Adds automatic sync via `rebuild-pre` hook (EmergentMind's discipline) -- βœ… Ensures secrets always current before ANY deploy -- βœ… No manual steps to forget -- βœ… Works perfectly for single-admin, 2-3 host setup - -**How it works:** -1. You run `just update-cortex` or `just deploy-cortex` -2. `rebuild-pre` hook automatically runs first -3. Secrets repo pulls latest changes -4. `nixos-secrets` flake input updated -5. Build includes current secrets -6. Deploy sends entire closure (with secrets) to Cortex - -**Future Enhancement (Phase 3 - 5+ hosts):** -When you reach homelab scale (Proxmox, Frigate, Jellyfin, etc.), consider: -```nix -# Move to remote git repo -nix-secrets = { - url = "git+ssh://git@homelab.local/nix-secrets.git?shallow=1"; - inputs = { }; -}; -``` - -Benefits of remote repo (later): -- Atomic rollbacks (secrets version tied to flake.lock) -- Multi-location management -- Team collaboration -- Better disaster recovery - -**But for now:** Local repo + automatic sync = simpler and sufficient - ---- - -## ❌ Day 3: Core/Optional Planning (NOT STARTED) - -**Status:** Architecture not implemented - all modules still in flat structure - -**Why this matters:** -- Current: All modules optional, explicit enables everywhere -- Goal: Core modules auto-imported on all systems (SSH, security, nix settings) -- Benefit: Scales to 10+ systems without repeating base configuration - -### 1. 
Audit Current Modules - -Create a file `MIGRATION-PLAN.md`: - -````markdown -# Core/Optional Migration Plan - -## Current Modules Audit - -### System Modules (modules/system/) - -**CORE (Universal on ALL systems):** -- [ ] base/nix.nix - Flakes, garbage collection -- [ ] base/security.nix - fail2ban, auditd, sysctl -- [ ] services/ssh.nix - SSH daemon (if you have this) -- [ ] User account: syg/jarvis - -**OPTIONAL (Selective per host):** -- [ ] hardware/bluetooth.nix -- [ ] hardware/audio.nix -- [ ] hardware/networking.nix -- [ ] services/mullvad.nix -- [ ] services/syncthing.nix -- [ ] services/virtualization.nix -- [ ] services/containerization.nix -- [ ] services/printing.nix -- [ ] windowManagers/hyprland.nix -- [ ] displayServers/wayland.nix - -### Home Modules (modules/home/programs/) - -**CORE (Universal for syg/jarvis):** -- [ ] git.nix -- [ ] zsh.nix -- [ ] btop.nix (basic monitoring) - -**OPTIONAL (Selective):** -- [ ] brave.nix -- [ ] librewolf.nix -- [ ] vscode.nix -- [ ] kitty.nix -- [ ] hyprland.nix (desktop-only) -- [ ] hyprpanel.nix (desktop-only) -- [ ] waybar.nix (desktop-only) -- [ ] devenv.nix -- [ ] protonmail-bridge.nix -- [ ] mullvad.nix -- [ ] screenshots.nix - -## Migration Steps (Do on Day 5) - -1. Create directories: - ```bash - mkdir -p modules/system/{core,optional,users} - mkdir -p modules/home/{core,optional} - ``` - -2. Move core system configs: - ```bash - mv modules/system/base/* modules/system/core/ - ``` - -3. Move optional system configs: - ```bash - mv modules/system/{hardware,services,programs,windowManagers,displayServers} modules/system/optional/ - ``` - -4. Create core/default.nix: - ```nix - # modules/system/core/default.nix - { - imports = [ - ./nix.nix - ./security.nix - # Add more core modules as identified - ]; - } - ``` - -5. 
Update host imports: - ```nix - # systems/orion/default.nix - imports = [ - ./hardware.nix - ../../modules/system/core # Auto-imports everything in core - ../../modules/system/optional/hardware/bluetooth.nix - ../../modules/system/optional/services/syncthing.nix - # etc. - ]; - ``` -```` - ---- - -## βœ… Day 4: Documentation Updates (COMPLETED) - -**Status:** Documentation comprehensive and up-to-date - -**What was completed:** -- βœ… docs/PROJECT-OVERVIEW.md updated (October 29, 2025) -- βœ… docs/ARCHITECTURE.md created (comprehensive module documentation) -- βœ… Known Issues section accurate and prioritized -- βœ… Current Status & Roadmap section added - -### Reference (Original Goals) - -### 1. Update README.md - -Add to "Common Tasks" section: - -````markdown -## Common Tasks - -### Using Just (Recommended) - -```bash -# See all available commands -just - -# Local operations -just rebuild-orion # Rebuild Orion locally -just rebuild-cortex # Rebuild Cortex locally -just update # Update flake inputs - -# Remote operations -just deploy-cortex # Deploy to Cortex with safety checks -just ssh-cortex # SSH into Cortex -just sync-cortex # Sync configs only (no build) - -# Validation -just check-cortex # Pre-flight checks before deploy -just validate-cortex # Validate after deploy -just check # Flake check - -# Fleet management -just fleet-status # Check all systems -``` - -### Manual Commands (if Just not available) - -[Keep existing content but mark as "Legacy"] -```` - -### 2. Update FLEET-MANAGEMENT.md - -Add new section at the top: - -````markdown -## Safe Deployment Workflow - -**ALWAYS use this workflow for remote deployments:** - -1. **Pre-flight Checks** - ```bash - just check-cortex - # or: ./scripts/pre-flight.sh cortex 192.168.1.7 jarvis - ``` - -2. **Commit Current State** (rollback point) - ```bash - git add -A - git commit -m "pre-deploy snapshot: $(date +%Y%m%d-%H%M)" - ``` - -3. 
**Deploy with Safety** - ```bash - just deploy-cortex - # or: ./scripts/safe-deploy.sh cortex 192.168.1.7 jarvis - ``` - -4. **Validation** (automatic in safe-deploy.sh) - - If validation fails, rollback is suggested - - Follow on-screen instructions - -**Never skip pre-flight checks!** They've prevented countless SSH lockouts. -```` - ---- - -## ❌ Day 5: Core/Optional Migration (NOT STARTED) - -**Status:** Depends on Day 3 planning - not implemented - -**Blockers:** -- Need to complete Day 3 module audit first -- Requires MIGRATION-PLAN.md creation -- Estimated time: 2-3 hours once planning done - -### 1. Execute Migration Plan (Pending Day 3) - -Follow the steps in `MIGRATION-PLAN.md` created on Day 3. - -### 2. Test Both Systems - -```bash -# Test Orion rebuild -just rebuild-orion - -# Test Cortex deploy -just deploy-cortex -``` - -### 3. Verify Everything Works - -```bash -# Check all services on Orion -systemctl --failed - -# Check all services on Cortex (remotely) -just ssh-cortex -systemctl --failed -``` - ---- - -## ❌ Day 6-7: Backup Setup (NOT STARTED) - -**Status:** No automated backups configured - -**Current situation:** -- ❌ Synology NAS available but not integrated -- ❌ No Borg backup module created -- ❌ No automated backup schedules - -**Impact:** No automated data protection (HIGH PRIORITY) - -**Estimated time:** 2 hours to implement - -### 1. Install Borg on Synology - -```bash -# SSH into Synology -ssh admin@synology.local # or your Synology IP - -# Note: borg itself must also be available on the NAS (e.g. via a -# community package or Entware) - the client drives "borg serve" over SSH - -# Create borg user -sudo useradd -m borg -sudo passwd borg # Set a password - -# Create backup directories -sudo mkdir -p /volume1/backups/{orion,cortex} -sudo chown -R borg:borg /volume1/backups -``` - -### 2. 
Test Manual Backup - -```bash -# On Orion, initialize repo -borg init --encryption=repokey-blake2 borg@synology.local:/volume1/backups/orion - -# Create test backup -borg create borg@synology.local:/volume1/backups/orion::test-$(date +%Y%m%d) \ - ~/Documents \ - ~/Pictures \ - ~/.config/nixos - -# List archives -borg list borg@synology.local:/volume1/backups/orion - -# If successful, delete test backup -borg delete borg@synology.local:/volume1/backups/orion::test-$(date +%Y%m%d) -``` - -### 3. Create Backup Module (Automated Setup - Day 7) - -```nix -# modules/system/optional/services/backup.nix -{ config, lib, pkgs, ... }: - -let - cfg = config.modules.services.backup; -in -{ - options.modules.services.backup = { - enable = lib.mkEnableOption "Borg backup to Synology"; - - synologyHost = lib.mkOption { - type = lib.types.str; - default = "synology.local"; - description = "Synology hostname or IP"; - }; - - paths = lib.mkOption { - type = lib.types.listOf lib.types.str; - default = []; - description = "Paths to backup"; - }; - }; - - config = lib.mkIf cfg.enable { - # Install Borg - environment.systemPackages = [ pkgs.borgbackup ]; - - # Borg backup service - services.borgbackup.jobs.synology = { - paths = cfg.paths; - - exclude = [ - "**/.cache" - "**/.local/share/Trash" - "**/node_modules" - "**/target" - "**/result" - "**/.direnv" - ]; - - repo = "borg@${cfg.synologyHost}:/volume1/backups/${config.networking.hostName}"; - - encryption = { - mode = "repokey-blake2"; - passCommand = "cat /run/secrets/borg-passphrase"; - }; - - compression = "auto,zstd"; - - startAt = "daily"; - - prune.keep = { - daily = 7; - weekly = 4; - monthly = 6; - }; - - preHook = '' - echo "Starting backup to Synology..." - ''; - - postHook = '' - echo "Backup completed at $(date)" - ''; - }; - - # Add borg passphrase secret - sops.secrets.borg-passphrase = { - sopsFile = ../../secrets.yaml; # or per-host secrets - }; - }; -} -``` - -### 4. 
Enable on Orion - -```nix -# systems/orion/default.nix -modules.services.backup = { - enable = true; - synologyHost = "synology.local"; # or IP address - paths = [ - "/home/syg/Documents" - "/home/syg/Pictures" - "/home/syg/.config" - "/etc/nixos" - ]; -}; -``` - ---- - -## Verification Checklist - -After completing all days: - -- [ ] Pre-flight script works and catches issues -- [ ] Validation script confirms successful deploys -- [ ] `just` commands are muscle memory -- [ ] Core/optional migration complete -- [ ] Can rebuild both systems successfully -- [ ] Manual Borg backup to Synology works -- [ ] (Optional) Automated Borg backup running - ---- - -## Next Steps (Week 2+) - -See `COMPARISON-ANALYSIS.md` for: -- Phase 2: YubiKey integration -- Phase 3: Homelab expansion (Proxmox, Frigate) -- Phase 4: Testing infrastructure, monitoring - ---- - -## Troubleshooting - -### Pre-flight Script Fails - -**Problem:** "SSH connection failed" -- Check: `ssh -v jarvis@192.168.1.7` (verbose output) -- Verify: SSH key is added (`ssh-add -L`) -- Verify: User exists on Cortex (`just ssh-cortex "whoami"`) - -**Problem:** "Not a NixOS system" -- Check: `just ssh-cortex "cat /etc/os-release"` -- Verify: You're deploying to correct IP - -### Validation Script Fails - -**Problem:** "Lost SSH access" -- **DO NOT PANIC** - System is still running -- Try: Wait 30 seconds, SSH may be restarting -- Try: Reboot from physical access -- Last resort: Boot from live USB, rollback to previous generation - -**Problem:** "Critical services failed" -- Check: `just ssh-cortex "systemctl status sshd"` -- Check: `just ssh-cortex "systemctl status NetworkManager"` -- Fix: Restart service or rollback - -### Just Commands Not Found - -**Problem:** `command not found: just` -- Install: Add to `environment.systemPackages` -- Rebuild: `sudo nixos-rebuild switch` -- Verify: `which just` - -### Borg Backup Fails - -**Problem:** "Connection refused" -- Check: Synology SSH is enabled (Control Panel > 
Terminal & SNMP) -- Check: borg user exists on Synology -- Test: `ssh borg@synology.local` (should prompt for password) - -**Problem:** "Repository not found" -- Check: Path exists: `ssh borg@synology.local "ls -la /volume1/backups/orion"` -- Reinitialize: `borg init --encryption=repokey-blake2 borg@synology.local:/volume1/backups/orion` - ---- - -**End of Quick Wins Guide** - -*These changes will immediately improve your deployment reliability and workflow consistency.* diff --git a/docs/ROADMAP.md b/docs/ROADMAP.md deleted file mode 100644 index 0ee2dbc..0000000 --- a/docs/ROADMAP.md +++ /dev/null @@ -1,441 +0,0 @@ -# Implementation Roadmap (Visual) - -``` -╔══════════════════════════════════════════════════════════════════════════╗ -β•‘ NIXOS CONFIGURATION IMPROVEMENT ROADMAP β•‘ -β•‘ β•‘ -β•‘ From: Unstable deploys, unclear architecture, no backups β•‘ -β•‘ To: Production-ready fleet with automated operations β•‘ -β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β• - - -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - WEEK 1: CRITICAL FOUNDATIONS (10 hours) - DO THESE FIRST -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - -Day 1-2: DEPLOYMENT SAFETY βœ… COMPLETED -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: Scripts created and tested β”‚ -β”‚ β”‚ -β”‚ βœ… scripts/pre-flight.sh - Validates before deploy β”‚ -β”‚ βœ… scripts/validate.sh - Confirms after deploy β”‚ -β”‚ βœ… scripts/safe-deploy.sh - Orchestrates both β”‚ -β”‚ β”‚ -β”‚ Remaining: Integrate as default deployment method β”‚ -β”‚ Impact: 90% of deployment failures preventable β”‚ 
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - ↓ -Day 3: JUST AUTOMATION βœ… COMPLETED -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: justfile exists with task automation β”‚ -β”‚ β”‚ -β”‚ βœ… justfile created with rebuild/deploy/check commands β”‚ -β”‚ βœ… Just installed in system packages β”‚ -β”‚ β”‚ -β”‚ Usage: β”‚ -β”‚ βœ… just --list (see all commands) β”‚ -β”‚ βœ… Consistent workflows, fewer mistakes β”‚ -β”‚ β”‚ -β”‚ Impact: Standardized operations across systems β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - ↓ -Day 4: DOCUMENTATION βœ… COMPLETED | Day 5: CORE/OPTIONAL ❌ NOT STARTED -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Day 4 Status: Documentation comprehensive and up-to-date β”‚ -β”‚ βœ… docs/PROJECT-OVERVIEW.md updated (Oct 29, 2025) β”‚ -β”‚ βœ… docs/ARCHITECTURE.md created (500+ lines) β”‚ -β”‚ βœ… IMPLEMENTATION-GUIDE.md updated with status tracking β”‚ -β”‚ β”‚ -β”‚ Day 5 Status: Core/Optional architecture NOT implemented β”‚ -β”‚ ❌ Module structure still flat (base/hardware/services) β”‚ -β”‚ ❌ Blocked by Day 3 planning (audit needed) β”‚ -β”‚ β”‚ -β”‚ Target Architecture: Current State: β”‚ -β”‚ modules/system/ modules/system/ β”‚ -β”‚ β”œβ”€β”€ core/ β”œβ”€β”€ base/ β”‚ -β”‚ β”‚ β”œβ”€β”€ nix.nix β”œβ”€β”€ hardware/ β”‚ -β”‚ β”‚ β”œβ”€β”€ security.nix β”œβ”€β”€ services/ β”‚ -β”‚ β”‚ └── ssh.nix └── wayland/ β”‚ 
-β”‚ β”œβ”€β”€ optional/ β”‚ -β”‚ β”‚ β”œβ”€β”€ hardware/ β”‚ -β”‚ β”‚ β”œβ”€β”€ services/ β”‚ -β”‚ β”‚ └── wayland/ β”‚ -β”‚ └── users/ β”‚ -β”‚ β”‚ -β”‚ Remaining: Audit modules, create MIGRATION-PLAN.md (est 4hrs) β”‚ -β”‚ Impact: Scales to 10+ systems, clarity, consistency β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - ↓ -Day 6-7: AUTOMATED BACKUPS ❌ NOT STARTED (HIGH PRIORITY) -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: NO AUTOMATED BACKUPS - Synology DS-920+ unused β”‚ -β”‚ β”‚ -β”‚ Current Situation: β”‚ -β”‚ ❌ No data protection for either system β”‚ -β”‚ ❌ Synology available but not configured β”‚ -β”‚ ❌ Borg not installed or tested β”‚ -β”‚ β”‚ -β”‚ Plan (2 hours estimated): β”‚ -β”‚ Day 6: Manual Borg backup to Synology (test) β”‚ -β”‚ Day 7: Automated module with systemd timer β”‚ -β”‚ β”‚ -β”‚ Target Result: β”‚ -β”‚ βœ… Orion β†’ Synology (daily, 7d/4w/6m retention) β”‚ -β”‚ βœ… Cortex β†’ Synology (daily, 7d/4w/6m retention) β”‚ -β”‚ β”‚ -β”‚ Priority: P1 (HIGHEST) - Critical data protection gap β”‚ -β”‚ Impact: Data protection, disaster recovery β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - ↓ - ⏳ Week 1 PARTIALLY COMPLETE - βœ… Deployment safety (scripts created & tested) - βœ… Just automation (justfile operational) - βœ… Documentation (comprehensive & accurate) - ❌ Core/Optional architecture (not started - est 4hrs) - ❌ Automated backups (not started - HIGH PRIORITY, est 2hrs) - - -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - WEEK 2-4: 
HIGH PRIORITY (15 hours) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - -Week 2: DOCUMENTATION & SECRETS βœ… COMPLETED -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ βœ… docs/PROJECT-OVERVIEW.md updated (Oct 29, 2025) β”‚ -β”‚ βœ… docs/ARCHITECTURE.md created (comprehensive) β”‚ -β”‚ βœ… .sops.yaml configured with creation rules β”‚ -β”‚ βœ… `just rekey` automation added to justfile β”‚ -β”‚ βœ… Secrets workflow documented in SECRETS-*.md files β”‚ -β”‚ β”‚ -β”‚ Files: docs/PROJECT-OVERVIEW.md, ARCHITECTURE.md, SECRETS-*.md β”‚ -β”‚ Access: nixos-secrets/ repo with age encryption β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -Week 3: COMPLETE CORTEX PROVISIONING βœ… COMPLETED -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: RTX 5090 configured with AI services operational β”‚ -β”‚ β”‚ -β”‚ βœ… NVIDIA drivers (open kernel modules for Blackwell) β”‚ -β”‚ βœ… CUDA toolkit (with uvm_disable_hmm=1 workaround) β”‚ -β”‚ βœ… RTX 5090 functionality (32GB VRAM accessible) β”‚ -β”‚ βœ… Ollama with CUDA acceleration β”‚ -β”‚ βœ… 6 models loaded (llama3.2:3b β†’ mixtral:8x7b) β”‚ -β”‚ βœ… GPU-accelerated inference tested and working β”‚ -β”‚ βœ… modules/system/ai-services documented β”‚ -β”‚ β”‚ -β”‚ Note: Open WebUI disabled temporarily (ctranslate2 build) β”‚ -β”‚ Access: ssh jarvis@192.168.1.7, Ollama API port 11434 β”‚ 
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -Week 4: YUBIKEY INTEGRATION ❌ NOT STARTED (Optional) -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: Optional security enhancement, not yet implemented β”‚ -β”‚ β”‚ -β”‚ Planned Tasks: β”‚ -β”‚ ❌ Configure PAM for U2F β”‚ -β”‚ ❌ Register YubiKey on Orion β”‚ -β”‚ ❌ Register YubiKey on Cortex β”‚ -β”‚ ❌ Test: sudo with touch β”‚ -β”‚ ❌ Test: SSH with YubiKey β”‚ -β”‚ ❌ Test: Git signing with YubiKey β”‚ -β”‚ β”‚ -β”‚ Priority: Low (enhancement, not critical) β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - - -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - MONTH 2: ENHANCEMENT (20 hours) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - -CUSTOM LIBRARY FUNCTIONS -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Create: lib/custom.nix β”‚ -β”‚ - autoImport helper (replaces fileFilter) β”‚ -β”‚ - Marvel-themed user helpers β”‚ -β”‚ - Custom library extensions β”‚ -β”‚ β”‚ -β”‚ Update: flake.nix to extend lib β”‚ -β”‚ Use: lib.custom throughout configs β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -TESTING INFRASTRUCTURE 
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Add: pre-commit-hooks to flake inputs β”‚ -β”‚ Create: checks.nix β”‚ -β”‚ - nixfmt (formatting) β”‚ -β”‚ - statix (linting) β”‚ -β”‚ - deadnix (unused code detection) β”‚ -β”‚ - shellcheck (script validation) β”‚ -β”‚ β”‚ -β”‚ Set up: Git pre-commit hook β”‚ -β”‚ Run: `just check` before every commit β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -VPN REMOTE ACCESS -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Evaluate: Headscale for remote access β”‚ -β”‚ Deploy: VPN server on Proxmox (or Orion) β”‚ -β”‚ Configure: All systems as VPN clients β”‚ -β”‚ Test: Remote access from outside network β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - - -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - MONTH 3: HOMELAB EXPANSION (30+ hours) -━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ - -FLEET MANAGEMENT EXPANSION ❌ NOT STARTED (3-4 hours) -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: Using deploy-rs, planning Colmena migration β”‚ -β”‚ β”‚ -β”‚ Current Tools: β”‚ -β”‚ βœ… deploy-rs - Sequential deployments β”‚ -β”‚ βœ… just - Task automation β”‚ -β”‚ βœ… fleet.sh - System discovery and 
management β”‚ -β”‚ βœ… safe-deploy.sh - Pre/post-flight checks β”‚ -β”‚ β”‚ -β”‚ Planned Migration to Colmena (when ready): β”‚ -β”‚ ❌ Parallel deployment to multiple systems β”‚ -β”‚ ❌ Tag-based targeting (@server, @ai, @homelab) β”‚ -β”‚ ❌ Simpler fleet configuration β”‚ -β”‚ ❌ Fleet-wide health checks and monitoring β”‚ -β”‚ β”‚ -β”‚ Prerequisites: β”‚ -β”‚ - Colmena supports current flake syntax β”‚ -β”‚ - Managing 5+ systems (worth the migration effort) β”‚ -β”‚ - Need parallel deployment time savings β”‚ -β”‚ β”‚ -β”‚ Implementation: β”‚ -β”‚ 1. Add Colmena to flake inputs (30 min) β”‚ -β”‚ 2. Configure colmena output with tags (1 hour) β”‚ -β”‚ 3. Test parallel deployment (1 hour) β”‚ -β”‚ 4. Update justfile commands (30 min) β”‚ -β”‚ 5. Update fleet.sh to use Colmena (1 hour) β”‚ -β”‚ β”‚ -β”‚ Benefits: β”‚ -β”‚ - Deploy to 5+ systems in ~5 minutes (vs 25+ sequential) β”‚ -β”‚ - Tag-based updates (update @server, @homelab, etc.) β”‚ -β”‚ - Simplified fleet operations β”‚ -β”‚ - Fleet-wide command execution (colmena exec) β”‚ -β”‚ β”‚ -β”‚ Reference: docs/FLEET-FUTURE.md β”‚ -β”‚ Priority: Medium (wait until 5+ systems) β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -EXPANDED JUST AUTOMATION βœ… IN PROGRESS (Ongoing) -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Status: Basic justfile operational, expansion planned β”‚ -β”‚ β”‚ -β”‚ Current Commands: β”‚ -β”‚ βœ… just deploy-cortex (with safety checks) β”‚ -β”‚ βœ… just rebuild-orion (local rebuild) β”‚ -β”‚ βœ… just update-secrets (automatic sync) β”‚ -β”‚ βœ… just check-cortex (pre-flight) β”‚ -β”‚ βœ… just rekey (secrets management) β”‚ -β”‚ β”‚ -β”‚ Planned Expansions: β”‚ -β”‚ ❌ just 
deploy-all (when Colmena added) β”‚ -β”‚ ❌ just fleet-status (health check all systems) β”‚ -β”‚ ❌ just fleet-uptime (uptime across fleet) β”‚ -β”‚ ❌ just backup-all (trigger backups on all systems) β”‚ -β”‚ ❌ just monitor (open Grafana dashboard - Month 4+) β”‚ -β”‚ β”‚ -β”‚ Priority: Low (incremental improvement) β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -PROXMOX SERVER -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Plan: Server specs, disk layout β”‚ -β”‚ Build: Physical server β”‚ -β”‚ Create: systems/proxmox/ config β”‚ -β”‚ Bootstrap: With your new safe-deploy.sh β”‚ -β”‚ Deploy: `just deploy-proxmox` β”‚ -β”‚ Validate: All services running β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -FRIGATE NVR (on Proxmox VM) -β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” -β”‚ Create: VM for Frigate β”‚ -β”‚ Configure: Camera streams β”‚ -β”‚ Set up: Motion detection, recording β”‚ -β”‚ Integrate: With Home Assistant β”‚ -β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ - -JELLYFIN MEDIA SERVER (on Proxmox VM) 
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Create: VM for Jellyfin β”‚
-β”‚ Configure: Media libraries β”‚
-β”‚ Set up: Hardware transcoding β”‚
-β”‚ Test: Streaming to devices β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-HOME ASSISTANT (on Proxmox VM)
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Create: VM for Home Assistant β”‚
-β”‚ Integrate: Frigate cameras β”‚
-β”‚ Configure: Automations β”‚
-β”‚ Set up: Dashboard β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-                                   ↓
-                               YOUR FLEET
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ βœ… Orion - Laptop (Dev workstation) β”‚
-β”‚ βœ… Cortex - AI rig (LLM inference, gaming) β”‚
-β”‚ βœ… Proxmox - Homelab server β”‚
-β”‚ β”œβ”€β”€ Frigate (NVR) β”‚
-β”‚ β”œβ”€β”€ Jellyfin (Media) β”‚
-β”‚ └── Home Assistant (Automation) β”‚
-β”‚ βœ… Synology - Backup target (DS-920+) β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-                       MONTH 4+: ADVANCED (Optional)
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-MONITORING STACK
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Deploy: Prometheus + Grafana β”‚
-β”‚ Monitor: β”‚
-β”‚ - System metrics (CPU, RAM, disk) β”‚
-β”‚ - GPU usage (RTX 5090) β”‚
-β”‚ - Backup success/failure β”‚
-β”‚ - Deployment status β”‚
-β”‚ β”‚
-β”‚ Alert: Slack/Discord when issues detected β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-IMPERMANENCE (EmergentMind's Stage 8)
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Concept: Root filesystem ephemeral, only /persist survives β”‚
-β”‚ Benefits: Clean state, forced intentional persistence β”‚
-β”‚ Warning: Complex to implement correctly β”‚
-β”‚ Status: Not urgent, wait until stable foundation β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-SECURE BOOT (Lanzaboote)
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Concept: UEFI Secure Boot for NixOS β”‚
-β”‚ Benefits: Prevents bootkit malware β”‚
-β”‚ Status: Advanced feature, Month 6+ β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-OFFSITE BACKUPS
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Set up: Cloud backup (Backblaze B2 or Wasabi) β”‚
-β”‚ Schedule: Monthly encrypted uploads β”‚
-β”‚ Retention: 1 year β”‚
-β”‚ Cost: ~$5-10/month for 500GB β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-                           PROGRESS MILESTONES
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
-β”‚ Milestone β”‚ Success Criteria β”‚
-β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
-β”‚ Week 1 ⏳ β”‚ βœ… 5+ successful deploys without SSH loss β”‚
-β”‚ β”‚ βœ… Using Just for all operations β”‚
-β”‚ β”‚ ❌ Core/optional architecture (not started) β”‚
-β”‚ β”‚ ❌ Daily backups to Synology (HIGH PRIORITY) β”‚
-β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
-β”‚ Month 1 ⏳ β”‚ βœ… Cortex fully provisioned (GPU, CUDA, LLMs) β”‚
-β”‚ β”‚ ❌ YubiKey integration (optional, not started) β”‚
-β”‚ β”‚ βœ… Documentation updated throughout β”‚
-β”‚ β”‚ βœ… Secrets automation with Just β”‚
-β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
-β”‚ Month 3 ❌ β”‚ ❌ Proxmox server operational β”‚
-β”‚ β”‚ ❌ Frigate NVR deployed β”‚
-β”‚ β”‚ ❌ Jellyfin media server running β”‚
-β”‚ β”‚ ❌ Home Assistant integrated β”‚
-β”‚ β”‚ ❌ Full fleet managed with Just β”‚
-β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
-β”‚ Month 6 ❌ β”‚ ❌ Monitoring stack (Prometheus, Grafana) β”‚
-β”‚ β”‚ ❌ Testing infrastructure (CI/CD) β”‚
-β”‚ β”‚ ❌ Offsite backups to cloud β”‚
-β”‚ β”‚ ❌ Production-grade infrastructure β”‚
-β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
-
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-                             KEY PRINCIPLES
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-1. START SMALL
-   β€’ Focus on P0 (Critical) items first
-   β€’ Don't try to do everything at once
-   β€’ Validate after each change
-
-2. BUILD HABITS
-   β€’ Use `just` for all operations
-   β€’ Run pre-flight before every deploy
-   β€’ Commit before risky changes
-   β€’ Update docs/TODO-CHECKLIST.md daily
-
-3. DOCUMENT LEARNINGS
-   β€’ What worked well?
-   β€’ What was harder than expected?
-   β€’ What would you do differently?
-   β€’ Share with community (optional)
-
-4. ITERATE & IMPROVE
-   β€’ Your config will evolve
-   β€’ EmergentMind took 2 years, 11 stages
-   β€’ Progress over perfection
-   β€’ Enjoy the journey!
-
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-                               RESOURCES
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-πŸ“– Your Documentation
-   β€’ COMPARISON-ANALYSIS.md - Detailed analysis & recommendations
-   β€’ IMPLEMENTATION-GUIDE.md - Week 1 implementation guide
-   β€’ docs/TODO-CHECKLIST.md - Progress tracker
-   β€’ ANALYSIS-SUMMARY.md - This summary
-
-πŸ“– EmergentMind's Resources
-   β€’ GitHub: github.com/EmergentMind/nix-config
-   β€’ Website: unmovedcentre.com
-   β€’ Anatomy Article: unmovedcentre.com/posts/anatomy-of-a-nixos-config
-   β€’ YouTube: youtube.com/@Emergent_Mind
-
-πŸ“– Learning Resources
-   β€’ VimJoyer: youtube.com/@vimjoyer (excellent tutorials)
-   β€’ Misterio77's Starter: github.com/Misterio77/nix-starter-configs
-   β€’ NixOS Wiki: nixos.wiki
-   β€’ NixOS Discourse: discourse.nixos.org
-
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-
-   πŸš€ CURRENT STATUS (October 29, 2025)
-
-   Week 1: ⏳ PARTIALLY COMPLETE
-   βœ… Deployment Safety (Day 1-2) - Scripts created & tested
-   βœ… Just Automation (Day 3) - justfile operational
-   βœ… Documentation (Day 4) - Comprehensive & accurate
-   ❌ Core/Optional Architecture (Day 5) - Not started
-   ❌ Automated Backups (Day 6-7) - HIGH PRIORITY
-
-   🎯 NEXT PRIORITIES:
-
-   1. P1 (CRITICAL): Automated Backups - 2 hours
-      β†’ No data protection currently! Synology available but unused
-      β†’ See IMPLEMENTATION-GUIDE.md Day 6-7 for implementation
-
-   2. P2 (HIGH): Core/Optional Architecture - 4 hours
-      β†’ Blocked by Day 3 planning (audit + MIGRATION-PLAN.md)
-      β†’ Enables scaling to 10+ systems
-
-   3. P3 (MEDIUM): Integrate Pre-flight Scripts - 1 hour
-      β†’ Make safe-deploy.sh the default deployment method
-
-   Remember: Progress over perfection. Focus on P1 first! 🎯
-
-━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
-```
diff --git a/docs/SECURITY-ROADMAP.md b/docs/SECURITY-ROADMAP.md
deleted file mode 100644
index 4717801..0000000
--- a/docs/SECURITY-ROADMAP.md
+++ /dev/null
@@ -1,440 +0,0 @@
-# Security Roadmap
-
-## Future Security Enhancements for Homelab πŸ›‘οΈ
-
-**Status:** Conceptual / Not Yet Implemented
-**Purpose:** Document potential security improvements and architectures to consider for future homelab expansion
-
-> **Note:** This document describes *potential* security architectures, not the current implementation. See `systems/cortex/default.nix` and other system configs for actual current security measures.
-
-## User Isolation Strategy Concepts
-
-### Current Setup (As Implemented)
-```
-Cortex (AI Server):
-└── jarvis: πŸ‘€ Admin user
-    β”œβ”€β”€ SSH: βœ… Key-only auth
-    β”œβ”€β”€ Sudo: βœ… Full access
-    └── Service Users:
-        └── friday: πŸ€– AI services (isolated)
-
-Orion (Workstation):
-└── syg: πŸ‘€ Primary user
-    β”œβ”€β”€ Desktop environment
-    └── Development tools
-```
-
-### Potential Enhanced Architecture (Future Consideration)
-```
-Concept: Functional Role-Based Access
-└── Admin tier:
-    β”œβ”€β”€ admin: πŸ‘€ System administration
-    └── Service tier:
-        β”œβ”€β”€ ai-svc: πŸ€– AI services (Ollama, etc.)
-        β”œβ”€β”€ monitor-svc: πŸ›‘οΈ Security monitoring (fail2ban, intrusion detection)
-        └── metrics-svc: πŸ“Š Analytics services (Prometheus, Grafana)
-```
-
-### Potential Security Improvements to Consider
-
-| Feature | Current State | Potential Enhancement | Benefit |
-|---------|---------------|----------------------|---------|
-| **User Naming** | Descriptive (jarvis, friday) | Role-based (admin, ai-svc) | βœ… Clearer purpose |
-| **Service Isolation** | Some isolation (friday user) | Full isolation per service | βœ… Better containment |
-| **Network Segmentation** | Ubiquiti firewall rules | VLANs + microsegmentation | βœ… Layer 2 isolation |
-| **Audit Granularity** | System-level auditd | Per-user audit trails | βœ… Enhanced forensics |
-| **Secret Management** | sops-nix (current) | Vault or similar | βœ… Dynamic secrets |
-| **Zero Trust** | Firewall + SSH keys | mTLS + per-service auth | βœ… Defense in depth |
-
-## Network Architecture Considerations
-
-### Current Homelab Setup
-```
-Internet β†’ Ubiquiti UDM/USW
-β”œβ”€β”€ Orion (Workstation) - 192.168.1.100
-β”œβ”€β”€ Cortex (AI Server) - 192.168.1.7
-β”œβ”€β”€ Nexus (Homelab Services) - 192.168.1.10
-└── Synology DS920+ - 192.168.1.50
-```
-
-### Potential VLAN Segmentation (Future)
-```
-Management VLAN (10):
-β”œβ”€β”€ Ubiquiti devices
-└── Admin workstations
-
-Server VLAN (20):
-β”œβ”€β”€ Cortex (AI)
-└── Nexus (Services)
-
-IoT/Camera VLAN (30):
-β”œβ”€β”€ Security cameras (when added)
-└── Smart home devices
-
-Storage VLAN (40):
-└── Synology NAS
-```
-
-## Potential Operational Improvements
-
-### Service User Isolation Pattern
-```nix
-# Example pattern for isolated service users
-users.users.servicename = {
-  isSystemUser = true;
-  group = "servicename";
-  home = "/var/lib/servicename";
-  createHome = true;
-};
-
-# Limited sudo rules
-security.sudo.extraRules = [{
-  users = [ "servicename" ];
-  commands = [{
-    command = "/run/current-system/sw/bin/systemctl restart servicename-*";
-    options = [ "NOPASSWD" ];
-  }];
-}];
-```
-
-### Ubiquiti Integration Ideas
-
-**Firewall Rules (UDM/USG):**
-- Geo-blocking for SSH (allow only home country)
-- Rate limiting on management ports
-- IDS/IPS for anomaly detection
-
-**Network Policies:**
-- IoT device isolation (cameras can't reach internet)
-- Inter-VLAN rules (storage only accessible from server VLAN)
-- Guest network completely isolated
-
-## Security Principles Worth Considering
-
-### Currently Implemented βœ…
-1. **SSH Key-Only Authentication** - No passwords accepted
-2. **Root Login Disabled** - Must sudo from user account
-3. **Fail2ban** - Automatic IP blocking for brute force attempts
-4. **Audit Logging** - Track security-relevant events
-5. **Firewall** - Restrictive rules, LAN-only access
-6. **Service Isolation** - Separate user for AI services (friday)
-
-### Potential Future Enhancements ⏳
-1. **Network Segmentation** - VLANs for different security zones
-2. **mTLS** - Mutual TLS for service-to-service communication
-3. **Hardware Security Keys** - YubiKey/Nitrokey for admin access
-4. **Intrusion Detection** - Suricata/Snort on Ubiquiti
-5. **Automated Backups** - Encrypted, tested backup strategy (see Infrastructure section below)
-6. **Secret Rotation** - Automatic credential rotation
-7. **Monitoring Dashboard** - Centralized security monitoring (Grafana/Prometheus)
-
-## Infrastructure & Automation Roadmap
-
-### Backup Strategy (High Priority)
-
-**Goal:** Automated encrypted backups to Synology DS-920+
-
-**Borg Backup Implementation:**
-```nix
-# Potential configuration pattern
-services.borgbackup.jobs.synology = {
-  paths = [
-    "/home"
-    "/var/lib"
-    "/etc/nixos"
-  ];
-  repo = "borg@synology.local:/volume1/backups/nixos";
-  encryption = {
-    mode = "repokey-blake2";
-    passCommand = "cat /run/secrets/borg-passphrase";
-  };
-  compression = "auto,zstd";
-  startAt = "daily";
-  prune.keep = {
-    daily = 7;
-    weekly = 4;
-    monthly = 6;
-  };
-};
-```
-
-**Benefits:**
-- Deduplication (saves space)
-- Encryption at rest
-- Incremental backups (fast)
-- Retention policy (automatic cleanup)
-- Works with existing Synology
-
-**Systems to Back Up:**
-1. **Orion** - Development work, dotfiles, home directory
-2. **Cortex** - AI models, datasets, configurations
-3. **Nexus** - Homelab services data
-
-### Task Automation with Just
-
-**Goal:** Standardize common workflows with a task runner
-
-**Why Just?**
-- Simpler syntax than Makefiles
-- Better error messages
-- Self-documenting (`just --list`)
-- Cross-platform (works on macOS for future Darwin configs)
-
-**Comprehensive Justfile Example:**
-```just
-# Default: show all available commands
-default:
-    @just --list
-
-# ====== LOCAL OPERATIONS ======
-
-# Rebuild local system (use 'nos' alias or this)
-rebuild:
-    sudo nixos-rebuild switch --flake .
-
-# Rebuild specific host locally
-rebuild-host HOST:
-    sudo nixos-rebuild switch --flake .#{{HOST}}
-
-# Rebuild with full trace for debugging
-rebuild-trace HOST:
-    sudo nixos-rebuild switch --flake .#{{HOST}} --show-trace
-
-# Update all flake inputs
-update:
-    nix flake update
-
-# Update specific input
-update-input INPUT:
-    nix flake update {{INPUT}}
-
-# ====== FLEET OPERATIONS ======
-
-# List all systems
-fleet-list:
-    ./scripts/fleet.sh list
-
-# Deploy to remote system
-deploy SYSTEM:
-    ./scripts/fleet.sh deploy {{SYSTEM}}
-
-# Check system health (named fleet-check to avoid clashing with `check` below)
-fleet-check SYSTEM:
-    ./scripts/fleet.sh check {{SYSTEM}}
-
-# Build system configuration (no deploy)
-build SYSTEM:
-    ./scripts/fleet.sh build {{SYSTEM}}
-
-# ====== SECRETS MANAGEMENT ======
-
-# Edit secrets for a system
-edit-secrets SYSTEM:
-    sops ~/.config/nixos-secrets/secrets.yaml
-
-# Rekey all SOPS secrets after key changes
-rekey:
-    @echo "Rekeying secrets..."
-    cd ~/.config/nixos-secrets && \
-    sops updatekeys secrets.yaml && \
-    git add secrets.yaml && \
-    git commit -m "chore: rekey secrets" && \
-    git push
-
-# Update secrets flake input
-update-secrets:
-    nix flake lock --update-input nixos-secrets
-
-# ====== DEVELOPMENT ======
-
-# Validate all configurations build
-check:
-    nix flake check --show-trace
-
-# Format all Nix files
-fmt:
-    nixfmt **/*.nix
-
-# Show flake metadata and inputs
-info:
-    nix flake metadata
-
-# Clean old build artifacts
-clean:
-    rm -rf result
-    nix-collect-garbage -d
-
-# ====== GIT OPERATIONS ======
-
-# Git status with flake info
-status:
-    @git status
-    @echo "\nπŸ“¦ Flake Inputs:"
-    @nix flake metadata | grep -A 10 "Inputs:"
-
-# Commit with conventional commit message
-commit MSG:
-    git add -A
-    git commit -m "{{MSG}}"
-
-# ====== QUICK ACCESS ======
-
-# SSH into cortex
-ssh-cortex:
-    ssh jarvis@cortex
-
-# SSH into orion
-ssh-orion:
-    ssh syg@orion
-```
-
-**Usage Examples:**
-```bash
-# See all available commands
-just
-
-# Local rebuild
-just rebuild
-
-# Deploy to remote system
-just deploy cortex
-
-# Update everything and rebuild
-just update
-just rebuild
-
-# Manage secrets
-just edit-secrets cortex
-just rekey
-
-# Development workflow
-just check                         # Validate configs
-just fmt                           # Format code
-just commit "feat: add new module"
-```
-
-**Integration with Existing Tools:**
-- Wraps `fleet.sh` for common tasks
-- Standardizes `nos` and manual rebuilds
-- Simplifies secrets management
-- Provides memorable aliases for SSH
-
-### Code Quality Automation
-
-**Pre-commit Hooks:**
-```nix
-# Potential addition to devenv.nix or flake.nix
-pre-commit.hooks = {
-  nixfmt.enable = true;                 # Format Nix files
-  statix.enable = true;                 # Lint Nix code
-  deadnix.enable = true;                # Find unused code
-  check-merge-conflicts.enable = true;
-};
-```
-
-**Benefits:**
-- Consistent code formatting
-- Catch errors before commits
-- Maintain code quality
-- Prevent broken configs in git
-
-### Home Manager Profiles Pattern (Future Scaling)
-
-**Goal:** Reduce duplication across multiple systems with reusable profiles
-
-**When to Implement:** When adding 4+ systems (Proxmox VMs, additional workstations)
-
-**Pattern:**
-```
-modules/home/
-β”œβ”€β”€ profiles/           # Bundles of related modules
-β”‚   β”œβ”€β”€ desktop.nix     # Full desktop environment
-β”‚   β”œβ”€β”€ minimal.nix     # Shell-only (servers/VMs)
-β”‚   └── development.nix # Development tools
-└── programs/           # Individual programs (current)
-```
-
-**Example Implementation:**
-```nix
-# modules/home/profiles/desktop.nix
-{ ... }: {
-  modules.programs = {
-    # Window manager
-    hyprland.enable = true;
-    hyprpanel.enable = true;
-    hypridle.enable = true;
-
-    # Browsers
-    brave.enable = true;
-    librewolf.enable = true;
-
-    # Development
-    vscode.enable = true;
-    zsh.enable = true;
-    screenshots.enable = true;
-  };
-}
-
-# modules/home/profiles/minimal.nix
-{ ... }: {
-  modules.programs = {
-    zsh.enable = true;  # Just shell essentials
-  };
-}
-
-# Usage in system configs:
-# systems/orion/homes/syg.nix
-imports = [ ../../../../modules/home/profiles/desktop.nix ];
-
-# systems/proxmox-vm/homes/admin.nix
-imports = [ ../../../../modules/home/profiles/minimal.nix ];
-```
-
-**Benefits:**
-- Single source of truth for "desktop" configuration
-- Easy to maintain consistency across workstations
-- Quick setup for new systems
-- Clear separation between profiles (desktop vs server)
-
-**Reference:** Pattern inspired by Misterio77 and m3tam3re's configs
-
-### Monitoring & Alerting (Future)
-
-**Potential Stack:**
-- **Prometheus** - Metrics collection
-- **Grafana** - Visualization
-- **Loki** - Log aggregation
-- **Alertmanager** - Notifications
-
-**What to Monitor:**
-- System resources (CPU, RAM, disk)
-- Service health (fail2ban, sshd, etc.)
-- Backup success/failure
-- Disk space warnings
-- Temperature (GPU on Cortex)
-
-## Implementation Considerations
-
-### What Works Well for Homelab
-- NixOS declarative configuration (reproducible security)
-- SSH key authentication (simple, effective)
-- Ubiquiti ecosystem (integrated firewall/IDS/IPS)
-- Tailscale (secure remote access without port forwarding)
-
-### What Might Be Overkill
-- Enterprise IAM systems (Keycloak, etc.)
-- Full zero-trust architecture
-- Extensive compliance frameworks
-- 24/7 SOC monitoring
-
-### Sweet Spot for Homelab Security
-1. Strong firewall rules (Ubiquiti)
-2. SSH keys + Fail2ban
-3. Basic VLANs (trusted/IoT/guest)
-4. Automated updates (NixOS)
-5. Regular backups (encrypted)
-6. Monitoring alerts (critical services only)
-
----
-
-**Remember:** Security is about risk management. Perfect security doesn't exist, but you can be secure enough for your threat model. For a homelab, focus on the high-impact, low-effort wins! 🎯
diff --git a/docs/SECURITY-SCANNING.md b/docs/SECURITY-SCANNING.md
deleted file mode 100644
index 598f410..0000000
--- a/docs/SECURITY-SCANNING.md
+++ /dev/null
@@ -1,275 +0,0 @@
-# Security Scanning Guide
-
-**Last Updated:** November 2, 2025
-
-This guide covers the secret scanning tools integrated into the NixOS dotfiles repository.
-
----
-
-## πŸ” Available Tools
-
-### git-secrets
-
-**Purpose:** Prevent committing secrets to git repositories
-**Type:** Pre-commit hook + manual scanning
-**Status:** βœ… Active (pre-commit hook installed)
-
-#### Features
-- Prevents commits containing secrets
-- Scans for AWS keys, passwords, tokens, SSH keys
-- Runs automatically on every `git commit`
-- Custom patterns for API keys, secrets, tokens
-
-#### Usage
-
-```bash
-# Scan all files in the repository
-git secrets --scan
-
-# Scan a specific file
-git secrets --scan path/to/file
-
-# Scan git history (slower, thorough)
-git secrets --scan-history
-
-# List configured patterns
-git secrets --list
-
-# Add a custom pattern
-git secrets --add 'pattern-regex'
-
-# Add an allowed pattern (whitelist)
-git secrets --add --allowed 'safe-pattern'
-
-# Install hooks (already done via devenv)
-git secrets --install
-```
-
-#### Configured Patterns
-
-**Blocked Patterns:**
-- AWS credentials (Access Key ID, Secret Access Key, Account ID)
-- Passwords: `password = "actual-value"`
-- API keys: `api-key = "value"`
-- Secrets: `secret = "value"`
-- Tokens: `token = "value"`
-- SSH keys: `ssh-rsa`, `ssh-ed25519`
-
-**Allowed Patterns (whitelisted):**
-- `example.*password` - Example passwords in documentation
-- `placeholder.*token` - Placeholder tokens
-- `nixos@genesis` - Known safe identifier
-- `your-wifi-password` - Template placeholder
-- `password = "${syncPassword}"` - Nix variable references
-- `password = "password"` - Default/placeholder value
-
----
-
-### TruffleHog
-
-**Purpose:** Deep scanning for secrets in git history
-**Type:** Manual/CI scanning tool
-**Status:** βœ… Available (version 3.90.9)
-
-#### Features
-- Scans entire git history
-- Verifies secrets against live APIs
-- Detects 700+ secret types
-- High accuracy, low false positives
-
-#### Usage
-
-```bash
-# Scan entire repository (verified secrets only)
-trufflehog git file://. --only-verified
-
-# Scan entire repository (including unverified)
-trufflehog git file://.
-
-# Scan specific branch
-trufflehog git file://. --branch main
-
-# Scan with JSON output
-trufflehog git file://. --json
-
-# Scan since a specific commit
-trufflehog git file://. --since-commit
-
-# Scan a remote repository
-trufflehog git https://github.com/user/repo.git
-
-# Scan filesystem (not git)
-trufflehog filesystem /path/to/scan
-```
-
-#### Best Practices
-
-1. **Regular Scanning:** Run weekly or after major changes
-2. **Pre-release Scan:** Always scan before making repository public
-3. **Historical Scan:** Run full history scan periodically
-4. **CI Integration:** Add to GitHub Actions or CI pipeline
-
-#### Example Scan Results
-
-```bash
-# Last scan: November 2, 2025
-trufflehog git file://. --only-verified
-# Result: 2,207 chunks scanned
-# Result: 0 verified secrets, 0 unverified secrets βœ…
-```
-
----
-
-## πŸ”§ Quick Reference Commands
-
-### Daily/Weekly Scans
-
-```bash
-# Quick scan (git-secrets)
-git secrets --scan
-
-# Deep scan (TruffleHog - verified only)
-trufflehog git file://. --only-verified
-
-# Full scan (TruffleHog - all findings)
-trufflehog git file://.
-```
-
-### Before Making Repo Public
-
-```bash
-# Comprehensive scan with git-secrets
-git secrets --scan-history
-
-# Comprehensive scan with TruffleHog
-trufflehog git file://. --json > trufflehog-report.json
-```
-
-### Fixing Found Secrets
-
-If secrets are found:
-
-1. **Rotate the secret immediately** (change password, regenerate API key)
-2. **Remove from git history:**
-   ```bash
-   # Option 1: BFG Repo Cleaner (recommended)
-   bfg --delete-files secret-file.txt
-
-   # Option 2: git-filter-repo
-   git filter-repo --invert-paths --path path/to/secret
-
-   # Option 3: Interactive rebase (for recent commits)
-   git rebase -i HEAD~10
-   ```
-3. **Force push** (if necessary and coordinated with team)
-4. **Verify removal:**
-   ```bash
-   git secrets --scan-history
-   trufflehog git file://. --only-verified
-   ```
-
----
-
-## πŸš€ Automation
-
-### Pre-commit Hook (Already Active)
-
-Location: `.git/hooks/pre-commit` β†’ `scripts/git-hooks/pre-commit`
-
-```bash
-#!/usr/bin/env bash
-git secrets --pre_commit_hook -- "$@"
-```
-
-This hook runs automatically on every commit and blocks commits containing secrets.
-
-### Manual Hook Installation
-
-If the hook is missing or needs reinstallation:
-
-```bash
-# Install git-secrets hooks
-git secrets --install
-
-# Verify installation
-ls -la .git/hooks/ | grep pre-commit
-```
-
-### CI/CD Integration (Future)
-
-Example GitHub Actions workflow:
-
-```yaml
-name: Secret Scanning
-on: [push, pull_request]
-
-jobs:
-  scan:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-        with:
-          fetch-depth: 0
-
-      - name: TruffleHog Scan
-        uses: trufflesecurity/trufflehog@main
-        with:
-          path: ./
-          base: ${{ github.event.repository.default_branch }}
-          head: HEAD
-          extra_args: --only-verified
-```
-
----
-
-## πŸ“Š Scan Status
-
-### Current Status (November 2, 2025)
-
-| Tool | Status | Last Scan | Findings |
-|------|--------|-----------|----------|
-| git-secrets | βœ… Active | Every commit | 0 secrets |
-| TruffleHog | βœ… Available | Nov 2, 2025 | 0 verified secrets |
-
-### Scan Schedule
-
-- **git-secrets:** Automatic on every commit
-- **TruffleHog:** Manual (recommended weekly)
-- **Full history scan:** Monthly or before major releases
-
----
-
-## πŸ›‘οΈ Best Practices
-
-1. **Never commit secrets:** Use SOPS, age encryption, or environment variables
-2. **Use templates:** Store templates in `secrets-example/` directory
-3. **Rotate immediately:** If a secret is committed, rotate it ASAP
-4. **Scan before sharing:** Always scan before making repo public or sharing
-5. **Review patterns:** Regularly update git-secrets patterns
-6. **Monitor CI:** Add secret scanning to CI/CD pipeline
-7. **Educate team:** Ensure all contributors know about these tools
-
----
-
-## πŸ“š Related Documentation
-
-- [SECURITY.md](./SECURITY.md) - Overall security configuration
-- [SECRETS.md](../SECRETS.md) - Secret management with SOPS
-- [devenv.nix](../devenv.nix) - Development environment setup
-
----
-
-## πŸ”— External Resources
-
-- [git-secrets GitHub](https://github.com/awslabs/git-secrets)
-- [TruffleHog GitHub](https://github.com/trufflesecurity/trufflehog)
-- [SOPS Documentation](https://github.com/getsops/sops)
-- [BFG Repo Cleaner](https://rtyley.github.io/bfg-repo-cleaner/)
-
----
-
-**Notes:**
-- Both tools are integrated into the devenv shell
-- git-secrets runs automatically on every commit
-- TruffleHog should be run manually for deep scans
-- All patterns are configurable in `devenv.nix` shellHook
diff --git a/docs/SECURITY.md b/docs/SECURITY.md
deleted file mode 100644
index 5a832a2..0000000
--- a/docs/SECURITY.md
+++ /dev/null
@@ -1,181 +0,0 @@
-# NixOS System Security Guide (Merged)
-
-This guide combines actionable configuration steps ("what"), rationale ("why"), and implementation details ("how") for securing any NixOS system. It is designed for clarity, completeness, and practical use.
-
-## Security Features Implemented
-
-### 1. Fail2ban - Automatic IP Blocking
-**What:**
-```nix
-services.fail2ban = {
-  enable = true;
-  bantime = 3600;   # 1 hour
-  findtime = 600;   # 10 minutes
-  maxretry = 3;
-  ignoreip = [ "127.0.0.1/8" "192.168.1.0/24" ];
-  jails = {
-    sshd = {
-      enabled = true;
-      port = "ssh";
-      filter = "sshd";
-      logpath = "/var/log/auth.log";
-      maxretry = 3;
-    };
-  };
-};
-```
-**Why:** Protects against brute force attacks by automatically banning IPs that fail authentication repeatedly.
-**How:** Adjust ban time, detection window, and whitelisted networks as needed.
-
-### 2. Audit Logging - Security Event Monitoring
-**What:**
-```nix
-security.auditd.enable = true;
-security.audit.enable = true;
-security.audit.rules = [
-  "-w /etc/passwd -p wa -k identity"
-  "-w /etc/group -p wa -k identity"
-  "-w /etc/ssh/sshd_config -p wa -k ssh"
-  "-w /etc/sudoers -p wa -k sudo"
-  "-w /etc/crontab -p wa -k cron"
-  "-w /etc/systemd -p wa -k systemd"
-];
-```
-**Why:** Provides a record of security-relevant events for monitoring, compliance, and forensics.
-**How:** Monitor authentication, sudo, SSH config changes, user/group modifications, logins/logouts, and more.
-
-### 3. SSH Hardening
-**What:**
-```nix
-services.openssh = {
-  enable = true;
-  passwordAuthentication = false;
-  permitRootLogin = "prohibit-password";
-  maxAuthTries = 3;
-  clientAliveInterval = 300;  # 5 minutes
-  clientAliveCountMax = 2;
-  protocol = 2;
-  allowTcpForwarding = false;
-  x11Forwarding = false;
-};
-```
-**Why:** Reduces attack surface and enforces strong authentication.
-**How:** Disable password authentication, restrict root access, limit attempts, set timeouts, and restrict protocols.
-
-### 4. Kernel Security Parameters
-**What:**
-```nix
-boot.kernel.sysctl = {
-  "net.ipv4.ip_forward" = 0;
-  "net.ipv4.conf.all.accept_redirects" = 0;
-  "net.ipv4.conf.all.send_redirects" = 0;
-  "net.ipv4.conf.all.rp_filter" = 1;
-  "net.ipv4.tcp_syncookies" = 1;
-  "net.ipv4.tcp_max_syn_backlog" = 2048;
-  "net.ipv4.tcp_synack_retries" = 2;
-};
-```
-**Why:** Protects against common network attacks and misconfigurations.
-**How:** Set sysctl parameters to disable IP forwarding, ICMP redirects, source routing, and enable SYN flood protection.
-
-### 5. System Monitoring
-**What:**
-```nix
-services.chrony.enable = true;
-systemd.journald = {
-  maxRetentionSec = 2592000;  # 1 month
-  maxFileSize = 1048576000;   # 1GB
-  maxFiles = 10;
-};
-```
-**Why:** Ensures logs are available for troubleshooting and security audits; keeps system time accurate for log integrity.
-**How:** Configure journald for log rotation and retention, enable chronyd or another time sync service.
-
----
-
-## Deployment
-
-### Prerequisites
-1. Ensure nixos-anywhere is installed:
-   ```bash
-   nix-env -iA nixpkgs.nixos-anywhere
-   ```
-2. Ensure SSH key is added to ssh-agent:
-   ```bash
-   ssh-add ~/.ssh/id_ed25519
-   ```
-
-## Verification & Maintenance
-
-After deploying or updating a system, always verify that security services are running and configured as expected:
-```bash
-sudo systemctl status fail2ban auditd
-sudo fail2ban-client status
-sudo auditctl -l
-ssh -o PreferredAuthentications=password <user>@<host>   # should be rejected
-ssh <user>@<host>                                        # key auth should succeed
-```
-
-## Troubleshooting
-
-#### Fail2ban Not Starting
-- Check service status: `sudo systemctl status fail2ban`
-- Check logs: `sudo journalctl -u fail2ban -n 50`
-- Verify configuration: `sudo fail2ban-client -t`
-
-#### Audit Service Issues
-- Check service status: `sudo systemctl status auditd`
-- Check logs: `sudo journalctl -u auditd -n 50`
-- Verify rules: `sudo auditctl -l`
-
-#### SSH Connection Issues
-- Verify SSH service: `sudo systemctl status sshd`
-- Check SSH configuration: `sudo sshd -T`
-- Monitor auth logs: `sudo tail -f /var/log/auth.log`
-
-#### Service Missing After Deployment
-- Check configuration syntax (build logs)
-- Check service dependencies (systemd logs)
-- Check package availability (should be auto-provided)
-
-## Security Monitoring
-
-### Log Locations & Maintenance
-- **Fail2ban logs**: `journalctl -u fail2ban`
-- **Audit logs**: `journalctl -u auditd` and `/var/log/audit/audit.log`
-- **SSH logs**: `/var/log/auth.log` and `journalctl -u sshd`
-- **System logs**: `journalctl -f`
-
-#### Regular Maintenance
-- Review blocked IPs: `sudo fail2ban-client status sshd`
-- Check audit events: `sudo ausearch -ts today`
-- Monitor system logs: `sudo journalctl -f`
-- Update system regularly: `sudo nixos-rebuild switch --flake .#<hostname>`
-
-## Advanced Configuration
-
-### Custom Fail2ban Jails
-```nix
-services.fail2ban.jails = {
-  nginx-noscript = {
-    settings = {
-      enabled = true;
-      filter = "nginx-noscript";
-      logpath = "/var/log/nginx/access.log";
-      maxretry = 6;
-    };
-  };
-};
-```
-
-### Additional Audit Rules
-```nix
-security.audit.rules = [
-  "-w /etc/crontab -p wa -k cron"
-  "-w /etc/systemd -p wa -k systemd"
-];
-```
-
----
-
-This guide provides enterprise-grade security best practices for any NixOS system, while maintaining ease of management through declarative configuration.
diff --git a/docs/TODO-CHECKLIST.md b/docs/TODO-CHECKLIST.md
deleted file mode 100644
index e4522c7..0000000
--- a/docs/TODO-CHECKLIST.md
+++ /dev/null
@@ -1,339 +0,0 @@
-# Implementation Checklist
-
-**Start Date:** October 2025
-**Last Updated:** October 29, 2025
-**Current Status:** Week 1 ~60% Complete (4/7 days done)
-
-Use this checklist to track your progress implementing improvements from the analysis.
-
----
-
-## πŸ“Š Quick Status Overview
-
-**βœ… Completed:**
-- Deployment Safety (Day 1-2) - Scripts created & tested
-- Just Automation (Day 3) - 20+ commands with rebuild-pre hook
-- Documentation (Day 4) - PROJECT-OVERVIEW, ARCHITECTURE, ROADMAP-VISUAL updated
-- Week 2: Secrets management with automatic sync
-- Week 3: Cortex provisioning (RTX 5090 + Ollama operational)
-
-**❌ Not Started (Critical):**
-- Automated Backups (Day 6-7) - **P1 HIGHEST PRIORITY**
-- Core/Optional Architecture (Day 4-5) - P2 (blocked by planning)
-
-**🎯 Immediate Priorities:**
-
-1. **P1 (CRITICAL): Automated Backups** - 2 hours estimated
-   - Current: NO data protection for either system
-   - Impact: Risk of configuration/data loss
-   - Resources: Synology DS-920+ available but unused
-   - See: IMPLEMENTATION-GUIDE.md Day 6-7 for implementation
-
-2. **P2 (HIGH): Core/Optional Architecture** - 4 hours estimated
-   - Current: Flat module structure (base/hardware/services/programs)
-   - Impact: Scaling to 10+ systems, clearer organization
-   - Blocker: Need to create MIGRATION-PLAN.md first
-   - See: IMPLEMENTATION-GUIDE.md Day 3-5 for planning guide
-
-3. **P3 (MEDIUM): Integrate Pre-flight Scripts** - 1 hour estimated
-   - Current: Scripts exist but not default workflow
-   - Impact: Make safe-deploy.sh the standard method
-   - See: Update justfile and FLEET-MANAGEMENT.md
-
----
-
-## βœ… Week 1: Critical Improvements (DO THESE FIRST)
-
-### Day 1-2: Deployment Safety βœ… COMPLETED
-
-- [x] Create `scripts/pre-flight.sh` (from IMPLEMENTATION-GUIDE.md)
-- [x] Create `scripts/validate.sh` (from IMPLEMENTATION-GUIDE.md)
-- [x] Create `scripts/safe-deploy.sh` (from IMPLEMENTATION-GUIDE.md)
-- [x] Make scripts executable: `chmod +x scripts/*.sh`
-- [x] Test pre-flight on Cortex: `./scripts/pre-flight.sh cortex 192.168.1.7 jarvis`
-- [x] Test validation on Cortex: `./scripts/validate.sh cortex 192.168.1.7 jarvis`
-- [x] Do one safe deploy: `./scripts/safe-deploy.sh cortex 192.168.1.7 jarvis`
-- [x] Update FLEET-MANAGEMENT.md with new workflow
-
-**Success Metric:** βœ… Scripts created and tested successfully
-
-**Status:** Scripts exist and work. Remaining: Integrate as default deployment method.
-
----
-
-### Day 3: Just Automation βœ… COMPLETED
-
-- [x] Add `just` to `environment.systemPackages`
-- [x] Rebuild to install Just: `sudo nixos-rebuild switch --flake .#orion`
-- [x] Create `justfile` in repo root (from IMPLEMENTATION-GUIDE.md)
-- [x] Test: `just` (should show command list)
-- [x] Test: `just rebuild-orion`
-- [x] Test: `just check-cortex`
-- [x] Test: `just deploy-cortex`
-- [x] Update README.md with Just commands
-- [x] Commit changes: `git add justfile && git commit -m "feat: add Just automation"`
-
-**Success Metric:** βœ… justfile created with 20+ commands including rebuild-pre hook
-
-**Status:** Fully operational. Added automatic secrets sync via rebuild-pre hook.
- ---- - -### Day 4: Documentation βœ… COMPLETED | Core/Optional Planning ❌ NOT STARTED - -**Documentation (Day 4):** -- [x] Update PROJECT-OVERVIEW.md (October 29, 2025) -- [x] Create docs/ARCHITECTURE.md (500+ lines comprehensive guide) -- [x] Update QUICK-WINS.md with status tracking -- [x] Update ROADMAP-VISUAL.md with current progress -- [x] Consolidated secrets documentation into SECRETS.md - -**Success Metric:** βœ… Documentation comprehensive and accurate - -**Core/Optional Planning (Day 4 - alternative track):** -- [ ] Create `MIGRATION-PLAN.md` (from QUICK-WINS.md template) -- [ ] List all current system modules -- [ ] Mark each as [CORE] or [OPTIONAL] -- [ ] List all current home modules -- [ ] Mark each as [CORE] or [OPTIONAL] -- [ ] Review with fresh eyes - is this truly core? -- [ ] Schedule Day 5 for migration (2-3 hour block) - -**Status:** Documentation completed instead. Core/Optional planning not started yet. - ---- - -### Day 5: Core/Optional Migration ❌ NOT STARTED - -- [ ] Create directories: - ```bash - mkdir -p modules/system/{core,optional,users} - mkdir -p modules/home/{core,optional} - ``` -- [ ] Move system/base/* to system/core/ -- [ ] Move everything else to system/optional/ -- [ ] Create `modules/system/core/default.nix` with imports -- [ ] Move home/programs core files to home/core/ -- [ ] Move everything else to home/optional/ -- [ ] Update `systems/orion/default.nix` imports -- [ ] Update `systems/cortex/default.nix` imports -- [ ] Test rebuild Orion: `just rebuild-orion` -- [ ] Test deploy Cortex: `just deploy-cortex` -- [ ] Verify no regressions (check systemctl status) -- [ ] Commit: `git add -A && git commit -m "refactor: adopt core/optional architecture"` - -**Success Metric:** βœ… Both systems rebuild successfully with new structure - -**Status:** Blocked by Day 4 planning. Current structure is flat (base/hardware/services/programs). -**Priority:** P2 (after backups). Estimated 4 hours (1hr audit + 3hrs migration). 
-**Impact:** Enables scaling to 10+ systems, clearer module organization. - ---- - -### Day 6-7: Backup Setup ❌ NOT STARTED (HIGH PRIORITY) - -#### Day 6: Manual Backup Test - -- [ ] SSH into Synology: `ssh admin@synology.local` -- [ ] Create borg user on Synology -- [ ] Create backup directories on Synology -- [ ] Initialize Borg repo: `borg init --encryption=repokey-blake2 borg@synology.local:/volume1/backups/orion` -- [ ] Create test backup of ~/Documents -- [ ] Verify backup: `borg list borg@synology.local:/volume1/backups/orion` -- [ ] Delete test backup if successful -- [ ] Document Borg passphrase in password manager - -**Success Metric:** βœ… Manual backup to Synology works - -#### Day 7: Automated Backup Module - -- [ ] Create `modules/system/optional/services/backup.nix` (from IMPLEMENTATION-GUIDE.md) -- [ ] Add borg-passphrase to secrets.yaml -- [ ] Enable backup on Orion in default.nix -- [ ] Rebuild Orion: `just rebuild-orion` -- [ ] Verify service: `systemctl status borgbackup-job-synology.service` -- [ ] Manually trigger: `systemctl start borgbackup-job-synology.service` -- [ ] Check backup: `borg list borg@synology.local:/volume1/backups/orion` -- [ ] Enable backup on Cortex in default.nix -- [ ] Deploy to Cortex: `just deploy-cortex` -- [ ] Verify Cortex backup service - -**Success Metric:** βœ… Automated daily backups configured on both systems - -**Status:** NO AUTOMATED BACKUPS - Critical data protection gap! -**Priority:** P1 (HIGHEST). Estimated 2 hours total. -**Impact:** Currently no data protection for Orion or Cortex configurations. -**Resources:** Synology DS-920+ available but unused. See IMPLEMENTATION-GUIDE.md for backup.nix module. 
- ---- - -## 🟑 Week 2-4: High Priority Enhancements - -### Week 2: Documentation & Secrets βœ… COMPLETED - -- [x] Review `COMPARISON-ANALYSIS.md` fully -- [x] Review `QUICK-WINS.md` fully -- [x] Update PROJECT-OVERVIEW.md with new architecture -- [x] Create `.sops.yaml` with creation rules (see COMPARISON-ANALYSIS.md) -- [x] Add `just rekey` command to justfile -- [x] Documented complete secrets workflow in SECRETS.md -- [x] Test secret rekeying - -**Success Metric:** βœ… Secrets management workflow documented and operational - -**Status:** Complete with automatic sync via rebuild-pre hook. See SECRETS.md. - ---- - -### Week 3: Complete Cortex Provisioning βœ… COMPLETED - -- [x] Install NVIDIA drivers (open kernel modules for Blackwell) -- [x] Install CUDA toolkit (with uvm_disable_hmm=1 workaround) -- [x] Test RTX 5090 functionality (32GB VRAM accessible) -- [x] Install LLM frameworks (Ollama with 6 models) -- [x] Test GPU-accelerated LLM inference (working) -- [x] Document Cortex-specific setup (modules/system/ai-services/) -- [x] Create comprehensive AI services module - -**Success Metric:** βœ… Cortex fully operational with RTX 5090 + Ollama - -**Status:** Complete. Models loaded: llama3.2:3b, qwen2.5:7b, deepseek-r1:14b, qwen2.5-coder:32b, command-r:35b, mixtral:8x7b. -**Note:** Open WebUI temporarily disabled due to ctranslate2 build issues on NixOS unstable. - ---- - -### Week 4: YubiKey Integration ❌ NOT STARTED (Optional) - -- [ ] Order YubiKey if not already owned -- [ ] Study EmergentMind's `yubikey.nix` -- [ ] Create `modules/system/optional/yubikey.nix` -- [ ] Configure PAM for U2F -- [ ] Register YubiKey on Orion -- [ ] Test: sudo with YubiKey touch -- [ ] Test: SSH with YubiKey -- [ ] Register YubiKey on Cortex -- [ ] Document YubiKey setup process - -**Success Metric:** βœ… Touch-based sudo working on both systems - -**Status:** Optional security enhancement, not yet implemented. -**Priority:** Low (enhancement, not critical). 
-**Impact:** Adds physical 2FA for sudo and SSH. - ---- - -## 🟒 Month 2-3: Nice to Have - -### Month 2-3: Medium Priority Improvements - -**Custom Library:** -- [ ] Create `lib/` directory structure -- [ ] Implement host-specific helpers -- [ ] Add common configuration functions -- [ ] Update modules to use custom lib -- [ ] Document library functions - -**Testing & Validation:** -- [ ] Set up VM testing environment -- [ ] Test impermanence in disposable VM -- [ ] Create automated deployment tests -- [ ] Implement rollback procedures -- [ ] Document testing workflow - -**Stable Channel Integration (from m3tam3re):** -- [ ] Add nixpkgs-stable input to flake -- [ ] Create stable-packages overlay -- [ ] Document which services use stable vs unstable -- [ ] Plan to pin production services (Jellyfin, Frigate) to stable -- [ ] Test stable channel integration - ---- - -## πŸ“Š Progress Tracking - -### Week 1 Status (Updated: October 29, 2025) - -| Day | Task | Status | Notes | -|-----|------|--------|-------| -| 1 | Deployment Safety | βœ… | Scripts created & tested | -| 2 | Deployment Safety | βœ… | Integrated with justfile | -| 3 | Just Automation | βœ… | justfile operational with 20+ commands | -| 4 | Documentation | βœ… | PROJECT-OVERVIEW, ARCHITECTURE comprehensive | -| 4 | Core/Optional Plan | ❌ | Not started (alternative to docs) | -| 5 | Core/Optional Migration | ❌ | Blocked by Day 4 planning | -| 6 | Manual Backup Test | ❌ | HIGH PRIORITY - No backups! | -| 7 | Automated Backups | ❌ | HIGH PRIORITY - No backups! | - -**Legend:** ⬜ Not Started | 🟨 In Progress | βœ… Complete | ❌ Not Started/Blocked - -**Progress:** 4/7 days completed (~57%). Focus pivoted to documentation (Day 4) instead of Core/Optional planning. 
- -### Blockers & Issues - -- [ ] Issue 1: _______________________________ - - Impact: _______________________________ - - Resolution: _______________________________ - -- [ ] Issue 2: _______________________________ - - Impact: _______________________________ - - Resolution: _______________________________ - ---- - -## 🎯 Success Metrics Summary - -### Week 1 Goals (Updated October 29, 2025) -- βœ… Zero deployment failures in 5+ attempts -- βœ… All `just` commands working -- ❌ Clear core/optional separation (not started) -- ❌ Automated backups running daily (HIGH PRIORITY) - -**Progress:** 50% (2/4 goals). Backups are critical gap! - -### Month 1 Goals -- βœ… Complete Cortex provisioning (GPU, CUDA, LLMs) -- ❌ YubiKey integration (optional, not started) -- βœ… Updated documentation (comprehensive) -- βœ… Secrets automation with Just (rebuild-pre hook) - -**Progress:** 75% (3/4 goals). Documentation exceeds expectations. - -### Month 3 Goals -- ❌ Proxmox server operational -- ❌ Testing infrastructure (pre-commit) -- ❌ Offsite backups (optional) -- ❌ Monitoring stack (optional) - -**Progress:** 0% (0/4 goals). Not yet reached Month 3 phase. 
- ---- - -## πŸ“ Notes & Learnings - -### What Went Well - - -### What Could Be Improved - - -### Ideas for Future - - ---- - -## πŸ”— Quick Reference Links - -- [COMPARISON-ANALYSIS.md](./COMPARISON-ANALYSIS.md) - Detailed analysis & recommendations -- [IMPLEMENTATION-GUIDE.md](IMPLEMENTATION-GUIDE.md) - Implementation guides for Week 1 -- [PROJECT-OVERVIEW.md](PROJECT-OVERVIEW.md) - Your current architecture -- [FLEET-MANAGEMENT.md](./FLEET-MANAGEMENT.md) - Deployment workflows -- [SECRETS.md](./SECRETS.md) - Complete secrets management guide -- [EmergentMind's Config](https://github.com/EmergentMind/nix-config) - Reference implementation -- [EmergentMind's Anatomy Article](https://unmovedcentre.com/posts/anatomy-of-a-nixos-config/) - Core concepts - ---- - -**Last Updated:** ___________ -**Next Review:** ___________ - -**Remember:** Focus on P0 (Critical) items first. Don't try to do everything at once! diff --git a/docs/planning/GIT-HISTORY-ANALYSIS.md b/docs/planning/GIT-HISTORY-ANALYSIS.md new file mode 100644 index 0000000..2277c3a --- /dev/null +++ b/docs/planning/GIT-HISTORY-ANALYSIS.md @@ -0,0 +1,317 @@ +# Git History Analysis + +Generated: 2026-01-22 + +## Executive Summary + +This NixOS configuration repository spans **617 commits** over **14 months** (November 2024 - January 2026), evolving from a single-system setup to a **4-system fleet** with sophisticated infrastructure including flake-parts, deploy-rs, sops-nix secrets, and unified feature modules. 
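+
+The headline numbers above can be re-derived from the repository itself at any time. A sketch of the likely commands — the exact flags are assumptions for illustration, not the tooling the original analysis necessarily used:
+
+```bash
+# Total commits reachable from the current branch
+git rev-list --count HEAD
+
+# Commits per author, descending (the "By Author" table)
+git shortlog -sn --all
+
+# Commits per month (the "By Month" table)
+git log --all --pretty=format:'%ad' --date=format:'%Y-%m' | sort | uniq -c | sort -rn
+
+# Most-changed files (the "Most Changed Files" table)
+git log --all --name-only --pretty=format: | sort | grep -v '^$' | uniq -c | sort -rn | head
+```
+
+Author counts depend on `.mailmap` normalization, so re-running these may group the sygint/Sygint/syg/Syg identities differently than the table above.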
+ +--- + +## Repository Timeline + +| Date | Milestone | +|------|-----------| +| 2024-11-24 | Initial commit - basic NixOS configuration | +| 2025-08-31 | Renamed from "nixos" to **Orion** (primary workstation) | +| 2025-10-11 | Renamed "AIDA" to **Cortex** (AI/ML server) | +| 2025-11-09 | Renamed "HTPC" to **Axon** (home theater PC) | +| 2025-11-14 | Added **Nexus** (homelab server) | +| 2026-01-19 | Created `v1.0-pre-flake-parts` tag | +| 2026-01-20 | Migrated to **flake-parts** architecture | +| 2026-01-22 | Completed Phase 1-6 repository cleanup | + +--- + +## Commit Statistics + +### Total Commits: 617 + +### By Author +| Author | Commits | % | +|--------|---------|---| +| sygint | 379 | 61.4% | +| Sygint | 117 | 19.0% | +| syg | 107 | 17.3% | +| installer | 7 | 1.1% | +| Syg | 5 | 0.8% | +| copilot-swe-agent[bot] | 2 | 0.3% | + +*Note: sygint/Sygint/syg/Syg are likely the same user with different git configs* + +### By Month +| Month | Commits | +|-------|---------| +| Nov 2025 | 102 | +| Oct 2025 | 102 | +| Sep 2025 | 75 | +| Jan 2026 | 57 | +| Mar 2025 | 48 | +| Jun 2025 | 43 | +| Dec 2024 | 39 | +| Dec 2025 | 31 | +| Jan 2025 | 30 | +| Apr 2025 | 27 | +| May 2025 | 25 | +| Jul 2025 | 18 | +| Aug 2025 | 8 | +| Feb 2025 | 6 | +| Nov 2024 | 6 | + +### Busiest Days +| Date | Commits | Notes | +|------|---------|-------| +| 2025-11-02 | 40 | Major refactoring day | +| 2025-10-10 | 28 | Cortex development | +| 2025-03-23 | 26 | - | +| 2026-01-22 | 25 | Repository cleanup | +| 2025-07-06 | 16 | - | + +--- + +## System Evolution + +### Orion (Primary Workstation) +- **Created**: 2025-08-31 (renamed from generic "nixos") +- **Purpose**: Main development workstation with Hyprland desktop +- **Key commit**: `e6ee5e4` - "Change hostname from nixos to orion" +- **Most changed file**: `systems/orion/default.nix` (66 changes) + +### Cortex (AI/ML Server) +- **Created**: 2025-10-11 (renamed from "AIDA") +- **Purpose**: AI services, CUDA workloads +- **Key commit**: 
`3718de5` - "refactor: rename AIDA system to Cortex" +- **Changes**: 18 commits to `systems/cortex/default.nix` + +### Axon (Home Theater PC) +- **Created**: 2025-11-09 (renamed from "HTPC") +- **Purpose**: Media center, Kodi, Jellyfin +- **Key commit**: `2bf5a32` - "Migrate HTPC to axon" +- **Recent**: ZSH migration completed 2026-01-22 + +### Nexus (Homelab Server) +- **Created**: 2025-11-14 +- **Purpose**: Homelab services (Leantime, etc.) +- **Key commit**: `96b7ae8` - "feat(nexus): add complete homelab server" +- **Changes**: 14 commits to `systems/nexus/default.nix` + +--- + +## Major Refactoring Events + +### 1. Flake-Parts Migration (January 2026) +- **Tag**: `v1.0-pre-flake-parts` +- **Commits since migration**: 25 +- **Key commits**: + - `ed7b8df` - "refactor: migrate to flake-parts for modular flake composition (Phase 1)" + - `f2fe9ba` - "feat: migrate to unified feature modules (Phase 2)" + - `a73a823` - "refactor: migrate batch 2 modules to unified features - PHASE 2 COMPLETE" + +### 2. Dendritic Module Architecture (January 2026) +- Converted modules to auto-import pattern +- Created unified feature modules in `modules/features/` +- **Key commits**: + - `e00a0b5` - "feat(dendritic): add import-tree for automatic module imports" + - `28247a2` - "feat(dendritic): convert system modules to auto-import" + - `61ecb90` - "feat(dendritic): convert home modules to auto-import" + +### 3. Security Infrastructure (Various) +- sops-nix secrets management +- git-secrets and TruffleHog integration +- fail2ban and auditd hardening +- **Key commits**: + - `beef418` - "security: migrate Syncthing password to sops-nix secrets" + - `db0b07c` - "feat(security): integrate git-secrets and TruffleHog" + - `6dfe0c9` - "Add security hardening module with fail2ban and auditd" + +### 4. 
Repository Cleanup (January 2026) +Phases 1-6 completed: +- `ad1123a` - Phase 1: Network config consolidation +- `d7e76fc` - Phase 2: Documentation cleanup +- `bd004d4` - Phase 3: Temp file cleanup +- `f1e70f3` - Phase 4: fleet-config centralization +- `a728034` - Phase 5: Unused module deletion +- `6d505c6` - Phase 6: Kanboard script removal + +--- + +## Branch Analysis + +### Active Branches +| Branch | Last Commit | Status | +|--------|-------------|--------| +| `main` | 2026-01-22 | Primary branch, 6 commits ahead of origin | +| `feature/axon-zsh-migration` | 2026-01-22 | Likely merged | +| `feature/unified-zsh-module` | 2026-01-22 | Likely merged | +| `feature/phase-1-flake-parts` | 2026-01-20 | Likely merged | + +### Stale/Backup Branches +| Branch | Last Commit | Recommendation | +|--------|-------------|----------------| +| `backup/pre-flake-parts-migration` | 2026-01-19 | Keep as backup reference | +| `docs/flake-parts-dendritic-playbook` | 2026-01-19 | Can delete if merged | +| `origin/copilot/sub-pr-20` | 2026-01-22 | Review and delete | +| `origin/feature/add-gh-cli` | 2026-01-22 | Review and delete | + +### Tags +| Tag | Description | +|-----|-------------| +| `v1.0-pre-flake-parts` | State before flake-parts migration | +| `pre-cleanup-2026-01-22` | State before Phase 1-6 cleanup | + +--- + +## Most Changed Files + +| File | Changes | Notes | +|------|---------|-------| +| `systems/orion/default.nix` | 66 | Main system config | +| `flake.nix` | 65 | Flake root | +| `systems/nixos/default.nix` | 63 | Legacy (renamed to orion) | +| `flake.lock` | 41 | Dependency updates | +| `modules/home/programs/hyprland.nix` | 40 | Desktop config | +| `systems/orion/homes/syg.nix` | 40 | User home config | +| `hosts/nixos/home.nix` | 40 | Legacy home config | +| `home/syg.nix` | 40 | Legacy structure | + +--- + +## Large Files Issue + +### Current State +Two VM disk images exist in the repository root: +- `nexus.qcow2` - 751 MB (added ~2025-11-22) +- `orion.qcow2` - 
688 MB (added ~2025-12-27) + +**Total**: 1.4 GB of binary data + +### Impact +- These files are NOT tracked in git history (likely in `.gitignore`) +- They exist in the working directory only +- Should be stored externally (NAS, cloud storage) + +### Recommendation +Delete from repository and document proper storage location. + +--- + +## Commit Message Patterns + +### Conventional Commits Adoption +The repository shows evolution toward conventional commits: + +**Early style** (2024-2025): +- "Add new feature" +- "Fix bug" +- "Update config" + +**Current style** (late 2025-2026): +- `feat:` - New features +- `fix:` - Bug fixes +- `refactor:` - Code restructuring +- `docs:` - Documentation +- `chore:` - Maintenance tasks +- `security:` - Security improvements + +### Scoped Commits +System-specific scopes emerged: +- `feat(nexus):` - Nexus-specific changes +- `feat(orion):` - Orion-specific changes +- `fix(dev):` - Development environment fixes +- `feat(dendritic):` - Architecture changes + +--- + +## Key Insights + +### 1. Rapid Growth Period +October-November 2025 saw 204 commits (33% of total), indicating major feature development and the addition of Cortex, Axon, and Nexus systems. + +### 2. Architecture Maturation +The migration to flake-parts and dendritic modules in January 2026 represents a significant maturation of the codebase, moving from ad-hoc configuration to a structured, maintainable architecture. + +### 3. Security Focus +Multiple commits dedicated to security infrastructure (sops-nix, fail2ban, secret scanning) show a conscious effort to secure the fleet. + +### 4. Cleanup Debt +The need for 6+ phases of cleanup in January 2026 indicates accumulated technical debt from rapid development. Future recommendation: regular maintenance sprints. + +### 5. 
Naming Evolution
+Systems went through naming iterations:
+- nixos β†’ Orion
+- AIDA β†’ Cortex
+- HTPC β†’ Axon
+
+This suggests the fleet naming convention ("constellation/neural network" theme) was established mid-project.
+
+---
+
+## Recommendations
+
+### Immediate (Phase 7) - PARTIALLY COMPLETED
+1. ~~**Delete VM images** - Remove `*.qcow2` files (1.4 GB savings)~~ DONE
+2. **Clean stale branches** - Merge or delete feature branches (still pending)
+3. ~~**Fix deploy.nix** - Use fleet-config for all IPs~~ DONE
+
+### Short-term
+1. **Standardize git identity** - Consolidate author names
+2. **Add branch protection** - Prevent direct pushes to main
+3. **Automate cleanup** - Pre-commit hooks for file size limits
+
+### Long-term
+1. **Regular maintenance** - Monthly cleanup sprints
+2. **Version tagging** - Tag stable states before major changes
+3. **Documentation sync** - Keep docs updated with each major change
+
+---
+
+## TODO: Remote Repository Size Optimization
+
+**Date Added**: 2026-01-22
+**Status**: Pending further analysis
+
+### Background
+
+After local cleanup, we discovered the `.git` directory was **6.3 GB** due to orphaned blob objects (likely VM images that were accidentally committed and then removed). Running `git gc --prune=now --aggressive` reduced local `.git` to **12 MB**.
+
+### Orphaned Objects Found
+
+| Object Hash | Size | Likely Source |
+|-------------|------|---------------|
+| `882b7d70...` | 1.73 GB | VM disk image (nexus.qcow2?) |
+| `8951683...` | 944 MB | VM disk image (orion.qcow2?) |
+
+These were **unreachable blobs** - not connected to any commit in history, but still stored in packfiles.
+
+### Local Fix Applied
+
+```bash
+git reflog expire --expire=now --all
+git gc --prune=now --aggressive
+```
+
+Result: `.git` reduced from 6.3 GB to 12 MB locally.
+
+### Remote Considerations
+
+The GitHub remote may still have bloated packfiles. Options to investigate:
+
+1. **Do nothing** - GitHub runs periodic gc, may clean up automatically
+2. **Force push** - May trigger GitHub's repack process
+3. **Contact GitHub support** - Request manual gc if clone size remains large
+4. **Fresh repository** - Nuclear option: create new repo, push clean history
+
+### Before Taking Action
+
+1. Check current clone size: `git clone --bare <repo-url> size-check.git && du -sh size-check.git`
+2. Compare with local size to confirm remote bloat
+3. Review if any stale branches reference the large objects
+4. Consider if history rewriting is needed (would require force push)
+
+### Risk Assessment
+
+- Force push to main requires coordination (branch protection)
+- History rewriting affects all collaborators
+- May lose references needed for debugging old issues
+
+**Recommendation**: Check remote clone size first. If acceptable, no action needed.
diff --git a/docs/planning/TODO-NIXOS-FLEET-MIGRATION.md b/docs/planning/TODO-NIXOS-FLEET-MIGRATION.md
new file mode 100644
index 0000000..3f97fd5
--- /dev/null
+++ b/docs/planning/TODO-NIXOS-FLEET-MIGRATION.md
@@ -0,0 +1,347 @@
+# TODO: Migrate to nixos-fleet
+
+**Date Created**: 2026-01-22
+**Status**: Not Started
+**Priority**: Medium
+**Estimated Effort**: 2-4 hours
+
+---
+
+## Overview
+
+This repository should be using **nixos-fleet** (`~/Projects/open-source/nixos-fleet`) for deployment, but is currently using **deploy-rs** directly. nixos-fleet provides a unified CLI and uses Colmena under the hood.
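+
+In day-to-day terms, the switch changes the deployment commands roughly like this (the `fleet` invocations follow the CLI summary later in this document; treat them as the intended shape of the workflow, not verified output):
+
+```bash
+# Today: deploy-rs invoked directly, one host at a time
+nix run github:serokell/deploy-rs -- .#cortex
+
+# After migration: unified fleet CLI (Colmena underneath)
+fleet push cortex            # single host
+fleet push --tag server      # all hosts tagged "server", in parallel
+```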
+ +**Why migrate:** +- βœ… Unified `fleet` CLI instead of custom bash scripts +- βœ… Colmena-powered parallel deployments +- βœ… Tag-based targeting (`fleet push --tag servers`) +- βœ… Better secrets management integration +- βœ… ISO generation support +- βœ… Battle-tested patterns from your own project + +--- + +## Current State + +### What We Have Now +- **Deployment**: `deploy-rs` via `flake-modules/deploy.nix` +- **Fleet Management**: Custom `scripts/deployment/fleet.sh` (bash wrapper) +- **Fleet Config**: `fleet-config.nix` (good - compatible with nixos-fleet) +- **Systems**: Orion, Cortex, Nexus, Axon + +### What's Missing +- nixos-fleet not added as flake input +- No Colmena configuration +- No `fleet` CLI available +- Custom fleet.sh lacks secrets management + +--- + +## nixos-fleet Features + +From `~/Projects/open-source/nixos-fleet/README.md`: + +### CLI Commands +```bash +fleet push # Deploy updates (Colmena) +fleet install # Fresh install (nixos-anywhere) +fleet check # Health check +fleet iso # Generate installer ISO +fleet rekey # Rotate secrets +fleet --tag servers push # Deploy to tagged hosts +``` + +### Library Functions +```nix +nixos-fleet.lib.mkFleet # Main builder +nixos-fleet.lib.generateHosts +nixos-fleet.lib.generateColmena +``` + +### Modules +```nix +nixosModules.fleet-hosts # Auto-generate /etc/hosts +nixosModules.fleet-deploy-user # Setup deploy user +``` + +--- + +## Migration Steps + +### Phase 1: Add nixos-fleet Input (15 min) + +1. **Add flake input** to `flake.nix`: + ```nix + inputs = { + # ... existing inputs + nixos-fleet.url = "github:sygint/nixos-fleet"; + # Or local dev: nixos-fleet.url = "path:/home/syg/Projects/open-source/nixos-fleet"; + }; + ``` + +2. **Pass to flake-parts**: + ```nix + outputs = inputs@{ flake-parts, ... }: + flake-parts.lib.mkFlake { inherit inputs; } { + # ... + }; + ``` + +3. 
**Test**: `nix flake lock --update-input nixos-fleet` + +--- + +### Phase 2: Migrate to mkFleet (1-2 hours) + +#### Option A: Full Migration (Recommended) + +Replace entire flake structure with `nixos-fleet.lib.mkFleet`: + +**Before** (`flake.nix`): +```nix +outputs = inputs@{ flake-parts, ... }: + flake-parts.lib.mkFlake { inherit inputs; } { + imports = [ + ./flake-modules/nixos-configurations.nix + ./flake-modules/home-configurations.nix + ./flake-modules/deploy.nix + ]; + }; +``` + +**After**: +```nix +outputs = { nixpkgs, nixos-fleet, home-manager, ... }@inputs: + let + fleetConfig = import ./fleet-config.nix; + in + nixos-fleet.lib.mkFleet { + inherit inputs; + + fleet = { + network = fleetConfig.network; + hosts = fleetConfig.hosts; + }; + + # Per-host configurations + hostConfigurations = { + orion = ./systems/orion; + cortex = ./systems/cortex; + nexus = ./systems/nexus; + axon = ./systems/axon; + }; + }; +``` + +#### Option B: Gradual Migration + +Keep flake-parts but use nixos-fleet utilities: + +```nix +outputs = inputs@{ flake-parts, nixos-fleet, ... }: + let + fleetLib = nixos-fleet.lib; + fleetConfig = import ./fleet-config.nix; + in + flake-parts.lib.mkFlake { inherit inputs; } { + imports = [ + ./flake-modules/nixos-configurations.nix + ./flake-modules/home-configurations.nix + ]; + + # Replace deploy.nix with Colmena + flake.colmena = fleetLib.generateColmena { + inherit inputs; + fleet = { + network = fleetConfig.network; + hosts = fleetConfig.hosts; + }; + }; + }; +``` + +--- + +### Phase 3: Install Fleet CLI (5 min) + +Add to system packages or dev shell: + +```nix +# In systems/orion/default.nix or flake devShell +{ + environment.systemPackages = [ + inputs.nixos-fleet.packages.${system}.fleet + ]; +} +``` + +Or use directly: +```bash +nix run ~/Projects/open-source/nixos-fleet#fleet -- push cortex +``` + +--- + +### Phase 4: Update Scripts & Docs (30 min) + +1. 
**Update justfile** to use `fleet` CLI: + ```justfile + # Deploy to specific host + deploy-cortex: + fleet push cortex + + # Deploy to all servers + deploy-servers: + fleet push --tag server + ``` + +2. **Archive old fleet.sh**: + ```bash + mv scripts/deployment/fleet.sh scripts/deployment/archive/fleet.sh.old + ``` + +3. **Update documentation**: + - Update `docs/BOOTSTRAP.md` to reference `fleet install` + - Create `docs/FLEET-MANAGEMENT.md` with nixos-fleet commands + - Update README with new deployment workflow + +--- + +### Phase 5: Enable Fleet Modules (15 min) + +Add nixos-fleet modules to systems: + +```nix +# In flake-modules/nixos-configurations.nix +{ + imports = [ + inputs.nixos-fleet.nixosModules.fleet-hosts # Auto /etc/hosts + inputs.nixos-fleet.nixosModules.fleet-deploy-user # Deploy user setup + ]; +} +``` + +--- + +## Compatibility Check + +### What Works Already βœ… +- βœ… `fleet-config.nix` structure is compatible +- βœ… System configurations in `systems/*/` work as-is +- βœ… Home-manager configurations compatible +- βœ… sops-nix secrets work with nixos-fleet + +### What Needs Adjustment ⚠️ +- ⚠️ Custom `deploy.nix` β†’ Colmena configuration +- ⚠️ `scripts/deployment/fleet.sh` β†’ `fleet` CLI +- ⚠️ deploy-rs usage β†’ Colmena usage + +--- + +## Testing Strategy + +### 1. Test on Non-Critical Host First +Start with Axon (media center - least critical): + +```bash +# Build configuration +nix build .#nixosConfigurations.axon.config.system.build.toplevel + +# Test deploy (dry-run if available) +fleet check axon + +# Deploy +fleet push axon +``` + +### 2. Verify Health Checks +```bash +fleet check axon +ssh axon 'systemctl status' +``` + +### 3. Roll Out to Other Hosts +- Axon (done) β†’ Nexus (homelab) β†’ Cortex (AI) β†’ Orion (workstation) + +--- + +## Rollback Plan + +If migration fails: + +1. **Revert flake.nix**: + ```bash + git checkout HEAD~1 flake.nix + nix flake lock + ``` + +2. 
**Use old deploy-rs**: + ```bash + nix run github:serokell/deploy-rs -- .# + ``` + +3. **Restore fleet.sh**: + ```bash + git restore scripts/deployment/fleet.sh + ``` + +--- + +## Open Questions + +- [ ] Should we use Option A (full migration) or Option B (gradual)? +- [ ] Do we want local dev path or GitHub URL for nixos-fleet input? +- [ ] Should we add tags to hosts in fleet-config.nix? + ```nix + hosts.cortex.tags = [ "server" "ai" "remote" ]; + hosts.nexus.tags = [ "server" "homelab" "remote" ]; + hosts.axon.tags = [ "media" "local" ]; + hosts.orion.tags = [ "workstation" "local" ]; + ``` +- [ ] Do we want to use nixos-fleet's ISO generation for liveiso? + +--- + +## Benefits After Migration + +1. **Simplified Deployment**: `fleet push cortex` instead of `nix run github:serokell/deploy-rs -- .#cortex` +2. **Parallel Deploys**: `fleet push --tag server` deploys Cortex + Nexus simultaneously +3. **Better Tooling**: Built-in health checks, secrets management +4. **Dogfooding**: Use your own project in production +5. **Unified Workflow**: Same commands across all operations + +--- + +## Next Steps + +1. **Immediate** (today): + - [ ] Decide on Option A vs Option B + - [ ] Add nixos-fleet as flake input + - [ ] Test `nix build` still works + +2. **This Week**: + - [ ] Implement chosen migration option + - [ ] Test deploy on Axon + - [ ] Update documentation + +3. **This Month**: + - [ ] Migrate all hosts + - [ ] Archive old fleet.sh + - [ ] Add tags to fleet-config.nix + - [ ] Create blog post about the migration + +--- + +## References + +- **nixos-fleet**: `~/Projects/open-source/nixos-fleet` +- **nixos-fleet README**: `~/Projects/open-source/nixos-fleet/README.md` +- **Current fleet-config**: `fleet-config.nix` +- **Current deploy config**: `flake-modules/deploy.nix` +- **Colmena docs**: https://colmena.cli.rs/ + +--- + +## Updates + +**2026-01-22**: Initial migration plan created. Currently using deploy-rs directly. 
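+
+---
+
+One extra safeguard worth folding into the testing strategy above: before pushing anything, verify that the migrated flake still builds the same system closure as the deploy-rs era flake. This is a sketch, assuming host names and flake outputs are unchanged by the migration:
+
+```bash
+# Build the pre-migration closure for the least critical host
+nix build .#nixosConfigurations.axon.config.system.build.toplevel -o result-before
+
+# ...check out the mkFleet/Colmena branch, then build again
+nix build .#nixosConfigurations.axon.config.system.build.toplevel -o result-after
+
+# Identical store paths mean the migration changed tooling only, not the system
+[ "$(readlink result-before)" = "$(readlink result-after)" ] && echo "closures identical"
+```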
diff --git a/docs/planning/TODO-WAYLAND-ALTERNATIVES.md b/docs/planning/TODO-WAYLAND-ALTERNATIVES.md new file mode 100644 index 0000000..d2b464e --- /dev/null +++ b/docs/planning/TODO-WAYLAND-ALTERNATIVES.md @@ -0,0 +1,269 @@ +# TODO: Explore Alternative Wayland Compositors and Shells + +**Date Created**: 2026-01-22 +**Status**: Research Phase +**Priority**: Low (Exploratory) + +--- + +## Overview + +Explore modern Wayland compositors and shell frameworks as potential alternatives or complements to the current Hyprland setup. Focus on innovative UI/UX approaches and unique features. + +--- + +## Projects to Evaluate + +### 1. Niri +**Repository**: https://github.com/YaLTeR/niri +**Type**: Scrollable-tiling Wayland compositor + +#### Key Features +- Scrollable tiling: windows arranged in columns that can be scrolled infinitely +- Written in Rust +- Inspired by PaperWM (GNOME Shell extension) +- Dynamic workspaces +- Built on Smithay (Wayland compositor library) + +#### Why Interesting +- Novel window management paradigm (horizontal scrolling instead of traditional tiling) +- Modern Rust codebase +- Might suit ultra-wide displays well +- Different mental model from Hyprland's traditional tiling + +#### Questions to Answer +- [ ] How does scrollable tiling compare to traditional tiling for productivity? +- [ ] Does it work well with multi-monitor setups? +- [ ] What's the learning curve compared to Hyprland? +- [ ] Is it stable enough for daily use? +- [ ] Does it have feature parity with Hyprland (animations, blur, etc.)? + +#### NixOS Integration +- [ ] Check nixpkgs for `niri` package availability +- [ ] Look for existing NixOS modules or flakes +- [ ] Assess configuration approach (config files vs NixOS options) + +--- + +### 2. 
Noctalia Shell +**Repository**: (Need to verify exact project) +**Type**: Desktop shell/environment + +#### Research Needed +- [ ] Find official repository/documentation +- [ ] Determine project status (active/abandoned) +- [ ] Identify key differentiators +- [ ] Check if it's a full compositor or just a shell layer +- [ ] Verify Wayland compatibility + +#### Questions +- [ ] Is this a complete desktop environment or modular shell? +- [ ] What compositor does it use (or is it its own)? +- [ ] Target audience/use case? + +--- + +### 3. QuickShell +**Repository**: https://git.outfoxxed.me/outfoxxed/quickshell +**Type**: QtQuick-based Wayland shell toolkit + +#### Key Features +- Uses QtQuick/QML for shell components +- Modular approach to building custom shells +- Declarative UI with QML +- Hot-reloadable configurations +- Compositor-agnostic (works with any wlroots-based compositor) + +#### Why Interesting +- Similar concept to Ags (Aylur's GTK Shell) but with Qt instead of GTK +- QML provides powerful UI capabilities +- Could create custom bars, panels, widgets with familiar tech stack +- Hot-reload during development + +#### Questions to Answer +- [ ] How mature is the project? +- [ ] Performance compared to Waybar/Ags? +- [ ] What's the learning curve for QML? +- [ ] Can it integrate with existing Hyprland setup? +- [ ] Does it have good documentation/examples? + +#### Use Cases +- Custom status bars with rich widgets +- Dynamic notification systems +- App launchers with custom UIs +- System monitors with data visualization + +#### NixOS Integration +- [ ] Check if packaged in nixpkgs +- [ ] Look for community flakes +- [ ] Assess if it can be packaged easily +- [ ] Test alongside current Waybar setup + +--- + +### 4. Mangowc (Mango Wayland Compositor) +**Repository**: (Need to verify - possibly MangoHud related or separate project) +**Type**: Unclear - needs research + +#### Research Needed +- [ ] Verify this is a real project (might be confused with MangoHud?) 
+- [ ] Find official repository +- [ ] Determine scope and goals +- [ ] Check project status and activity +- [ ] Identify if related to MangoHud (performance overlay tool) + +#### Possibilities +- Could be a performance-focused compositor? +- Might be a gaming-oriented Wayland compositor? +- May have special integration with MangoHud overlay? + +**Note**: This needs immediate clarification - searching for "mangowc" yields unclear results. Might need to: +- Check if this was a typo or misremembered name +- Look for similar-sounding projects +- Verify in Wayland compositor listings + +--- + +## Evaluation Criteria + +When testing each project, assess: + +### Technical +- [ ] **Stability**: Crash frequency, memory leaks, performance +- [ ] **Feature completeness**: Gaps compared to Hyprland +- [ ] **Performance**: Frame times, latency, resource usage +- [ ] **Multi-monitor support**: How well does it handle multiple displays? +- [ ] **HiDPI support**: Scaling on mixed-DPI setups + +### Usability +- [ ] **Configuration**: Declarative vs imperative, restart required? +- [ ] **Keybindings**: Flexibility, conflicts with existing muscle memory +- [ ] **Customization**: Theming, plugins, extensibility +- [ ] **Documentation**: Quality, completeness, examples + +### Ecosystem +- [ ] **NixOS integration**: Packaged? Module available? Flake support? +- [ ] **Community**: Active development, responsive maintainers +- [ ] **Compatibility**: Works with existing tools (Waybar, rofi, etc.) + +### Migration +- [ ] **Learning curve**: Time to productivity +- [ ] **Config migration**: Can existing configs be adapted? +- [ ] **Fallback plan**: Easy to revert to Hyprland? + +--- + +## Testing Strategy + +### Phase 1: Research (1-2 hours per project) +1. Read documentation and READMEs +2. Watch demo videos/screenshots +3. Check issue trackers for known problems +4. Review NixOS community discussions +5. 
Assess project health (last commit, release cadence) + +### Phase 2: VM Testing (2-4 hours per project) +1. Set up test VM with minimal NixOS config +2. Install compositor/shell +3. Test basic functionality +4. Document pain points and highlights +5. Screenshot/record interesting features + +### Phase 3: Evaluation (1 hour per project) +1. Score against evaluation criteria +2. Document pros/cons vs Hyprland +3. Identify potential integration opportunities +4. Decide: adopt, integrate partially, or skip + +### Phase 4: Integration (if promising) +1. Create feature module in `modules/features/` +2. Add to appropriate system (Orion for testing) +3. Run in parallel with Hyprland for comparison +4. Iterate on configuration + +--- + +## Current Stack Context + +### What We Use Now +- **Compositor**: Hyprland (feature-rich, animated, tiling) +- **Bar**: Waybar (customizable, well-integrated) +- **Launcher**: Rofi (Wayland fork) +- **Notifications**: Mako +- **Lock screen**: Hyprlock +- **Idle management**: Hypridle + +### Integration Considerations +- Can new tools work alongside Hyprland? (e.g., QuickShell replacing Waybar) +- Do we need to replace entire compositor or just components? +- Will existing keybindings/workflows transfer? +- How much config work is required? + +--- + +## Success Criteria + +A project is worth deeper integration if it: + +1. **Solves a current pain point** or provides meaningful improvement +2. **Maintains stability** (no frequent crashes or data loss) +3. **Has active development** (commits in last 3 months) +4. **Integrates with NixOS** (packaged or easily packageable) +5. 
**Offers unique value** (not just "different but equal") + +--- + +## Resources + +### General Wayland Resources +- [Awesome Wayland](https://github.com/natpen/awesome-wayland) - Curated list +- [Are We Wayland Yet?](https://arewewaylandyet.com/) - Compatibility tracker +- [Wayland Book](https://wayland-book.com/) - Protocol deep dive + +### Community +- r/Hyprland - May have comparison discussions +- r/NixOS - NixOS-specific integration help +- Various project Discord servers (check each repo) + +### Testing Infrastructure +- Use `systemd-nspawn` or VMs for isolated testing +- Keep Hyprland as default, test alternatives in secondary sessions +- Document configs in `modules/features/` even if not enabled + +--- + +## Next Steps + +1. **Immediate** (30 min): + - [ ] Verify "mangowc" is the correct project name + - [ ] Find correct URLs for all projects + - [ ] Check nixpkgs for existing packages: `nix search nixpkgs niri` and `nix search nixpkgs quickshell` (run separately - `nix search` combines multiple patterns with AND) + +2. **This Week** (2-4 hours): + - [ ] Phase 1 research for Niri (most promising based on initial interest) + - [ ] Phase 1 research for QuickShell (could complement Hyprland) + +3. **This Month** (4-8 hours): + - [ ] VM testing for 1-2 most promising options + - [ ] Document findings in this file + - [ ] Create feature module if something is production-ready + +4. **Future**: + - [ ] Consider blog post or documentation comparing options + - [ ] Share findings with NixOS community + - [ ] Contribute nixpkgs packages if needed + +--- + +## Notes + +- This is exploratory research, not a commitment to switch +- Hyprland is working well; this is about learning alternatives +- May discover tools useful for specific systems (e.g., kiosk on Nexus, minimal shell on Axon) +- QuickShell might be most immediately useful (custom widgets without changing compositor) + +--- + +## Updates + +**2026-01-22**: Initial research doc created. Need to verify project URLs and start Phase 1 research.
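
---

## Appendix: Feature Module Sketch

If Phase 4 integration goes ahead, the "create feature module in `modules/features/`" step could start from a sketch like the one below. Hedged assumptions: nixpkgs ships a `niri` package, this repo's feature modules use a `features.*` option namespace, and `services.displayManager.sessionPackages` is available on the NixOS release in use:

```nix
# modules/features/niri.nix -- hypothetical sketch, disabled by default
{ config, lib, pkgs, ... }:
let
  cfg = config.features.niri;
in
{
  options.features.niri.enable =
    lib.mkEnableOption "Niri scrollable-tiling compositor (experimental)";

  config = lib.mkIf cfg.enable {
    environment.systemPackages = [ pkgs.niri ];
    # Register Niri as a secondary session; Hyprland stays the default.
    services.displayManager.sessionPackages = [ pkgs.niri ];
  };
}
```

This keeps the experiment declarative and reversible: disable the option and rebuild to fall back to the Hyprland-only setup.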
diff --git a/docs/planning/analysis/nixos-fleet-readiness.md b/docs/planning/analysis/nixos-fleet-readiness.md new file mode 100644 index 0000000..2496cb8 --- /dev/null +++ b/docs/planning/analysis/nixos-fleet-readiness.md @@ -0,0 +1,263 @@ +# nixos-fleet Project Analysis + +**Date**: 2026-01-22 +**Location**: `~/Projects/open-source/nixos-fleet` + +--- + +## Executive Summary + +**Status**: ✅ **Production Ready (v1.0.0)** + +nixos-fleet is a **mature, fully functional** Go CLI tool that wraps Colmena for NixOS fleet management. It's ready for use in this dotfiles repo. + +--- + +## Project Health + +| Metric | Status | Notes | +|--------|--------|-------| +| **Version** | v1.0.0 | Released Dec 1, 2025 | +| **Commits** | 22 | Focused development | +| **Last Commit** | 2025-12-01 | Recent (7 weeks ago) | +| **Flake Check** | ✅ Passes | All checks green | +| **CLI** | ✅ Works | Full help output available | +| **Tests** | ✅ Present | Go tests + shell tests | +| **Template** | ✅ Complete | Default template ready | +| **Git Status** | ⚠️ Dirty | Uncommitted changes | + +--- + +## Feature Completeness + +### Core Features (All Implemented ✅) + +#### Deployment +- ✅ `fleet push <host>` - Deploy updates via Colmena +- ✅ `fleet push all` - Deploy to all hosts +- ✅ `fleet push --tag server` - Tag-based targeting +- ✅ `fleet install <host>` - Fresh install (nixos-anywhere) +- ✅ `fleet rollback <host>` - Rollback to previous generation + +#### Observability +- ✅ `fleet status` - Fleet overview +- ✅ `fleet check <host>` - Health checks +- ✅ `fleet ssh <host>` - SSH wrapper +- ✅ `fleet exec <host> -- <command>` - Remote execution + +#### Secrets Management +- ✅ `fleet secrets edit` - Edit secrets +- ✅ `fleet secrets view` - View secrets +- ✅ `fleet secrets validate` - Validate encryption + +#### ISO Management +- ✅ `fleet iso build` - Build installer ISO +- ✅ `fleet iso flash <device>` - Flash to USB +- ✅ `fleet iso list` - List devices + +#### Configuration +- ✅ `fleet config init` -
Interactive setup +- ✅ `fleet config show` - Show config +- ✅ `fleet config set/get` - Manage settings + +#### Maintenance +- ✅ `fleet update` - Update flake inputs +- ✅ `fleet gc [safe|aggressive]` - Garbage collection +- ✅ `fleet vm <host>` - VM testing + +--- + +## Technology Stack + +- **Language**: Go 1.21+ +- **CLI Framework**: Custom (fatih/color for output) +- **Deployment**: Colmena (wrapped) +- **Install**: nixos-anywhere (wrapped) +- **Secrets**: SOPS + Age +- **Testing**: Go tests + shell scripts +- **Build**: Nix flake with buildGoModule + +--- + +## Outstanding Work (ROADMAP.md) + +### High Priority (Core UX) +- [ ] **P0** - Add fleet alias to dotfiles +- [ ] **P0** - Auto-detect FLEET_FLAKE_DIR +- [ ] **P1** - `fleet logs <host>` - Remote journalctl +- [ ] **P1** - `fleet reboot <host>` - Safe reboot + +### Medium Priority (Polish) +- [ ] **P2** - Parallel status checks +- [ ] **P2** - `--dry-run` for push +- [ ] **P2** - `fleet diff <host>` +- [ ] **P2** - Tab completion +- [ ] **P2** - Config validation + +### Nice to Have +- [ ] **P3** - Wake-on-LAN integration +- [ ] **P3** - Deployment history +- [ ] **P3** - Notifications (ntfy/Discord) +- [ ] **P3** - Dashboard web UI + +--- + +## Readiness Assessment + +### For Use in Dotfiles Repo + +| Category | Ready? | Notes | +|----------|--------|-------| +| **Core Deployment** | ✅ Yes | push/install/rollback work | +| **Multi-Host** | ✅ Yes | Tag-based targeting ready | +| **Secrets** | ✅ Yes | SOPS integration complete | +| **Stability** | ✅ Yes | v1.0.0 released | +| **Documentation** | ⚠️ Medium | README good, needs more examples | +| **Testing** | ✅ Yes | Unit + integration tests | +| **Migration Path** | ✅ Yes | Template shows how to integrate | + +### Blockers: NONE ✅ + +All core features needed for dotfiles migration are implemented and working. + +--- + +## Migration Recommendation + +**Verdict**: ✅ **MIGRATE NOW** + +### Why Migrate: +1.
**Feature Complete**: All v1.0 features implemented +2. **Battle Tested**: Released and stable +3. **Better UX**: Unified CLI vs custom bash scripts +4. **Dogfooding**: Use your own tool in production +5. **Active Development**: Recent commits, clear roadmap + +### Why Wait: +- ❌ None - no significant blockers + +--- + +## Migration Plan Priority + +### Immediate (Today) - 1 hour +1. Clean up uncommitted changes in nixos-fleet repo +2. Add nixos-fleet as flake input to dotfiles +3. Test that builds still work + +### This Week - 2-3 hours +1. Implement Option A (full mkFleet migration) or Option B (gradual) +2. Test `fleet push` on Axon (lowest risk) +3. Update justfile to use `fleet` commands + +### This Month +1. Migrate all hosts to fleet CLI +2. Archive old fleet.sh script +3. Add P0/P1 features to nixos-fleet (auto-detect, logs) + +--- + +## Comparison: Current vs Post-Migration + +| Task | Current Method | With nixos-fleet | +|------|----------------|------------------| +| **Deploy Cortex** | `nix run github:serokell/deploy-rs -- .#cortex` | `fleet push cortex` | +| **Deploy All Servers** | Manual: deploy Cortex, then Nexus | `fleet push --tag server` | +| **Health Check** | `./scripts/deployment/fleet.sh check cortex` | `fleet check cortex` | +| **Fresh Install** | `./scripts/bootstrap-automated.sh cortex 192.168.1.7` | `fleet install cortex` | +| **Run Command** | `ssh jarvis@cortex 'systemctl status'` | `fleet exec cortex -- systemctl status` | +| **Secrets** | Manual sops edit | `fleet secrets edit` | +| **Fleet Status** | Manual checks | `fleet status` | + +**Savings**: 50-70% fewer keystrokes, unified commands, better UX. + +--- + +## Risks & Mitigation + +### Risk 1: Breaking Changes +**Likelihood**: Low +**Impact**: Medium +**Mitigation**: Test on Axon first, keep deploy-rs input as fallback + +### Risk 2: Learning Curve +**Likelihood**: Low (you built it!)
+**Impact**: Low +**Mitigation**: Document new commands in dotfiles docs + +### Risk 3: Bugs in nixos-fleet +**Likelihood**: Medium (new tool) +**Impact**: Medium +**Mitigation**: Fix bugs in nixos-fleet repo, easy to iterate + +--- + +## Action Items + +### For nixos-fleet Repo +- [ ] Commit uncommitted changes +- [ ] Tag current state as v1.0.1 (if changes are meaningful) +- [ ] Push to GitHub (if public) +- [ ] Consider adding to nixpkgs (future) + +### For Dotfiles Repo +- [ ] Add nixos-fleet as flake input +- [ ] Choose migration strategy (Option A vs B) +- [ ] Test on Axon +- [ ] Update all documentation +- [ ] Archive old scripts + +### Documentation Strategy +- [ ] **DELETE**: `docs/FLEET-SECRETS-INTEGRATION.md` (outdated fleet.sh docs) +- [ ] **UPDATE**: `docs/FLEET-FUTURE.md` → Mention using nixos-fleet instead of direct Colmena +- [ ] **CREATE**: `docs/FLEET-NIXOS-FLEET.md` - How to use fleet CLI +- [ ] **KEEP**: `docs/GLOBAL-SETTINGS.md` (fleet-config.nix still relevant) + +--- + +## Recommendation Summary + +**For Documentation Cleanup:** +1. ❌ **DELETE** `docs/FLEET-SECRETS-INTEGRATION.md` - Documents non-existent fleet.sh commands +2. ✏️ **UPDATE** `docs/FLEET-FUTURE.md` - Add note: "Update: We built nixos-fleet to solve this" +3. ✅ **KEEP** `docs/GLOBAL-SETTINGS.md` - Still relevant + +**For Migration:** +1. ✅ **START MIGRATION** - nixos-fleet is production ready +2. 📅 **Timeline**: 1 hour today, 2-3 hours this week +3. 🎯 **Priority**: Medium-High (improves workflow significantly) + +--- + +## Questions to Decide + +1. **Local vs GitHub for flake input?** + - Local dev: `nixos-fleet.url = "path:/home/syg/Projects/open-source/nixos-fleet"` + - GitHub: `nixos-fleet.url = "github:sygint/nixos-fleet"` + - **Recommendation**: Start with local path for easy iteration + +2.
**Full migration (Option A) or gradual (Option B)?** + - Option A: Replace entire flake structure with mkFleet + - Option B: Keep flake-parts, just use Colmena instead of deploy-rs + - **Recommendation**: Option B for safety (less disruptive) + +3. **Add tags to fleet-config.nix now or later?** + - Example: `cortex.tags = ["server" "ai" "remote"]` + - **Recommendation**: Add now, enables `fleet push --tag server` + +--- + +## Next Step + +**My Recommendation**: + +Start migration **today** with these steps: + +1. Clean up nixos-fleet repo (commit changes) +2. Add as local flake input to dotfiles +3. Test `nix build` still works +4. Try `nix run .#fleet -- status` to verify integration + +This is **low risk** (just adding input) and sets up for full migration this week. + +**Want to proceed?** diff --git a/flake-modules/deploy.nix b/flake-modules/deploy.nix index c5a455d..5eee3d2 100644 --- a/flake-modules/deploy.nix +++ b/flake-modules/deploy.nix @@ -1,26 +1,30 @@ { self, inputs, ... }: +let + fleetConfig = import ../fleet-config.nix; + hosts = fleetConfig.hosts; +in { flake.deploy = { sshUser = "jarvis"; # Global SSH user for all nodes nodes = { cortex = { - hostname = "192.168.1.7"; # TODO: Switch to cortex.home when DNS is fixed + hostname = hosts.cortex.ip; profiles.system = { path = inputs.deploy-rs.lib.x86_64-linux.activate.nixos self.nixosConfigurations.cortex; user = "root"; # Activate as root (via sudo) }; }; nexus = { - hostname = "192.168.1.22"; # Nexus homelab services server - sshUser = "admin"; # Override global SSH user for Nexus + hostname = hosts.nexus.ip; + sshUser = hosts.nexus.ssh.user; # Override global SSH user for Nexus profiles.system = { path = inputs.deploy-rs.lib.x86_64-linux.activate.nixos self.nixosConfigurations.nexus; user = "root"; # Activate as root (via sudo) }; }; axon = { - hostname = "192.168.1.11"; # TODO: Update with actual Axon IP + hostname = hosts.axon.ip; profiles.system = { path = 
inputs.deploy-rs.lib.x86_64-linux.activate.nixos self.nixosConfigurations.axon; user = "root"; # Activate as root (via sudo) diff --git a/flake.lock b/flake.lock index 7889167..ced2bcc 100644 --- a/flake.lock +++ b/flake.lock @@ -101,6 +101,31 @@ "type": "github" } }, + "colmena": { + "inputs": { + "flake-compat": "flake-compat_4", + "flake-utils": "flake-utils_2", + "nix-github-actions": "nix-github-actions", + "nixpkgs": [ + "nixos-fleet", + "nixpkgs" + ], + "stable": "stable" + }, + "locked": { + "lastModified": 1762034856, + "narHash": "sha256-QVey3iP3UEoiFVXgypyjTvCrsIlA4ecx6Acaz5C8/PQ=", + "owner": "zhaofengli", + "repo": "colmena", + "rev": "349b035a5027f23d88eeb3bc41085d7ee29f18ed", + "type": "github" + }, + "original": { + "owner": "zhaofengli", + "repo": "colmena", + "type": "github" + } + }, "crane": { "locked": { "lastModified": 1758758545, @@ -169,6 +194,27 @@ "type": "github" } }, + "disko_2": { + "inputs": { + "nixpkgs": [ + "nixos-fleet", + "nixpkgs" + ] + }, + "locked": { + "lastModified": 1764627417, + "narHash": "sha256-D6xc3Rl8Ab6wucJWdvjNsGYGSxNjQHzRc2EZ6eeQ6l4=", + "owner": "nix-community", + "repo": "disko", + "rev": "5a88a6eceb8fd732b983e72b732f6f4b8269bef3", + "type": "github" + }, + "original": { + "owner": "nix-community", + "repo": "disko", + "type": "github" + } + }, "fenix": { "inputs": { "nixpkgs": [ @@ -271,6 +317,22 @@ "url": "https://flakehub.com/f/edolstra/flake-compat/1.tar.gz" } }, + "flake-compat_4": { + "flake": false, + "locked": { + "lastModified": 1650374568, + "narHash": "sha256-Z+s0J8/r907g149rllvwhb4pKi8Wam5ij0st8PwAh+E=", + "owner": "edolstra", + "repo": "flake-compat", + "rev": "b4a34015c698c7793d592d66adbab377907a2be8", + "type": "github" + }, + "original": { + "owner": "edolstra", + "repo": "flake-compat", + "type": "github" + } + }, "flake-parts": { "inputs": { "nixpkgs-lib": "nixpkgs-lib" @@ -346,6 +408,21 @@ "type": "github" } }, + "flake-utils_2": { + "locked": { + "lastModified": 1659877975, + "narHash": 
"sha256-zllb8aq3YO3h8B/U0/J1WBgAL8EX5yWf5pMj3G0NAmc=", + "owner": "numtide", + "repo": "flake-utils", + "rev": "c0e246b9b83f637f4681389ecabcb2681b4f3af0", + "type": "github" + }, + "original": { + "owner": "numtide", + "repo": "flake-utils", + "type": "github" + } + }, "fromYaml": { "flake": false, "locked": { @@ -796,6 +873,28 @@ "type": "github" } }, + "nix-github-actions": { + "inputs": { + "nixpkgs": [ + "nixos-fleet", + "colmena", + "nixpkgs" + ] + }, + "locked": { + "lastModified": 1729742964, + "narHash": "sha256-B4mzTcQ0FZHdpeWcpDYPERtyjJd/NIuaQ9+BV1h+MpA=", + "owner": "nix-community", + "repo": "nix-github-actions", + "rev": "e04df33f62cdcf93d73e9a04142464753a16db67", + "type": "github" + }, + "original": { + "owner": "nix-community", + "repo": "nix-github-actions", + "type": "github" + } + }, "nix-snapd": { "inputs": { "flake-compat": "flake-compat_3", @@ -817,6 +916,62 @@ "url": "https://flakehub.com/f/io12/nix-snapd/0.1.47.tar.gz" } }, + "nixlib": { + "locked": { + "lastModified": 1736643958, + "narHash": "sha256-tmpqTSWVRJVhpvfSN9KXBvKEXplrwKnSZNAoNPf/S/s=", + "owner": "nix-community", + "repo": "nixpkgs.lib", + "rev": "1418bc28a52126761c02dd3d89b2d8ca0f521181", + "type": "github" + }, + "original": { + "owner": "nix-community", + "repo": "nixpkgs.lib", + "type": "github" + } + }, + "nixos-fleet": { + "inputs": { + "colmena": "colmena", + "disko": "disko_2", + "nixos-generators": "nixos-generators", + "nixpkgs": "nixpkgs_6", + "sops-nix": "sops-nix" + }, + "locked": { + "lastModified": 1769103646, + "narHash": "sha256-2mwYINb9gRjZCFQNGUaH0C25KoqP2AgEUuF/YG/nJPo=", + "path": "/home/syg/Projects/open-source/nixos-fleet", + "type": "path" + }, + "original": { + "path": "/home/syg/Projects/open-source/nixos-fleet", + "type": "path" + } + }, + "nixos-generators": { + "inputs": { + "nixlib": "nixlib", + "nixpkgs": [ + "nixos-fleet", + "nixpkgs" + ] + }, + "locked": { + "lastModified": 1764234087, + "narHash": 
"sha256-NHF7QWa0ZPT8hsJrvijREW3+nifmF2rTXgS2v0tpcEA=", + "owner": "nix-community", + "repo": "nixos-generators", + "rev": "032a1878682fafe829edfcf5fdfad635a2efe748", + "type": "github" + }, + "original": { + "owner": "nix-community", + "repo": "nixos-generators", + "type": "github" + } + }, "nixos-hardware": { "locked": { "lastModified": 1768736227, @@ -889,6 +1044,22 @@ "url": "https://github.com/NixOS/nixpkgs/archive/50eb7ecf4cd0a5756d7275c8ba36790e5bd53e33.tar.gz" } }, + "nixpkgs_10": { + "locked": { + "lastModified": 1768127708, + "narHash": "sha256-1Sm77VfZh3mU0F5OqKABNLWxOuDeHIlcFjsXeeiPazs=", + "owner": "nixos", + "repo": "nixpkgs", + "rev": "ffbc9f8cbaacfb331b6017d5a5abb21a492c9a38", + "type": "github" + }, + "original": { + "owner": "nixos", + "ref": "nixos-unstable", + "repo": "nixpkgs", + "type": "github" + } + }, "nixpkgs_2": { "locked": { "lastModified": 1763835633, @@ -953,11 +1124,11 @@ }, "nixpkgs_6": { "locked": { - "lastModified": 1768886240, - "narHash": "sha256-C2TjvwYZ2VDxYWeqvvJ5XPPp6U7H66zeJlRaErJKoEM=", + "lastModified": 1764517877, + "narHash": "sha256-pp3uT4hHijIC8JUK5MEqeAWmParJrgBVzHLNfJDZxg4=", "owner": "nixos", "repo": "nixpkgs", - "rev": "80e4adbcf8992d3fd27ad4964fbb84907f9478b0", + "rev": "2d293cbfa5a793b4c50d17c05ef9e385b90edf6c", "type": "github" }, "original": { @@ -969,47 +1140,47 @@ }, "nixpkgs_7": { "locked": { - "lastModified": 1768393167, - "narHash": "sha256-n2063BRjHde6DqAz2zavhOOiLUwA3qXt7jQYHyETjX8=", - "owner": "NixOS", + "lastModified": 1768886240, + "narHash": "sha256-C2TjvwYZ2VDxYWeqvvJ5XPPp6U7H66zeJlRaErJKoEM=", + "owner": "nixos", "repo": "nixpkgs", - "rev": "2f594d5af95d4fdac67fba60376ec11e482041cb", + "rev": "80e4adbcf8992d3fd27ad4964fbb84907f9478b0", "type": "github" }, "original": { - "owner": "NixOS", - "ref": "nixpkgs-unstable", + "owner": "nixos", + "ref": "nixos-unstable", "repo": "nixpkgs", "type": "github" } }, "nixpkgs_8": { "locked": { - "lastModified": 1767767207, - "narHash": 
"sha256-Mj3d3PfwltLmukFal5i3fFt27L6NiKXdBezC1EBuZs4=", + "lastModified": 1768393167, + "narHash": "sha256-n2063BRjHde6DqAz2zavhOOiLUwA3qXt7jQYHyETjX8=", "owner": "NixOS", "repo": "nixpkgs", - "rev": "5912c1772a44e31bf1c63c0390b90501e5026886", + "rev": "2f594d5af95d4fdac67fba60376ec11e482041cb", "type": "github" }, "original": { "owner": "NixOS", - "ref": "nixos-unstable", + "ref": "nixpkgs-unstable", "repo": "nixpkgs", "type": "github" } }, "nixpkgs_9": { "locked": { - "lastModified": 1768127708, - "narHash": "sha256-1Sm77VfZh3mU0F5OqKABNLWxOuDeHIlcFjsXeeiPazs=", - "owner": "nixos", + "lastModified": 1767767207, + "narHash": "sha256-Mj3d3PfwltLmukFal5i3fFt27L6NiKXdBezC1EBuZs4=", + "owner": "NixOS", "repo": "nixpkgs", - "rev": "ffbc9f8cbaacfb331b6017d5a5abb21a492c9a38", + "rev": "5912c1772a44e31bf1c63c0390b90501e5026886", "type": "github" }, "original": { - "owner": "nixos", + "owner": "NixOS", "ref": "nixos-unstable", "repo": "nixpkgs", "type": "github" @@ -1042,7 +1213,7 @@ }, "opencode": { "inputs": { - "nixpkgs": "nixpkgs_7" + "nixpkgs": "nixpkgs_8" }, "locked": { "lastModified": 1769075413, @@ -1093,11 +1264,12 @@ "import-tree": "import-tree", "nix-flatpak": "nix-flatpak", "nix-snapd": "nix-snapd", + "nixos-fleet": "nixos-fleet", "nixos-hardware": "nixos-hardware", "nixos-secrets": "nixos-secrets", - "nixpkgs": "nixpkgs_6", + "nixpkgs": "nixpkgs_7", "opencode": "opencode", - "sops-nix": "sops-nix", + "sops-nix": "sops-nix_2", "stylix": "stylix", "zen-browser": "zen-browser" } @@ -1120,6 +1292,27 @@ } }, "sops-nix": { + "inputs": { + "nixpkgs": [ + "nixos-fleet", + "nixpkgs" + ] + }, + "locked": { + "lastModified": 1764483358, + "narHash": "sha256-EyyvCzXoHrbL467YSsQBTWWg4sR96MH1sPpKoSOelB4=", + "owner": "Mic92", + "repo": "sops-nix", + "rev": "5aca6ff67264321d47856a2ed183729271107c9c", + "type": "github" + }, + "original": { + "owner": "Mic92", + "repo": "sops-nix", + "type": "github" + } + }, + "sops-nix_2": { "inputs": { "nixpkgs": [ "nixpkgs" @@ -1139,6 
+1332,22 @@ "type": "github" } }, + "stable": { + "locked": { + "lastModified": 1750133334, + "narHash": "sha256-urV51uWH7fVnhIvsZIELIYalMYsyr2FCalvlRTzqWRw=", + "owner": "NixOS", + "repo": "nixpkgs", + "rev": "36ab78dab7da2e4e27911007033713bab534187b", + "type": "github" + }, + "original": { + "owner": "NixOS", + "ref": "nixos-25.05", + "repo": "nixpkgs", + "type": "github" + } + }, "stylix": { "inputs": { "base16": "base16", @@ -1148,7 +1357,7 @@ "firefox-gnome-theme": "firefox-gnome-theme", "flake-parts": "flake-parts_3", "gnome-shell": "gnome-shell", - "nixpkgs": "nixpkgs_8", + "nixpkgs": "nixpkgs_9", "nur": "nur", "systems": "systems_4", "tinted-foot": "tinted-foot", @@ -1374,7 +1583,7 @@ "zen-browser": { "inputs": { "home-manager": "home-manager_2", - "nixpkgs": "nixpkgs_9" + "nixpkgs": "nixpkgs_10" }, "locked": { "lastModified": 1769059766, diff --git a/flake.nix b/flake.nix index b06df7e..4e17b87 100644 --- a/flake.nix +++ b/flake.nix @@ -25,6 +25,9 @@ opencode.url = "github:anomalyco/opencode"; devenv-bootstrap.url = "path:/home/syg/.config/nixos/archive/devenv-bootstrap"; import-tree.url = "github:vic/import-tree"; + + # Fleet management with Colmena + nixos-fleet.url = "path:/home/syg/Projects/open-source/nixos-fleet"; }; nixConfig = { @@ -73,6 +76,7 @@ nixd nixpkgs-fmt just + inputs'.nixos-fleet.packages.fleet ]; }; }; diff --git a/flake.nix.bak b/flake.nix.bak deleted file mode 100644 index 9b9627d..0000000 --- a/flake.nix.bak +++ /dev/null @@ -1,246 +0,0 @@ -{ - description = "Nixos config flake"; - - inputs = { - nix-snapd.url = "https://flakehub.com/f/io12/nix-snapd/0.1.47.tar.gz"; - nix-snapd.inputs.nixpkgs.follows = "nixpkgs"; - nixpkgs.url = "github:nixos/nixpkgs?shallow=1&ref=nixos-unstable"; - nixos-hardware.url = "github:NixOS/nixos-hardware/master"; - hyprland.url = "git+https://github.com/hyprwm/Hyprland?submodules=1"; - stylix.url = "github:danth/stylix"; - zen-browser.url = "github:0xc000022070/zen-browser-flake"; - nix-flatpak.url = 
"github:gmodena/nix-flatpak/?ref=v0.6.0"; - home-manager = { - url = "github:nix-community/home-manager"; - inputs.nixpkgs.follows = "nixpkgs"; - }; - fh.url = "https://flakehub.com/f/DeterminateSystems/fh/*.tar.gz"; - disko.url = "github:nix-community/disko"; - deploy-rs.url = "github:serokell/deploy-rs"; - colmena.url = "github:zhaofengli/colmena"; - colmena.inputs.nixpkgs.follows = "nixpkgs"; - sops-nix.url = "github:Mic92/sops-nix"; - sops-nix.inputs.nixpkgs.follows = "nixpkgs"; - nixos-secrets.url = "path:/home/syg/.config/nixos-secrets"; - git-hooks.url = "github:cachix/git-hooks.nix"; - nixos-secrets.flake = false; - devenv-bootstrap = { - url = "path:/home/syg/Projects/open-source/devenv-bootstrap"; - }; - }; - - nixConfig = { - # NOTE: To use Cachix for binary caching, set up a personal cache at https://cachix.org and add your cache URL and public key here. - # Example: - # extra-substituters = [ "https://your-cachix.cachix.org" ]; - # extra-trusted-public-keys = [ "your-cachix.cachix.org-1:..." ]; - # See https://docs.cachix.org for setup instructions. - # We'll revisit this later. - # builders = [ ]; # No remote builders configured - }; - - outputs = { self, nixpkgs, nixos-hardware, home-manager, fh, nix-snapd, nix-flatpak, colmena, git-hooks, ... 
} @ inputs: - let - inherit (nixpkgs) lib; - system = "x86_64-linux"; - - # List all systems here for easy extensibility - systems = { - orion = { - path = ./systems/orion; - modules = [ - inputs.stylix.nixosModules.stylix - nix-snapd.nixosModules.default - nixos-hardware.nixosModules.framework-13-7040-amd - home-manager.nixosModules.home-manager - inputs.sops-nix.nixosModules.sops - ]; - }; - cortex = { - path = ./systems/cortex; - modules = [ - inputs.disko.nixosModules.disko - home-manager.nixosModules.home-manager - inputs.sops-nix.nixosModules.sops - ]; - }; - nexus = { - path = ./systems/nexus; - modules = [ - inputs.disko.nixosModules.disko - home-manager.nixosModules.home-manager - inputs.sops-nix.nixosModules.sops - ]; - }; - axon = { - path = ./systems/axon; - modules = [ - inputs.stylix.nixosModules.stylix - home-manager.nixosModules.home-manager - ]; - }; - # Add new systems here! - }; - - # Import variables for home-manager (unchanged) - variables = import ./systems/orion/variables.nix; - inherit (variables.user) username; - userVars = variables.user; - systemVars = variables.system; - - inherit (nixpkgs.legacyPackages.${system}) pkgs; - - mkHomeConfiguration = variables: - home-manager.lib.homeManagerConfiguration { - inherit pkgs; - extraSpecialArgs = { - inherit self inputs userVars; - }; - modules = [ - nix-flatpak.homeManagerModules.nix-flatpak - ./systems/orion/homes/syg.nix - { - nixpkgs.config.allowUnfree = true; - } - ]; - }; - in - { - nixosConfigurations = lib.mapAttrs ( - name: cfg: - nixpkgs.lib.nixosSystem { - system = system; - modules = [ cfg.path ] ++ cfg.modules; - specialArgs = { - inherit self system inputs fh userVars; - # Secrets are now mandatory for all systems - hasSecrets = true; - # Default isTest to false for normal builds (true only in VM tests) - isTest = false; - }; - } - ) systems; - - homeConfigurations = { - ${username} = mkHomeConfiguration userVars; - "${username}@${systemVars.hostName}" = mkHomeConfiguration 
userVars; - }; - - # Provide Colmena from our flake input for deployment - packages.${system}.colmena = colmena.packages.${system}.colmena; - - # Expose local devenv-bootstrap package from the flake inputs - packages.${system}.devenv-bootstrap = inputs.devenv-bootstrap.packages.${system}.devenv-bootstrap; - - # Provide a flake-scoped package for the monitor-setup script so - # it can be included reliably in system packages via `self`. This - # avoids `builtins.readFile` fragility by packaging the script in - # the flake outputs during evaluation. - monitorSetup = pkgs.writeShellScriptBin "monitor-setup" (builtins.readFile ./scripts/desktop/monitor-setup.sh); - - # Add deploy-rs output for fleet management - deploy = { - sshUser = "jarvis"; # Global SSH user for all nodes - - nodes = { - cortex = { - hostname = "192.168.1.7"; # TODO: Switch to cortex.home when DNS is fixed - profiles.system = { - path = inputs.deploy-rs.lib.x86_64-linux.activate.nixos self.nixosConfigurations.cortex; - user = "root"; # Activate as root (via sudo) - }; - }; - nexus = { - hostname = "192.168.1.22"; # Nexus homelab services server - sshUser = "deploy"; # Deploy user with passwordless sudo - profiles.system = { - path = inputs.deploy-rs.lib.x86_64-linux.activate.nixos self.nixosConfigurations.nexus; - user = "root"; # Activate as root (via sudo) - }; - }; - axon = { - hostname = "192.168.1.11"; # TODO: Update with actual Axon IP - profiles.system = { - path = inputs.deploy-rs.lib.x86_64-linux.activate.nixos self.nixosConfigurations.axon; - user = "root"; # Activate as root (via sudo) - }; - }; - # Add other systems here as needed - }; - }; - - # Add deploy-rs checks - checks = builtins.mapAttrs (system: deployLib: deployLib.deployChecks self.deploy) inputs.deploy-rs.lib // { - secret-scan = git-hooks.lib.${system}.run { - src = ./.; - hooks = { - secret-scan = { - enable = true; - name = "secret-scan"; - description = "Narrow secret scanning (regex + detect-secrets + trufflehog)."; - 
entry = "${pkgs.bash}/bin/bash scripts/security/precommit-scan.sh"; - language = "system"; - pass_filenames = false; - stages = [ "pre-commit" ]; - files = ".*"; - }; - }; - }; - }; - - # Colmena deployment configuration (preferred over deploy-rs) - colmenaHive = colmena.lib.makeHive { - meta = { - nixpkgs = import nixpkgs { - inherit system; - config.allowUnfree = true; - }; - specialArgs = { - inherit self inputs fh userVars; - hasSecrets = true; - isTest = false; - }; - }; - - # Laptop - Framework 13 AMD - orion = { name, nodes, ... }: { - deployment = { - targetHost = "localhost"; # Local machine - targetUser = "syg"; - tags = [ "laptop" "local" "desktop" ]; - }; - imports = systems.orion.modules ++ [ systems.orion.path ]; - }; - - # Server - Homelab services - cortex = { name, nodes, ... }: { - deployment = { - targetHost = "192.168.1.7"; # TODO: Switch to cortex.home when DNS is fixed - targetUser = "jarvis"; - tags = [ "server" "homelab" ]; - }; - imports = systems.cortex.modules ++ [ systems.cortex.path ]; - }; - - # Server - NAS/Media - nexus = { name, nodes, ... }: { - deployment = { - targetHost = "192.168.1.22"; - targetUser = "deploy"; - tags = [ "server" "homelab" "nas" ]; - }; - imports = systems.nexus.modules ++ [ systems.nexus.path ]; - }; - - # HTPC - Living room entertainment - axon = { name, nodes, ... 
}: { - deployment = { - targetHost = "192.168.1.11"; # TODO: Update with actual Axon IP - targetUser = "jarvis"; - tags = [ "htpc" "desktop" ]; - }; - imports = systems.axon.modules ++ [ systems.axon.path ]; - }; - }; - }; -} diff --git a/fleet-config.nix b/fleet-config.nix index 9213f39..7c5bf6a 100644 --- a/fleet-config.nix +++ b/fleet-config.nix @@ -17,7 +17,7 @@ { # Global settings for all systems global = { - timeZone = "America/Los_Angeles"; # Pacific Time - change this to your timezone + timeZone = "America/Los_Angeles"; # Pacific Time - change this to your timezone locale = "en_US.UTF-8"; }; @@ -26,7 +26,25 @@ domain = "home"; subnet = "192.168.1.0/24"; gateway = "192.168.1.1"; - dns = [ "192.168.1.1" "1.1.1.1" ]; + dns = [ + "192.168.1.1" + "1.1.1.1" + ]; + }; + + # Shared infrastructure + infrastructure = { + # NAS/Storage configuration + nas = { + ip = "192.168.1.136"; + hostname = "synology"; + fqdn = "synology.home"; + shares = { + movies = "/volume1/Media/Movies"; + tvshows = "/volume1/Media/TV Shows"; + music = "/volume1/Media/Music"; + }; + }; }; # Host configurations @@ -35,20 +53,20 @@ orion = { hostname = "orion"; fqdn = "orion.home"; - ip = "192.168.1.30"; # DHCP reservation on UDM Pro - + ip = "192.168.1.30"; # DHCP reservation on UDM Pro + # Network interfaces interfaces = { wifi = { - name = "wlp1s0"; # May vary based on hardware - mac = null; # WiFi MAC not typically needed + name = "wlp1s0"; # May vary based on hardware + mac = null; # WiFi MAC not typically needed }; ethernet = { - name = "enp0s0"; # May vary - check with `ip link` - mac = null; # Update if you want WoL for orion + name = "enp0s0"; # May vary - check with `ip link` + mac = null; # Update if you want WoL for orion }; }; - + # SSH configuration ssh = { user = "syg"; @@ -56,17 +74,17 @@ keyPath = "~/.ssh/id_ed25519"; publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos"; }; - + # Wake-on-LAN configuration wol = { - enabled = 
false; # Typically not needed for primary workstation + enabled = false; # Typically not needed for primary workstation interface = null; mac = null; }; - + # Deployment configuration deploy = { - enabled = false; # Orion is the control machine, not a deploy target + enabled = false; # Orion is the control machine, not a deploy target remoteBuild = false; }; }; @@ -75,16 +93,16 @@ cortex = { hostname = "cortex"; fqdn = "cortex.home"; - ip = "192.168.1.10"; # Static IP for main workstation - + ip = "192.168.1.7"; # Static IP - AI/ML server + # Network interfaces interfaces = { ethernet = { name = "enp3s0"; - mac = "9c:6b:00:35:51:55"; # Required for Wake-on-LAN + mac = "9c:6b:00:35:51:55"; # Required for Wake-on-LAN }; }; - + # SSH configuration ssh = { user = "jarvis"; @@ -95,18 +113,18 @@ "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos" ]; }; - + # Wake-on-LAN configuration wol = { enabled = true; interface = "enp3s0"; mac = "9c:6b:00:35:51:55"; }; - + # Deployment configuration deploy = { enabled = true; - remoteBuild = true; # Build on remote to avoid signature issues + remoteBuild = true; # Build on remote to avoid signature issues method = "deploy-rs"; }; }; @@ -115,16 +133,16 @@ nexus = { hostname = "nexus"; fqdn = "nexus.home"; - ip = "192.168.1.22"; # Permanent static IP (DHCP reservation configured) - + ip = "192.168.1.22"; # Permanent static IP (DHCP reservation configured) + # Network interfaces interfaces = { ethernet = { - name = "eno1"; # Primary ethernet interface - mac = null; # Update with actual MAC for Wake-on-LAN if needed + name = "eno1"; # Primary ethernet interface + mac = null; # Update with actual MAC for Wake-on-LAN if needed }; }; - + # SSH configuration ssh = { user = "deploy"; @@ -134,18 +152,57 @@ "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos" ]; }; - + # Wake-on-LAN configuration wol = { - enabled = false; # Enable if you want remote wake 
capability + enabled = false; # Enable if you want remote wake capability interface = null; mac = null; }; - + # Deployment configuration deploy = { enabled = true; - remoteBuild = false; # Build locally, deploy remotely + remoteBuild = false; # Build locally, deploy remotely + method = "deploy-rs"; + }; + }; + + # Axon - Media Center / HTPC + axon = { + hostname = "axon"; + fqdn = "axon.home"; + ip = "192.168.1.25"; # Media center system + + # Network interfaces + interfaces = { + ethernet = { + name = "enp0s0"; + mac = null; # Update if you want WoL + }; + }; + + # SSH configuration + ssh = { + user = "axon"; + port = 22; + keyPath = "~/.ssh/id_ed25519"; + authorizedKeys = [ + "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos" + ]; + }; + + # Wake-on-LAN configuration + wol = { + enabled = false; + interface = null; + mac = null; + }; + + # Deployment configuration + deploy = { + enabled = false; # Local system, not a remote deploy target + remoteBuild = false; method = "deploy-rs"; }; }; @@ -157,14 +214,14 @@ # hostname = "hostname"; # fqdn = "hostname.home"; # ip = "192.168.1.x"; - # + # # interfaces = { # ethernet = { # name = "enp0s0"; # mac = "xx:xx:xx:xx:xx:xx"; # }; # }; - # + # # ssh = { # user = "username"; # port = 22; @@ -172,13 +229,13 @@ # publicKey = "ssh-ed25519 ..."; # authorizedKeys = [ ]; # }; - # + # # wol = { # enabled = false; # interface = null; # mac = null; # }; - # + # # deploy = { # enabled = false; # remoteBuild = false; diff --git a/lib/network.nix b/lib/network.nix index d2dde2a..a4fc619 100644 --- a/lib/network.nix +++ b/lib/network.nix @@ -1,74 +1,101 @@ # Network configuration helper library -# Provides convenient functions to access network-config.nix throughout your flake -{ lib ? (import {}).lib }: +# Provides convenient functions to access fleet-config.nix throughout your flake +{ + lib ? 
(import { }).lib, +}: let inherit (lib) mapAttrs attrNames; - + # Import the network configuration - networkConfig = import ../network-config.nix; - + networkConfig = import ../fleet-config.nix; + # Helper function to get host config by name # Usage: getHost "cortex" - getHost = hostName: - networkConfig.hosts.${hostName} or (throw "Host '${hostName}' not found in network-config.nix"); - + getHost = + hostName: + networkConfig.hosts.${hostName} or (throw "Host '${hostName}' not found in fleet-config.nix"); + # Helper function to get SSH connection string # Usage: getSshTarget "cortex" => "jarvis@192.168.1.7" - getSshTarget = hostName: - let host = getHost hostName; - in "${host.ssh.user}@${host.ip}"; - + getSshTarget = + hostName: + let + host = getHost hostName; + in + "${host.ssh.user}@${host.ip}"; + # Helper function to get full SSH command # Usage: getSshCommand "cortex" => "ssh jarvis@192.168.1.7" - getSshCommand = hostName: - let host = getHost hostName; - in "ssh ${host.ssh.user}@${host.ip}"; - + getSshCommand = + hostName: + let + host = getHost hostName; + in + "ssh ${host.ssh.user}@${host.ip}"; + # Helper function to check if host has WoL enabled # Usage: hasWol "cortex" => true - hasWol = hostName: - let host = getHost hostName; - in host.wol.enabled or false; - + hasWol = + hostName: + let + host = getHost hostName; + in + host.wol.enabled or false; + # Helper function to get WoL MAC address # Usage: getWolMac "cortex" => "9c:6b:00:35:51:55" - getWolMac = hostName: - let host = getHost hostName; - in if hasWol hostName - then host.wol.mac - else throw "Host '${hostName}' does not have Wake-on-LAN enabled"; - + getWolMac = + hostName: + let + host = getHost hostName; + in + if hasWol hostName then + host.wol.mac + else + throw "Host '${hostName}' does not have Wake-on-LAN enabled"; + # Helper function to check if host is deployable # Usage: isDeployable "cortex" => true - isDeployable = hostName: - let host = getHost hostName; - in host.deploy.enabled or 
false; - + isDeployable = + hostName: + let + host = getHost hostName; + in + host.deploy.enabled or false; + # Helper function to get deploy configuration # Usage: getDeployConfig "cortex" - getDeployConfig = hostName: - let host = getHost hostName; - in if isDeployable hostName - then host.deploy - else throw "Host '${hostName}' is not configured for deployment"; - + getDeployConfig = + hostName: + let + host = getHost hostName; + in + if isDeployable hostName then + host.deploy + else + throw "Host '${hostName}' is not configured for deployment"; + # Get list of all deployable hosts # Usage: getDeployableHosts => [ "cortex" ] - getDeployableHosts = - builtins.filter (name: (getHost name).deploy.enabled or false) (attrNames networkConfig.hosts); - + getDeployableHosts = builtins.filter (name: (getHost name).deploy.enabled or false) ( + attrNames networkConfig.hosts + ); + # Get list of all hosts with WoL enabled # Usage: getWolHosts => [ "cortex" ] - getWolHosts = - builtins.filter (name: (getHost name).wol.enabled or false) (attrNames networkConfig.hosts); - + getWolHosts = builtins.filter (name: (getHost name).wol.enabled or false) ( + attrNames networkConfig.hosts + ); + # Generate deploy-rs node configuration for a host # Usage: mkDeployNode "cortex" - mkDeployNode = hostName: activatePath: - let + mkDeployNode = + hostName: activatePath: + let host = getHost hostName; deployConfig = getDeployConfig hostName; - in { + in + { hostname = host.ip; profiles.system = { path = activatePath; @@ -81,14 +108,14 @@ in { # Export the raw config inherit networkConfig; - + # Export helper functions inherit getHost; inherit getSshTarget getSshCommand; inherit hasWol getWolMac getWolHosts; inherit isDeployable getDeployConfig getDeployableHosts; inherit mkDeployNode; - + # Convenience exports network = networkConfig.network; hosts = networkConfig.hosts; diff --git a/modules/system/base/default.nix b/modules/system/base/default.nix index 8ace79e..7f339b3 100644 --- 
a/modules/system/base/default.nix +++ b/modules/system/base/default.nix @@ -4,6 +4,9 @@ lib, ... }: +let + fleetConfig = import ../../../fleet-config.nix; +in { imports = [ ./networking.nix # Fleet-wide hosts configuration @@ -11,6 +14,22 @@ # Base NixOS configuration - essential settings for any system + # Localization - use fleet-config for consistency + i18n = { + defaultLocale = fleetConfig.global.locale; + extraLocaleSettings = { + LC_ADDRESS = fleetConfig.global.locale; + LC_IDENTIFICATION = fleetConfig.global.locale; + LC_MEASUREMENT = fleetConfig.global.locale; + LC_MONETARY = fleetConfig.global.locale; + LC_NAME = fleetConfig.global.locale; + LC_NUMERIC = fleetConfig.global.locale; + LC_PAPER = fleetConfig.global.locale; + LC_TELEPHONE = fleetConfig.global.locale; + LC_TIME = fleetConfig.global.locale; + }; + }; + # Essential boot configuration boot = { loader = { diff --git a/modules/system/kanboard.nix b/modules/system/kanboard.nix deleted file mode 100644 index 4629db7..0000000 --- a/modules/system/kanboard.nix +++ /dev/null @@ -1,12 +0,0 @@ -{ config, pkgs, ... }: - -{ - # Kanboard - Simple Kanban board, one service, that's it - services.kanboard = { - enable = true; - domain = "localhost"; - }; - - # That's literally it. 
Just visit http://localhost:9000 - # Default login: admin / admin -} \ No newline at end of file diff --git a/modules/system/locale.nix b/modules/system/locale.nix deleted file mode 100644 index a1bcf51..0000000 --- a/modules/system/locale.nix +++ /dev/null @@ -1,20 +0,0 @@ -_: -let - locale = "en_US.UTF-8"; -in -{ - i18n = { - defaultLocale = locale; - extraLocaleSettings = { - LC_ADDRESS = locale; - LC_IDENTIFICATION = locale; - LC_MEASUREMENT = locale; - LC_MONETARY = locale; - LC_NAME = locale; - LC_NUMERIC = locale; - LC_PAPER = locale; - LC_TELEPHONE = locale; - LC_TIME = locale; - }; - }; -} diff --git a/modules/system/system/secrets-password-sync.nix b/modules/system/system/secrets-password-sync.nix deleted file mode 100644 index 54f48cf..0000000 --- a/modules/system/system/secrets-password-sync.nix +++ /dev/null @@ -1,46 +0,0 @@ -# Auto-sync user passwords from SOPS secrets files -# This module ensures passwords are updated on every activation, not just user creation -{ config, lib, pkgs, ... 
}: - -with lib; - -let - cfg = config.modules.system.secrets-password-sync; -in -{ - options.modules.system.secrets-password-sync = { - enable = mkEnableOption "automatic password synchronization from SOPS secrets"; - - users = mkOption { - type = types.attrsOf (types.submodule { - options = { - passwordSecretPath = mkOption { - type = types.str; - description = "Path to the SOPS secret containing the password hash"; - example = "config.sops.secrets.\"hostname/username_password_hash\".path"; - }; - }; - }); - default = {}; - description = "Users whose passwords should be auto-synced from secrets"; - example = literalExpression '' - { - rescue = { - passwordSecretPath = config.sops.secrets."nexus/rescue_password_hash".path; - }; - } - ''; - }; - }; - - config = mkIf cfg.enable { - system.activationScripts.syncPasswordsFromSecrets = stringAfter [ "users" ] '' - ${concatStringsSep "\n" (mapAttrsToList (username: userCfg: '' - if [ -f "${userCfg.passwordSecretPath}" ]; then - NEW_HASH=$(cat "${userCfg.passwordSecretPath}") - echo "${username}:$NEW_HASH" | ${pkgs.shadow}/bin/chpasswd --encrypted - fi - '') cfg.users)} - ''; - }; -} diff --git a/monitors.json b/monitors.json deleted file mode 100644 index 1739f26..0000000 --- a/monitors.json +++ /dev/null @@ -1,5 +0,0 @@ -{ - "0x0BCA": "2256x1504@60, 0x0, 1", - "Sceptre M24": "1920x1080@165, 3440x1267, 1, transform, 1", - "ED343CUR V": "preferred, 0x1504, 1" -} diff --git a/network-config.nix b/network-config.nix deleted file mode 100644 index 482dcba..0000000 --- a/network-config.nix +++ /dev/null @@ -1,142 +0,0 @@ -# Network Configuration -# Single source of truth for all network hosts and infrastructure -# -# This file centralizes network configuration for: -# - IP addresses and hostnames -# - MAC addresses for Wake-on-LAN -# - SSH configuration -# - Deployment settings -# -# Benefits: -# - DRY: Define once, use everywhere -# - Type-safe: Nix ensures consistency -# - Easy updates: Change in one place -# - Scriptable: 
Generate scripts from this config -{ - # Network-wide settings - network = { - domain = "home"; - subnet = "192.168.1.0/24"; - gateway = "192.168.1.1"; - dns = [ "192.168.1.1" "1.1.1.1" ]; - }; - - # Host configurations - hosts = { - # Orion - Primary laptop/workstation - orion = { - hostname = "orion"; - fqdn = "orion.home"; - ip = "192.168.1.100"; # Update with actual IP if static - - # Network interfaces - interfaces = { - wifi = { - name = "wlp1s0"; # May vary based on hardware - mac = null; # WiFi MAC not typically needed - }; - ethernet = { - name = "enp0s0"; # May vary - check with `ip link` - mac = null; # Update if you want WoL for orion - }; - }; - - # SSH configuration - ssh = { - user = "syg"; - port = 22; - keyPath = "~/.ssh/id_ed25519"; - publicKey = "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos"; - }; - - # Wake-on-LAN configuration - wol = { - enabled = false; # Typically not needed for primary workstation - interface = null; - mac = null; - }; - - # Deployment configuration - deploy = { - enabled = false; # Orion is the control machine, not a deploy target - remoteBuild = false; - }; - }; - - # Cortex - AI/ML server - cortex = { - hostname = "cortex"; - fqdn = "cortex.home"; - ip = "192.168.1.7"; - - # Network interfaces - interfaces = { - ethernet = { - name = "enp3s0"; - mac = "9c:6b:00:35:51:55"; # Required for Wake-on-LAN - }; - }; - - # SSH configuration - ssh = { - user = "jarvis"; - port = 22; - keyPath = "~/.ssh/id_ed25519"; - # Authorized keys that can access this host - authorizedKeys = [ - "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos" - ]; - }; - - # Wake-on-LAN configuration - wol = { - enabled = true; - interface = "enp3s0"; - mac = "9c:6b:00:35:51:55"; - }; - - # Deployment configuration - deploy = { - enabled = true; - remoteBuild = true; # Build on remote to avoid signature issues - method = "deploy-rs"; - }; - }; - - # Template for adding 
new hosts - # Copy this block and customize for each new machine - # - # hostname-here = { - # hostname = "hostname"; - # fqdn = "hostname.home"; - # ip = "192.168.1.x"; - # - # interfaces = { - # ethernet = { - # name = "enp0s0"; - # mac = "xx:xx:xx:xx:xx:xx"; - # }; - # }; - # - # ssh = { - # user = "username"; - # port = 22; - # keyPath = "~/.ssh/id_ed25519"; - # publicKey = "ssh-ed25519 ..."; - # authorizedKeys = [ ]; - # }; - # - # wol = { - # enabled = false; - # interface = null; - # mac = null; - # }; - # - # deploy = { - # enabled = false; - # remoteBuild = false; - # method = "deploy-rs"; - # }; - # }; - }; -} diff --git a/claude-god-mode.txt b/prompts/claude-god-mode.txt similarity index 100% rename from claude-god-mode.txt rename to prompts/claude-god-mode.txt diff --git a/scripts/deployment/archive/check-system.sh.archived b/scripts/deployment/archive/check-system.sh.archived deleted file mode 100755 index be3af55..0000000 --- a/scripts/deployment/archive/check-system.sh.archived +++ /dev/null @@ -1,236 +0,0 @@ -#!/usr/bin/env bash -# NixOS System Health Check -# Performs comprehensive SSH connectivity and service health checks -# Can be used standalone or integrated into deployment workflows - -set -euo pipefail - -# Colors -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -NC='\033[0m' - -info() { echo -e "${BLUE}β„Ή $*${NC}"; } -success() { echo -e "${GREEN}βœ“ $*${NC}"; } -warn() { echo -e "${YELLOW}⚠ $*${NC}"; } -error() { echo -e "${RED}βœ— $*${NC}" >&2; } - -usage() { - cat < [ssh-user] [flake-name] - -Arguments: - hostname - IP address or hostname to check - ssh-user - SSH user (default: root) - flake-name - Expected hostname from flake (default: hostname) - -Examples: - $0 192.168.1.7 - $0 cortex.home jarvis cortex - $0 192.168.1.22 deploy nexus - -Checks Performed: - 1. SSH key availability - 2. Network connectivity (ping) - 3. SSH port accessibility - 4. SSH authentication - 5. System information - 6. 
Hostname/user verification - 7. Security services (fail2ban, auditd, sshd) - 8. Configuration file validation - 9. Security binary availability - 10. NixOS generation info - 11. Systemd service links - 12. Recommendations - -Exit Codes: - 0 - All checks passed - 1 - One or more checks failed -EOF -} - -check_system() { - local hostname="${1:-}" - local ssh_user="${2:-root}" - local expected_hostname="${3:-$hostname}" - local timeout=10 - local failed=0 - - if [ -z "$hostname" ]; then - error "Hostname is required" - usage - exit 1 - fi - - info "Checking $hostname (expecting: $expected_hostname) as $ssh_user..." - echo "" - - # Step 1: SSH key check - info "Step 1: Checking SSH keys..." - if ssh-add -l &>/dev/null; then - LOADED_KEYS=$(ssh-add -l | wc -l) - success "SSH agent has $LOADED_KEYS key(s) loaded" - else - warn "SSH agent not running or no keys loaded" - info "Try: ssh-add ~/.ssh/id_ed25519" - fi - echo "" - - # Step 2: Ping - info "Step 2: Testing network connectivity to $hostname..." - if ping -c 1 -W "$timeout" "$hostname" &>/dev/null; then - success "Host $hostname is reachable" - else - error "Cannot reach $hostname" - failed=1 - fi - echo "" - - # Step 3: SSH port - info "Step 3: Checking if SSH port is open..." - if timeout 5 bash -c "cat < /dev/null > /dev/tcp/$hostname/22" 2>/dev/null; then - success "SSH port (22) is open on $hostname" - else - error "SSH port (22) is not accessible" - failed=1 - fi - echo "" - - # Step 4: SSH connection - info "Step 4: Testing SSH connection..." - if ssh -o ConnectTimeout="$timeout" -o BatchMode=yes "$ssh_user@$hostname" "echo 'SSH connection successful'" &>/dev/null; then - success "SSH connection successful!" - else - warn "SSH connection failed" - info "Trying verbose SSH connection for debugging..." 
- ssh -v -o ConnectTimeout="$timeout" "$ssh_user@$hostname" "exit" 2>&1 | tail -20 - failed=1 - fi - echo "" - - # If SSH failed, no point continuing - if [ $failed -eq 1 ]; then - error "Cannot establish SSH connection. Stopping checks." - exit 1 - fi - - # Step 5: System info - info "Step 5: Gathering system info..." - ssh "$ssh_user@$hostname" "uname -a && uptime" - echo "" - - # Step 6: Hostname/user verification - info "Step 6: Verifying system identity..." - ACTUAL_HOSTNAME=$(ssh -o ConnectTimeout="$timeout" "$ssh_user@$hostname" "hostname" 2>/dev/null || echo "unknown") - if [ "$ACTUAL_HOSTNAME" = "$expected_hostname" ]; then - success "Hostname confirmed: $expected_hostname" - else - warn "Hostname is '$ACTUAL_HOSTNAME' (expected '$expected_hostname')" - fi - - USER_CHECK=$(ssh -o ConnectTimeout="$timeout" "$ssh_user@$hostname" "whoami" 2>/dev/null || echo "unknown") - if [ "$USER_CHECK" = "$ssh_user" ]; then - success "User confirmed: $ssh_user" - else - warn "User is '$USER_CHECK' (expected '$ssh_user')" - fi - echo "" - - # Step 7: Security service checks - info "Step 7: Checking security services..." - for service in fail2ban auditd sshd; do - if ssh "$ssh_user@$hostname" "systemctl is-active $service" &>/dev/null; then - success "$service is running" - else - warn "$service is not running or not accessible" - fi - done - echo "" - - # Step 8: Config file validation - info "Step 8: Checking config files..." 
- if ssh "$ssh_user@$hostname" "test -f /etc/fail2ban/jail.local" 2>/dev/null; then - success "fail2ban jail.local exists" - if ssh "$ssh_user@$hostname" "fail2ban-client -t" &>/dev/null; then - success "fail2ban configuration is valid" - else - warn "fail2ban configuration has errors (or needs sudo)" - fi - else - warn "fail2ban jail.local not found (or needs sudo to check)" - fi - - if ssh "$ssh_user@$hostname" "test -f /etc/audit/auditd.conf" 2>/dev/null; then - success "auditd.conf exists" - else - warn "auditd.conf not found (or needs sudo to check)" - fi - echo "" - - # Step 9: Binary/package presence - info "Step 9: Checking security binaries..." - for binary in fail2ban-server fail2ban-client auditd auditctl; do - if ssh "$ssh_user@$hostname" "which $binary" &>/dev/null; then - success "$binary is available" - else - warn "$binary is not available" - fi - done - echo "" - - # Step 10: NixOS generation check - info "Step 10: Checking NixOS configuration generation..." - CURRENT_GEN=$(ssh "$ssh_user@$hostname" "readlink /nix/var/nix/profiles/system" 2>/dev/null || echo "unknown") - if [ "$CURRENT_GEN" != "unknown" ]; then - success "Current generation: $CURRENT_GEN" - else - warn "Could not determine NixOS generation" - fi - echo "" - - # Step 11: Systemd service configuration check - info "Step 11: Checking systemd service links..." - for service in fail2ban auditd sshd; do - if ssh "$ssh_user@$hostname" "test -L /etc/systemd/system/multi-user.target.wants/$service.service" 2>/dev/null; then - success "$service.service is linked in systemd target" - else - warn "$service.service not found in systemd multi-user.target.wants" - fi - done - echo "" - - # Step 12: Recommendations - info "Step 12: Summary and Recommendations" - if ssh "$ssh_user@$hostname" "systemctl is-active fail2ban auditd sshd" &>/dev/null; then - success "All security services appear to be running correctly!" - else - warn "Some services may not be running. Recommended actions:" - echo " 1. 
Check service status: ssh $ssh_user@$hostname 'systemctl status fail2ban auditd sshd'" - echo " 2. Check system logs: ssh $ssh_user@$hostname 'sudo journalctl -b'" - echo " 3. Redeploy configuration if services are expected to be running" - fi - - if [ $failed -eq 0 ]; then - echo "" - success "Health check completed successfully!" - return 0 - else - echo "" - error "Health check completed with failures" - return 1 - fi -} - -# Main execution -if [ "${BASH_SOURCE[0]}" = "${0}" ]; then - if [ $# -eq 0 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then - usage - exit 0 - fi - - check_system "$@" -fi diff --git a/scripts/deployment/archive/fleet.sh.archived b/scripts/deployment/archive/fleet.sh.archived deleted file mode 100755 index a88ab10..0000000 --- a/scripts/deployment/archive/fleet.sh.archived +++ /dev/null @@ -1,623 +0,0 @@ -#!/usr/bin/env bash -# Unified NixOS Fleet Management Script -# Auto-discovers systems from flake.nix, supports deploy/update/build/check/list -# shellcheck disable=SC2317 # Don't warn about unreachable commands (error() exits) - -set -euo pipefail - -# Colors -RED='\033[0;31m' -GREEN='\033[0;32m' -YELLOW='\033[1;33m' -BLUE='\033[0;34m' -NC='\033[0m' - -info() { echo -e "${BLUE}β„Ή $*${NC}"; } -success() { echo -e "${GREEN}βœ“ $*${NC}"; } -warn() { echo -e "${YELLOW}⚠ $*${NC}"; } -error() { echo -e "${RED}βœ— $*${NC}"; log_operation "ERROR" "$*"; exit 1; } - -# === Logging === - -LOG_FILE="${HOME}/.nixos-fleet.log" - -log_operation() { - local level="$1" - shift - local message="$*" - local timestamp - timestamp=$(date '+%Y-%m-%d %H:%M:%S') - echo "[$timestamp] [$level] $message" >> "$LOG_FILE" -} - -FLAKE_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." 
&& pwd)" -SECRETS_MANAGER="$FLAKE_DIR/scripts/secrets-manager.sh" - -# Auto-discover systems from flake.nix - -get_systems() { - nix eval "$FLAKE_DIR#nixosConfigurations" --apply 'builtins.attrNames' --json 2>/dev/null | jq -r '.[]' -} - -# Load host variables from Nix config - -get_host_vars() { - local system="$1" - # Try to load from Nix config first, fallback to reading variables.nix directly - if nix eval --json "$FLAKE_DIR#nixosConfigurations.$system.config.variables" 2>/dev/null; then - return 0 - fi - - # Fallback: read variables.nix directly and convert to JSON - local vars_file="$FLAKE_DIR/systems/$system/variables.nix" - if [[ -f "$vars_file" ]]; then - nix-instantiate --eval --strict --json -E "import $vars_file" 2>/dev/null || echo "{}" - else - echo "{}" - fi -} - -usage() { - echo -e "${GREEN}Fleet Management Tool${NC}" - echo "Usage: $0 [options]" - echo "Commands:" - echo " list List all systems" - echo " build Build system config locally" - echo " check Check connection and health" - echo " deploy Initial deployment (nixos-anywhere)" - echo " update Update system (colmena)" - echo " update-all Update all systems in parallel" - echo " update-tag @ Update systems with specific tag" - echo "" - echo " ISO Management:" - echo " iso build Build custom live ISO with SSH key" - echo " iso flash Flash ISO to USB device (e.g., /dev/sda)" - echo " iso list-devices List available USB devices" - echo " iso path Show path to built ISO" - echo "" - echo " Secrets Management:" - echo " secrets Manage secrets (see: $0 secrets help)" - echo " secrets edit Edit secrets file" - echo " secrets view View secrets (read-only)" - echo " secrets cat Print decrypted secrets" - echo " secrets validate Test encryption/decryption" - echo " secrets add-host Add password for system" - echo " secrets rotate Rotate password for system" - echo " secrets status Show secrets file status" - echo " secrets config Show configuration" - echo " secrets recipients List who can decrypt" - 
echo "" - echo "Options:" - echo " --validate-secrets Validate secrets before deployment" - echo "" - echo "Tags:" - echo " @laptop - Laptop systems (orion)" - echo " @server - Server systems (cortex, nexus)" - echo " @homelab - Homelab servers (cortex, nexus)" - echo " @desktop - Desktop/laptop systems (orion, axon)" - echo " @htpc - HTPC systems (axon)" -} - -list_systems() { - info "Available systems:" - get_systems | awk '{print " - "$1}' -} - -build_system() { - local system="${1:-}" - if [ -z "$system" ]; then - error "Usage: $0 build " - fi - info "Building $system configuration..." - nix build "$FLAKE_DIR#nixosConfigurations.$system.config.system.build.toplevel" --show-trace && success "Build successful for $system" -} - -check_system() { - local system="${1:-}" - local timeout=10 - - if [ -z "$system" ]; then - error "Usage: $0 check " - fi - - # Load host variables - local VARS - VARS=$(get_host_vars "$system") - local hostname - local ssh_user - hostname=$(echo "$VARS" | jq -r '.network.ip // .network.hostname // ""') - ssh_user=$(echo "$VARS" | jq -r '.network.ssh.user // .user.username // "root"') - - if [ -z "$hostname" ]; then - error "Could not determine hostname or IP for $system from config" - fi - - info "Checking $system ($hostname) as $ssh_user..." - - # Step 1: SSH key check - info "Step 1: Checking SSH keys..." - if ssh-add -l &>/dev/null; then - LOADED_KEYS=$(ssh-add -l | wc -l) - success "SSH agent has $LOADED_KEYS key(s) loaded" - else - warn "SSH agent not running or no keys loaded" - info "Try: ssh-add ~/.ssh/id_ed25519" - fi - - # Step 2: Ping - info "Step 2: Testing network connectivity to $hostname..." - if ping -c 1 -W "$timeout" "$hostname" &>/dev/null; then - success "Host $hostname is reachable" - else - error "Cannot reach $hostname" - return 1 - fi - - # Step 3: SSH port - info "Step 3: Checking if SSH port is open..." 
- if timeout 5 bash -c "cat < /dev/null > /dev/tcp/$hostname/22" 2>/dev/null; then - success "SSH port (22) is open on $hostname" - else - error "SSH port (22) is not accessible" - return 1 - fi - - # Step 4: SSH connection - info "Step 4: Testing SSH connection..." - if ssh -o ConnectTimeout="$timeout" -o BatchMode=yes "$ssh_user@$hostname" "echo 'SSH connection successful'" &>/dev/null; then - success "SSH connection successful!" - else - warn "SSH connection failed" - info "Trying verbose SSH connection for debugging..." - ssh -v -o ConnectTimeout="$timeout" "$ssh_user@$hostname" "exit" 2>&1 | tail -20 - return 1 - fi - - # Step 5: System info - info "Step 5: Gathering system info..." - ssh "$ssh_user@$hostname" "uname -a && uptime" - - # Step 6: Hostname/user verification - info "Step 6: Verifying system identity..." - HOSTNAME=$(ssh -o ConnectTimeout="$timeout" "$ssh_user@$hostname" "hostname" 2>/dev/null || echo "unknown") - if [ "$HOSTNAME" = "$system" ]; then - success "Hostname confirmed: $system" - else - warn "Hostname is '$HOSTNAME' (expected '$system')" - fi - USER_CHECK=$(ssh -o ConnectTimeout="$timeout" "$ssh_user@$hostname" "whoami" 2>/dev/null || echo "unknown") - if [ "$USER_CHECK" = "$ssh_user" ]; then - success "User confirmed: $ssh_user" - else - warn "User is '$USER_CHECK' (expected '$ssh_user')" - fi - - # Step 7: Security service checks - info "Step 7: Checking security services..." - # shellcheck disable=SC2029 # Variable expansion is intentional - for service in fail2ban auditd sshd; do - if ssh "$ssh_user@$hostname" "systemctl is-active $service" &>/dev/null; then - success "$service is running" - else - warn "$service is not running or not accessible" - fi - done - - # Step 8: Config file validation - info "Step 8: Checking config files..." 
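`check_system` resolves the target address and SSH user from the host-variables JSON using jq's `//` alternative operator, which falls through a chain of keys until one is non-null. The behavior can be exercised with canned JSON (sample data, not a real host config):

```shell
# Same jq expressions the script uses, applied to sample host JSON.
get_addr() { jq -r '.network.ip // .network.hostname // ""' <<<"$1"; }
get_user() { jq -r '.network.ssh.user // .user.username // "root"' <<<"$1"; }

full='{"network":{"ip":"192.168.1.7","ssh":{"user":"syg"}},"user":{"username":"syg"}}'
sparse='{"user":{"username":"axon"}}'

get_addr "$full"     # .network.ip wins when present
get_user "$full"
get_addr "$sparse"   # no network block at all -> empty string
get_user "$sparse"   # falls back to .user.username
```

Because jq treats field access on `null` as `null` rather than an error, `.network.ssh.user` is safe even when the whole `network` block is missing, which is what makes the empty-string fallback (and the script's "Could not determine hostname" error path) reliable.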
- if ssh "$ssh_user@$hostname" "test -f /etc/fail2ban/jail.local" 2>/dev/null; then - success "fail2ban jail.local exists" - if ssh "$ssh_user@$hostname" "fail2ban-client -t" &>/dev/null; then - success "fail2ban configuration is valid" - else - warn "fail2ban configuration has errors (or needs sudo)" - fi - else - warn "fail2ban jail.local not found (or needs sudo to check)" - fi - if ssh "$ssh_user@$hostname" "test -f /etc/audit/auditd.conf" 2>/dev/null; then - success "auditd.conf exists" - else - warn "auditd.conf not found (or needs sudo to check)" - fi - - # Step 9: Binary/package presence - info "Step 9: Checking security binaries..." - # shellcheck disable=SC2029 # Variable expansion is intentional - for binary in fail2ban-server fail2ban-client auditd auditctl; do - if ssh "$ssh_user@$hostname" "which $binary" &>/dev/null; then - success "$binary is available" - else - warn "$binary is not available" - fi - done - - # Step 10: NixOS generation check - info "Step 10: Checking NixOS configuration generation..." - CURRENT_GEN=$(ssh "$ssh_user@$hostname" "readlink /nix/var/nix/profiles/system" 2>/dev/null || echo "unknown") - if [ "$CURRENT_GEN" != "unknown" ]; then - success "Current generation: $CURRENT_GEN" - else - warn "Could not determine NixOS generation" - fi - - # Step 11: Systemd service configuration check - info "Step 11: Checking systemd service links..." - # shellcheck disable=SC2029 # Variable expansion is intentional - for service in fail2ban auditd sshd; do - if ssh "$ssh_user@$hostname" "test -L /etc/systemd/system/multi-user.target.wants/$service.service" 2>/dev/null; then - success "$service.service is linked in systemd target" - else - warn "$service.service not found in systemd multi-user.target.wants" - fi - done - - # Step 12: Recommendations - info "Step 12: Recommendations" - if ssh "$ssh_user@$hostname" "systemctl is-active fail2ban auditd sshd" &>/dev/null; then - success "All security services appear to be running correctly!" 
- else - warn "Some services may not be running. Recommended actions:" - echo "1. Redeploy the configuration: ./fleet.sh update $system" - echo "2. Check NixOS configuration syntax: nix build .#nixosConfigurations.$system.config.system.build.toplevel" - echo "3. Manually rebuild on the target: ssh -t $ssh_user@$hostname 'sudo nixos-rebuild switch --flake /etc/nixos#$system'" - echo "4. Check the full system logs: ssh -t $ssh_user@$hostname 'sudo journalctl -b'" - fi -} - -# Secrets Management Integration - -ISO_FLAKE_DIR="$FLAKE_DIR/systems/custom-live-iso" -ISO_RESULT_PATH="$ISO_FLAKE_DIR/result/iso/nixos-homelab-installer.iso" - -iso_command() { - local subcmd="${1:-}" - shift || true - - case "$subcmd" in - build) - iso_build - ;; - flash) - iso_flash "${1:-}" - ;; - list-devices) - iso_list_devices - ;; - path) - iso_path - ;; - *) - error "Unknown iso subcommand: $subcmd" - echo "Available subcommands: build, flash, list-devices, path" - exit 1 - ;; - esac -} - -iso_build() { - info "Building custom live ISO with SSH key..." - log_operation "INFO" "Building live ISO" - - if [ ! -d "$ISO_FLAKE_DIR" ]; then - error "ISO flake directory not found: $ISO_FLAKE_DIR" - fi - - cd "$ISO_FLAKE_DIR" - - info "This will take a few minutes (downloading packages + building)..." - if nix build ".#nixosConfigurations.installer.config.system.build.isoImage"; then - success "ISO built successfully!" - if [ -f "$ISO_RESULT_PATH" ]; then - local iso_size - iso_size=$(du -h "$ISO_RESULT_PATH" | cut -f1) - success "ISO location: $ISO_RESULT_PATH" - success "ISO size: $iso_size" - info "Next steps:" - echo " 1. List USB devices: $0 iso list-devices" - echo " 2. 
Flash to USB: $0 iso flash /dev/sdX" - else - warn "ISO built but not found at expected location" - fi - else - error "ISO build failed" - fi - - cd "$FLAKE_DIR" -} - -iso_flash() { - local device="${1:-}" - - if [ -z "$device" ]; then - error "Usage: $0 iso flash " - info "Example: $0 iso flash /dev/sda" - info "Tip: Run '$0 iso list-devices' to see available devices" - exit 1 - fi - - if [ ! -f "$ISO_RESULT_PATH" ]; then - error "ISO not found at: $ISO_RESULT_PATH" - info "Build it first with: $0 iso build" - exit 1 - fi - - if [ ! -b "$device" ]; then - error "Device not found or not a block device: $device" - info "Run '$0 iso list-devices' to see available devices" - exit 1 - fi - - # Safety checks - if [[ "$device" == *"nvme"* ]] || [[ "$device" == *"mmcblk"* ]]; then - warn "WARNING: $device looks like an internal drive!" - warn "Are you SURE this is your USB device?" - fi - - local iso_size - iso_size=$(du -h "$ISO_RESULT_PATH" | cut -f1) - - warn "DESTRUCTIVE OPERATION - THIS WILL ERASE $device!" - info "ISO: $ISO_RESULT_PATH ($iso_size)" - info "Target: $device" - read -r -p "Type 'yes' to flash the ISO: " confirm - - if [[ "$confirm" != "yes" ]]; then - info "Cancelled" - exit 0 - fi - - info "Flashing ISO to $device..." - log_operation "INFO" "Flashing ISO to $device" - - if sudo dd if="$ISO_RESULT_PATH" of="$device" bs=4M status=progress conv=fsync; then - success "ISO flashed successfully to $device!" - info "Syncing filesystem..." - sync - success "Done! You can now safely remove the USB drive" - info "Next steps:" - echo " 1. Insert USB into target system" - echo " 2. Boot from USB" - echo " 3. 
Deploy: $0 deploy " - else - error "Failed to flash ISO" - fi -} - -iso_list_devices() { - info "Available block devices:" - echo "" - - # Show lsblk with useful columns - lsblk -o NAME,SIZE,TYPE,MOUNTPOINT,VENDOR,MODEL | grep -E "^NAME|disk|part" || true - - echo "" - info "USB devices (likely candidates):" - - # Try to identify USB devices - for dev in /dev/sd[a-z]; do - if [ -b "$dev" ]; then - # Check if it's USB via udev - if udevadm info --query=property --name="$dev" 2>/dev/null | grep -q "ID_BUS=usb"; then - local size - size=$(lsblk -ndo SIZE "$dev" 2>/dev/null || echo "unknown") - local model - model=$(lsblk -ndo MODEL "$dev" 2>/dev/null || echo "unknown") - echo " $dev - $size - $model (USB)" - fi - fi - done - - echo "" - warn "IMPORTANT: Double-check the device before flashing!" - info "Use: $0 iso flash /dev/sdX" -} - -iso_path() { - if [ -f "$ISO_RESULT_PATH" ]; then - echo "$ISO_RESULT_PATH" - local iso_size - iso_size=$(du -h "$ISO_RESULT_PATH" | cut -f1) - info "Size: $iso_size" - else - warn "ISO not found. Build it first with: $0 iso build" - exit 1 - fi -} - -secrets_command() { - if [ ! -x "$SECRETS_MANAGER" ]; then - error "Secrets manager not found at: $SECRETS_MANAGER" - fi - - # Pass all arguments directly to secrets-manager - "$SECRETS_MANAGER" "$@" -} - -validate_secrets() { - local system="${1:-}" - - if [ ! -x "$SECRETS_MANAGER" ]; then - warn "Secrets manager not available, skipping validation" - return 0 - fi - - info "Validating secrets before deployment..." - - # Check if secrets are valid - if ! "$SECRETS_MANAGER" validate &>/dev/null; then - error "Secrets validation failed. Run: $0 secrets validate" - fi - - # Check if system has secrets defined - if [ -n "$system" ]; then - # Check for system-specific secrets - if "$SECRETS_MANAGER" cat | grep -q "^$system:"; then - # Validate structure: must have maintenance_password_hash - if ! 
"$SECRETS_MANAGER" cat | grep -A 1 "^$system:" | grep -q "maintenance_password_hash:"; then - error "Invalid secrets structure for $system: missing maintenance_password_hash" - fi - - # Validate hash format (should be $6$ for SHA-512) - local hash_line - hash_line=$("$SECRETS_MANAGER" cat | grep -A 1 "^$system:" | grep "maintenance_password_hash:") - if [[ ! "$hash_line" =~ \$6\$ ]]; then - warn "Password hash for $system may not be SHA-512 format" - fi - - success "Secrets validated for $system" - else - warn "No secrets found for $system (may be intentional)" - fi - fi -} - -deploy_system() { - local system="${1:-}" - local validate_secrets_flag="${2:-}" - - if [ -z "$system" ]; then - error "Usage: $0 deploy [--validate-secrets]" - fi - - # Validate secrets if requested - if [ "$validate_secrets_flag" = "--validate-secrets" ]; then - validate_secrets "$system" - fi - - # Load host variables - local VARS - VARS=$(get_host_vars "$system") - local ip - ip=$(echo "$VARS" | jq -r '.network.ip // .network.hostname // ""') - - # IMPORTANT: nixos-anywhere ALWAYS uses root for initial deployment - # The configured user (jarvis, syg, etc) doesn't exist yet! - local user="root" - - if [ -z "$ip" ]; then - error "Could not determine IP/hostname for $system from config" - fi - - warn "DESTRUCTIVE OPERATION - THIS WILL WIPE THE DISK!" - read -r -p "Type 'yes' to continue: " confirm - [[ "$confirm" == "yes" ]] || { info "Cancelled"; exit 0; } - build_system "$system" - info "Deploying $system to $ip as $user (nixos-anywhere initial install)..." - nix run github:nix-community/nixos-anywhere -- --flake "$FLAKE_DIR#$system" "$user@$ip" && success "Deployment complete!" 
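Both `deploy_system` above and `iso_flash` gate their destructive step behind the same literal-`yes` prompt. The pattern, extracted into a reusable helper (the function name is ours, not the script's):

```shell
# Require the exact word "yes"; any other input, or EOF, aborts.
# `read -r` keeps backslashes literal; bash writes the -p prompt only
# when stdin is a terminal, so piped input works in tests and CI.
confirm_destructive() {
    local answer
    read -r -p "Type 'yes' to continue: " answer || return 1
    [[ "$answer" == "yes" ]]
}

if printf 'yes\n' | confirm_destructive; then
    echo "confirmed"
fi
```

Requiring a typed word instead of a `y/N` keystroke is a deliberate friction point: it makes it much harder to wipe a disk by leaning on Enter.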
-} - -update_system() { - local system="${1:-}" - local validate_secrets_flag="${2:-}" - - if [ -z "$system" ]; then - error "Usage: $0 update [--validate-secrets]" - fi - - # Validate secrets if requested - if [ "$validate_secrets_flag" = "--validate-secrets" ]; then - validate_secrets "$system" - fi - - info "Updating $system using colmena..." - log_operation "INFO" "Updating $system with colmena" - - cd "$FLAKE_DIR" - nix run .#colmena -- apply --on "$system" --impure && success "Update complete!" -} - -update_all_systems() { - info "Updating ALL systems in parallel using colmena..." - log_operation "INFO" "Updating all systems with colmena" - - cd "$FLAKE_DIR" - nix run .#colmena -- apply --impure && success "All systems updated!" -} - -update_by_tag() { - local tag="${1:-}" - - if [ -z "$tag" ]; then - error "Usage: $0 update-tag @" - fi - - # Strip @ if provided - tag="${tag#@}" - - info "Updating systems with tag @$tag using colmena..." - log_operation "INFO" "Updating systems with tag @$tag" - - cd "$FLAKE_DIR" - nix run .#colmena -- apply --on "@$tag" --impure && success "Systems with tag @$tag updated!" -} - -vm_test() { - local system="${1:-}" - if [ -z "$system" ]; then - error "Usage: $0 vm-test " - fi - info "Building and running VM test for $system..." 
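`update_by_tag` accepts both `@server` and `server` by stripping an optional leading `@` with `${tag#@}` before re-prefixing it for colmena. A sketch of that normalization:

```shell
# ${1#@} removes a single leading "@" if present (and is a no-op
# otherwise), so both spellings normalize to the form colmena expects.
normalize_tag() {
    local tag="${1#@}"
    echo "@$tag"
}

normalize_tag "@server"   # prints @server
normalize_tag "server"    # prints @server
```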
- nix-build "$FLAKE_DIR/tests/vm-fleet-test.nix" --argstr system "$system" - success "VM test completed for $system" -} - -main() { - if [ $# -eq 0 ]; then - usage - exit 1 - fi - - cmd="$1" - shift - - # Log the command - log_operation "INFO" "Command: $cmd $*" - - case "$cmd" in - list) - list_systems - ;; - build) - log_operation "INFO" "Building system: ${1:-}" - build_system "${1:-}" - ;; - check) - log_operation "INFO" "Checking system: ${1:-}" - check_system "${1:-}" - ;; - deploy) - log_operation "INFO" "Deploying system: ${1:-}" - deploy_system "${1:-}" "${2:-}" - ;; - update) - log_operation "INFO" "Updating system: ${1:-}" - update_system "${1:-}" "${2:-}" - ;; - update-all) - log_operation "INFO" "Updating all systems" - update_all_systems - ;; - update-tag) - log_operation "INFO" "Updating systems with tag: ${1:-}" - update_by_tag "${1:-}" - ;; - iso) - log_operation "INFO" "ISO command: $*" - iso_command "$@" - ;; - secrets) - secrets_command "$@" - ;; - vm-test) - log_operation "INFO" "VM test: ${1:-}" - vm_test "${1:-}" - ;; - *) - error "Unknown command: $cmd" - usage - ;; - esac -} - -main "$@" diff --git a/scripts/desktop/monitor-setup.sh b/scripts/desktop/monitor-setup.sh index 4ecc5d7..aecc766 100644 --- a/scripts/desktop/monitor-setup.sh +++ b/scripts/desktop/monitor-setup.sh @@ -4,7 +4,7 @@ set -euo pipefail # Configure monitors using Hyprland based on nixos/monitors.json mappings # Requires: hyprctl, jq -MONITORS_JSON="$(dirname "${BASH_SOURCE[0]}")/../../monitors.json" +MONITORS_JSON="$(dirname "${BASH_SOURCE[0]}")/../../systems/orion/monitors.json" # Exit cleanly if we're not in a graphical session (Wayland/X11) or # when this script runs in a build/container environment (e.g. 
during diff --git a/scripts/kanboard/README.md b/scripts/kanboard/README.md deleted file mode 100644 index 0cd280e..0000000 --- a/scripts/kanboard/README.md +++ /dev/null @@ -1,69 +0,0 @@ -# Kanboard API Scripts - -Scripts for interacting with Kanboard project management system via API. - -## Scripts - -### kanboard-api.sh -Bash wrapper for Kanboard API calls. - -**Usage:** -```bash -./scripts/kanboard/kanboard-api.sh [params] -``` - -**Example:** -```bash -./scripts/kanboard/kanboard-api.sh getMyProjects -``` - -### kanboard-api.mjs / kanboard-api.ts -Modern JavaScript/TypeScript implementations of Kanboard API client. - -**Usage:** -```bash -# Using .mjs (JavaScript) -node ./scripts/kanboard/kanboard-api.mjs [params] - -# Using .ts (TypeScript) -deno run ./scripts/kanboard/kanboard-api.ts [params] -``` - -### kb-create-project.mjs -Creates a new project in Kanboard. - -**Usage:** -```bash -node ./scripts/kanboard/kb-create-project.mjs [description] -``` - -**Example:** -```bash -node ./scripts/kanboard/kb-create-project.mjs "New Website" "Company website redesign" -``` - -### kb-create-task.mjs -Creates a new task in a Kanboard project. - -**Usage:** -```bash -node ./scripts/kanboard/kb-create-task.mjs [description] -``` - -**Example:** -```bash -node ./scripts/kanboard/kb-create-task.mjs 1 "Setup database" "Configure PostgreSQL" -``` - -## Configuration - -API credentials should be configured via environment variables: -- `KANBOARD_URL` - Your Kanboard instance URL -- `KANBOARD_API_KEY` - Your API key/token - -Or in a configuration file (implementation-specific). 
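The bash client below loads `.env` by sourcing it under `set -a`, while the JS/TS clients re-implement the same parsing by hand (skip comments, split on the first `=`, strip surrounding quotes). A minimal shell sketch of that manual approach, useful when you'd rather not execute the file as shell code; the file contents and key names here are illustrative:

```shell
# Read a KEY=VALUE pair from a .env-style file without sourcing it:
# skip blank/comment lines, split on the first '=', strip one layer
# of surrounding double or single quotes.
env_get() {
    local file="$1" key="$2" line value
    while IFS= read -r line; do
        [[ -z "$line" || "$line" == \#* ]] && continue
        [[ "$line" == "$key="* ]] || continue
        value="${line#*=}"
        value="${value%\"}"; value="${value#\"}"
        value="${value%\'}"; value="${value#\'}"
        printf '%s\n' "$value"
        return 0
    done < "$file"
    return 1
}

# Demo with a throwaway file (not a real credentials file).
tmp=$(mktemp)
printf '# comment\nKANBOARD_URL="http://localhost/jsonrpc.php"\n' > "$tmp"
env_get "$tmp" KANBOARD_URL
rm -f "$tmp"
```

This sketch assumes keys start at column one and values span a single line, matching what the JS `loadEnv` implementations handle.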
- -## API Documentation - -For available API methods and parameters, see: -- [Kanboard API Documentation](https://docs.kanboard.org/en/latest/api/) diff --git a/scripts/kanboard/kanboard-api.mjs b/scripts/kanboard/kanboard-api.mjs deleted file mode 100755 index 215adb9..0000000 --- a/scripts/kanboard/kanboard-api.mjs +++ /dev/null @@ -1,105 +0,0 @@ -#!/usr/bin/env node -/** - * Kanboard API helper script - * Usage: kanboard-api.mjs [params-as-json] - */ - -import { readFileSync } from 'fs'; -import { fileURLToPath } from 'url'; -import { dirname, join } from 'path'; - -const __filename = fileURLToPath(import.meta.url); -const __dirname = dirname(__filename); - -// Load environment variables from .env -function loadEnv() { - const envPath = join(__dirname, '..', '.env'); - try { - const envContent = readFileSync(envPath, 'utf8'); - const env = {}; - envContent.split('\n').forEach(line => { - line = line.trim(); - if (line && !line.startsWith('#')) { - const [key, ...valueParts] = line.split('='); - const value = valueParts.join('=').replace(/^["']|["']$/g, ''); - env[key] = value; - } - }); - return env; - } catch (err) { - return {}; - } -} - -const env = loadEnv(); - -// Configuration -const KANBOARD_URL = env.KANBOARD_URL || process.env.KANBOARD_URL || 'http://localhost/jsonrpc.php'; -const KANBOARD_USER = env.KANBOARD_USER || process.env.KANBOARD_USER || 'jsonrpc'; -const KANBOARD_TOKEN = env.KANBOARD_TOKEN || process.env.KANBOARD_TOKEN; - -// Parse command line arguments -const args = process.argv.slice(2); -const method = args[0]; -const paramsArg = args[1]; - -if (!method) { - console.error('Usage: kanboard-api.mjs [params-as-json]'); - console.error(''); - console.error('Examples:'); - console.error(' kanboard-api.mjs getAllProjects'); - console.error(' kanboard-api.mjs createProject \'{"name":"My Project"}\''); - console.error(' kanboard-api.mjs createTask \'{"project_id":1,"title":"My Task"}\''); - console.error(' kanboard-api.mjs getAllTasks 
\'{"project_id":1,"status_id":1}\''); - process.exit(1); -} - -if (!KANBOARD_TOKEN) { - console.error('Error: KANBOARD_TOKEN not set'); - console.error('Get your API token from: http://localhost/settings/api'); - console.error('Then add it to .env file or: export KANBOARD_TOKEN="your-token-here"'); - process.exit(1); -} - -// Parse params -let params = {}; -if (paramsArg) { - try { - params = JSON.parse(paramsArg); - } catch (err) { - console.error('Error: Invalid JSON in params:', err.message); - process.exit(1); - } -} - -// Build JSON-RPC request -const request = { - jsonrpc: '2.0', - method: method, - id: Date.now(), - params: params -}; - -// Make API call -const auth = Buffer.from(`${KANBOARD_USER}:${KANBOARD_TOKEN}`).toString('base64'); - -try { - const response = await fetch(KANBOARD_URL, { - method: 'POST', - headers: { - 'Content-Type': 'application/json', - 'Authorization': `Basic ${auth}` - }, - body: JSON.stringify(request) - }); - - const data = await response.json(); - console.log(JSON.stringify(data, null, 2)); - - if (data.error) { - process.exit(1); - } -} catch (err) { - console.error('Error:', err.message); - process.exit(1); -} diff --git a/scripts/kanboard/kanboard-api.sh b/scripts/kanboard/kanboard-api.sh deleted file mode 100755 index 8d017bc..0000000 --- a/scripts/kanboard/kanboard-api.sh +++ /dev/null @@ -1,59 +0,0 @@ -#!/usr/bin/env bash -# Kanboard API helper script -# Usage: kanboard-api.sh [params] - -set -euo pipefail - -# Load environment variables from .env if it exists -SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" -ENV_FILE="${SCRIPT_DIR}/../.env" -if [[ -f "$ENV_FILE" ]]; then - # Export variables from .env file - set -a - # shellcheck source=../.env - source "$ENV_FILE" - set +a -fi - -# Configuration -KANBOARD_URL="${KANBOARD_URL:-http://localhost/jsonrpc.php}" -KANBOARD_USER="${KANBOARD_USER:-jsonrpc}" -KANBOARD_TOKEN="${KANBOARD_TOKEN:-}" - -# Check if API token is set -if [[ -z "$KANBOARD_TOKEN" ]]; then - echo 
"Error: KANBOARD_TOKEN environment variable not set" - echo "Get your API token from: http://localhost/settings/api" - echo "Then: export KANBOARD_TOKEN='your-token-here'" - exit 1 -fi - -# Generate random ID for JSON-RPC -REQUEST_ID=$(date +%s) - -# Parse arguments -METHOD="${1:-}" -if [[ -z "$METHOD" ]]; then - echo "Usage: $0 [params-as-json]" - echo "" - echo "Examples:" - echo " $0 getAllProjects" - echo " $0 createProject '{\"name\":\"My Project\"}'" - echo " $0 createTask '{\"project_id\":1,\"title\":\"My Task\"}'" - echo " $0 getAllTasks '{\"project_id\":1,\"status_id\":1}'" - echo "" - echo "Set KANBOARD_TOKEN environment variable first!" - exit 1 -fi - -# Parse params (optional) -PARAMS="${2:-{}}" - -# Build the JSON request (compact, no newlines) -JSON_REQUEST='{"jsonrpc":"2.0","method":"'${METHOD}'","id":'${REQUEST_ID}',"params":'${PARAMS}'}' - -# Make API call -curl -s -u "${KANBOARD_USER}:${KANBOARD_TOKEN}" \ - -H "Content-Type: application/json" \ - -d "${JSON_REQUEST}" \ - "${KANBOARD_URL}" | jq '.' 
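The deleted shell client above concatenates the JSON-RPC envelope by hand (`JSON_REQUEST='{"jsonrpc":"2.0",...'`), which breaks if the method name or params ever contain characters needing escaping. A safer sketch builds the same envelope with `jq`, which the script already depends on; the sample method and params are illustrative:

```shell
# Build a JSON-RPC 2.0 envelope with jq so all quoting is handled for us.
# --argjson also validates up front that params is well-formed JSON.
build_rpc() {
    local method="$1"
    local params=${2:-'{}'}
    jq -cn --arg method "$method" --argjson params "$params" \
        '{jsonrpc: "2.0", method: $method, id: (now | floor), params: $params}'
}

build_rpc getAllProjects
build_rpc createTask '{"project_id":1,"title":"My Task"}'

# The envelope would then be posted with Basic auth, as the script does:
#   curl -s -u "${KANBOARD_USER}:${KANBOARD_TOKEN}" \
#        -H "Content-Type: application/json" \
#        -d "$(build_rpc getAllProjects)" "${KANBOARD_URL}" | jq '.'
```

Using `jq -n` here also mirrors what the Node and Deno clients get for free from `JSON.stringify`.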
diff --git a/scripts/kanboard/kanboard-api.ts b/scripts/kanboard/kanboard-api.ts deleted file mode 100644 index 636a71a..0000000 --- a/scripts/kanboard/kanboard-api.ts +++ /dev/null @@ -1,98 +0,0 @@ -#!/usr/bin/env -S deno run --allow-net --allow-read -/** - * Kanboard API helper script (Deno version) - * Usage: kanboard-api.ts [params-as-json] - */ - -// Load environment variables from .env -function loadEnv(): Record { - const envPath = new URL('../.env', import.meta.url).pathname; - try { - const envContent = Deno.readTextFileSync(envPath); - const env: Record = {}; - envContent.split('\n').forEach(line => { - line = line.trim(); - if (line && !line.startsWith('#')) { - const [key, ...valueParts] = line.split('='); - const value = valueParts.join('=').replace(/^["']|["']$/g, ''); - env[key] = value; - } - }); - return env; - } catch { - return {}; - } -} - -const env = loadEnv(); - -// Configuration -const KANBOARD_URL = env.KANBOARD_URL || Deno.env.get('KANBOARD_URL') || 'http://localhost/jsonrpc.php'; -const KANBOARD_USER = env.KANBOARD_USER || Deno.env.get('KANBOARD_USER') || 'jsonrpc'; -const KANBOARD_TOKEN = env.KANBOARD_TOKEN || Deno.env.get('KANBOARD_TOKEN'); - -// Parse command line arguments -const args = Deno.args; -const method = args[0]; -const paramsArg = args[1]; - -if (!method) { - console.error('Usage: kanboard-api.ts [params-as-json]'); - console.error(''); - console.error('Examples:'); - console.error(' kanboard-api.ts getAllProjects'); - console.error(' kanboard-api.ts createProject \'{"name":"My Project"}\''); - console.error(' kanboard-api.ts createTask \'{"project_id":1,"title":"My Task"}\''); - console.error(' kanboard-api.ts getAllTasks \'{"project_id":1,"status_id":1}\''); - Deno.exit(1); -} - -if (!KANBOARD_TOKEN) { - console.error('Error: KANBOARD_TOKEN not set'); - console.error('Get your API token from: http://localhost/settings/api'); - console.error('Then add it to .env file or: export KANBOARD_TOKEN="your-token-here"'); - 
Deno.exit(1); -} - -// Parse params -let params = {}; -if (paramsArg) { - try { - params = JSON.parse(paramsArg); - } catch (err) { - console.error('Error: Invalid JSON in params:', (err as Error).message); - Deno.exit(1); - } -} - -// Build JSON-RPC request -const request = { - jsonrpc: '2.0', - method: method, - id: Date.now(), - params: params -}; - -// Make API call -const auth = btoa(`${KANBOARD_USER}:${KANBOARD_TOKEN}`); - -try { - const response = await fetch(KANBOARD_URL, { - method: 'POST', - headers: { - 'Content-Type': 'application/json', - 'Authorization': `Basic ${auth}` - }, - body: JSON.stringify(request) - }); - - const data = await response.json(); - console.log(JSON.stringify(data, null, 2)); - - if (data.error) { - Deno.exit(1); - } -} catch (err) { - console.error('Error:', (err as Error).message); - Deno.exit(1); -} diff --git a/scripts/kanboard/kb-create-project.mjs b/scripts/kanboard/kb-create-project.mjs deleted file mode 100755 index 3f809d3..0000000 --- a/scripts/kanboard/kb-create-project.mjs +++ /dev/null @@ -1,120 +0,0 @@ -#!/usr/bin/env node -/** - * Kanboard helper - Create project with auto-assignment - * Usage: kb-create-project.mjs [description] - */ - -import { fileURLToPath } from 'url'; -import { dirname, join } from 'path'; -import { readFileSync } from 'fs'; - -const __filename = fileURLToPath(import.meta.url); -const __dirname = dirname(__filename); - -// Load .env -function loadEnv() { - const envPath = join(__dirname, '..', '.env'); - try { - const envContent = readFileSync(envPath, 'utf8'); - const env = {}; - envContent.split('\n').forEach(line => { - line = line.trim(); - if (line && !line.startsWith('#')) { - const [key, ...valueParts] = line.split('='); - const value = valueParts.join('=').replace(/^["']|["']$/g, ''); - env[key] = value; - } - }); - return env; - } catch (err) { - return {}; - } -} - -const env = loadEnv(); -const KANBOARD_URL = env.KANBOARD_URL || 'http://localhost/jsonrpc.php'; -const KANBOARD_USER = 
env.KANBOARD_USER || 'jsonrpc'; -const KANBOARD_TOKEN = env.KANBOARD_TOKEN; -const USER_ID = 1; // Your user ID - -const args = process.argv.slice(2); -const name = args[0]; -const description = args[1] || ''; - -if (!name) { - console.error('Usage: kb-create-project.mjs [description]'); - console.error(''); - console.error('Example:'); - console.error(' kb-create-project.mjs "My Project" "Project description"'); - process.exit(1); -} - -if (!KANBOARD_TOKEN) { - console.error('Error: KANBOARD_TOKEN not set in .env'); - process.exit(1); -} - -const auth = Buffer.from(`${KANBOARD_USER}:${KANBOARD_TOKEN}`).toString('base64'); - -async function apiCall(method, params) { - const response = await fetch(KANBOARD_URL, { - method: 'POST', - headers: { - 'Content-Type': 'application/json', - 'Authorization': `Basic ${auth}` - }, - body: JSON.stringify({ - jsonrpc: '2.0', - method: method, - id: Date.now(), - params: params - }) - }); - return await response.json(); -} - -try { - // 1. Create project with owner - console.error('Creating project...'); - const createResult = await apiCall('createProject', { - name: name, - description: description, - owner_id: USER_ID - }); - - if (createResult.error) { - console.error('Error creating project:', createResult.error.message); - process.exit(1); - } - - const projectId = createResult.result; - console.error(`βœ“ Project created (ID: ${projectId})`); - - // 2. Add yourself as project member - console.error('Adding you as project manager...'); - const addUserResult = await apiCall('addProjectUser', { - project_id: projectId, - user_id: USER_ID, - role: 'project-manager' - }); - - if (addUserResult.error) { - console.error('Warning: Could not add user to project:', addUserResult.error.message); - } else { - console.error('βœ“ Added as project manager'); - } - - // 3. 
Get project details - const projectResult = await apiCall('getProjectById', { - project_id: projectId - }); - - console.log(JSON.stringify(projectResult.result, null, 2)); - console.error(''); - console.error(`🎯 Project "${name}" ready!`); - console.error(` Board: ${projectResult.result.url.board}`); - console.error(` List: ${projectResult.result.url.list}`); -} catch (err) { - console.error('Error:', err.message); - process.exit(1); -} diff --git a/scripts/kanboard/kb-create-task.mjs b/scripts/kanboard/kb-create-task.mjs deleted file mode 100755 index 98ab194..0000000 --- a/scripts/kanboard/kb-create-task.mjs +++ /dev/null @@ -1,146 +0,0 @@ -#!/usr/bin/env node -/** - * Kanboard helper - Create task with auto-assignment - * Usage: kb-create-task.mjs [description] [color] - */ - -import { fileURLToPath } from 'url'; -import { dirname, join } from 'path'; -import { readFileSync } from 'fs'; - -const __filename = fileURLToPath(import.meta.url); -const __dirname = dirname(__filename); - -// Load .env -function loadEnv() { - const envPath = join(__dirname, '..', '.env'); - try { - const envContent = readFileSync(envPath, 'utf8'); - const env = {}; - envContent.split('\n').forEach(line => { - line = line.trim(); - if (line && !line.startsWith('#')) { - const [key, ...valueParts] = line.split('='); - const value = valueParts.join('=').replace(/^["']|["']$/g, ''); - env[key] = value; - } - }); - return env; - } catch (err) { - return {}; - } -} - -const env = loadEnv(); -const KANBOARD_URL = env.KANBOARD_URL || 'http://localhost/jsonrpc.php'; -const KANBOARD_USER = env.KANBOARD_USER || 'jsonrpc'; -const KANBOARD_TOKEN = env.KANBOARD_TOKEN; -const USER_ID = 1; // Your user ID - -const args = process.argv.slice(2); -const projectId = parseInt(args[0]); -const title = args[1]; -const description = args[2] || ''; -const colorId = args[3] || 'blue'; - -if (!projectId || !title) { - console.error('Usage: kb-create-task.mjs <project_id> <title> [description] [color]'); - 
console.error(''); - console.error('Available colors: yellow, blue, green, purple, red, orange, grey, brown,'); - console.error(' deep_orange, dark_grey, pink, teal, cyan, lime, light_green, amber'); - console.error(''); - console.error('Example:'); - console.error(' kb-create-task.mjs 1 "Fix bug" "Details here" "red"'); - process.exit(1); -} - -if (!KANBOARD_TOKEN) { - console.error('Error: KANBOARD_TOKEN not set in .env'); - process.exit(1); -} - -const auth = Buffer.from(`${KANBOARD_USER}:${KANBOARD_TOKEN}`).toString('base64'); - -async function apiCall(method, params) { - const response = await fetch(KANBOARD_URL, { - method: 'POST', - headers: { - 'Content-Type': 'application/json', - 'Authorization': `Basic ${auth}` - }, - body: JSON.stringify({ - jsonrpc: '2.0', - method: method, - id: Date.now(), - params: params - }) - }); - return await response.json(); -} - -try { - // Create task with owner - console.error(`Creating task in project ${projectId}...`); - const createResult = await apiCall('createTask', { - project_id: projectId, - title: title, - description: description, - color_id: colorId, - owner_id: USER_ID - }); - - if (createResult.error) { - console.error('Error creating task:', createResult.error.message); - process.exit(1); - } - - const taskId = createResult.result; - if (!taskId) { - // If owner_id doesn't work on creation, create then update - console.error('Retrying without owner_id...'); - const retryResult = await apiCall('createTask', { - project_id: projectId, - title: title, - description: description, - color_id: colorId - }); - - if (retryResult.error) { - console.error('Error creating task:', retryResult.error.message); - process.exit(1); - } - - const newTaskId = retryResult.result; - console.error(`βœ“ Task created (ID: ${newTaskId})`); - - // Now assign it - console.error('Assigning task to you...'); - const updateResult = await apiCall('updateTask', { - id: newTaskId, - owner_id: USER_ID - }); - - if (updateResult.result) { - 
console.error('βœ“ Task assigned to you'); - } - - // Get task details - const taskResult = await apiCall('getTask', { task_id: newTaskId }); - console.log(JSON.stringify(taskResult.result, null, 2)); - console.error(''); - console.error(`🎯 Task "${title}" created and assigned!`); - console.error(` URL: ${taskResult.result.url}`); - } else { - console.error(`βœ“ Task created and assigned (ID: ${taskId})`); - - // Get task details - const taskResult = await apiCall('getTask', { task_id: taskId }); - console.log(JSON.stringify(taskResult.result, null, 2)); - console.error(''); - console.error(`🎯 Task "${title}" created and assigned!`); - console.error(` URL: ${taskResult.result.url}`); - } -} catch (err) { - console.error('Error:', err.message); - process.exit(1); -} diff --git a/sqlite3 b/sqlite3 deleted file mode 100644 index e69de29..0000000 diff --git a/systems/axon/default.nix b/systems/axon/default.nix index a733872..570c71c 100644 --- a/systems/axon/default.nix +++ b/systems/axon/default.nix @@ -12,6 +12,7 @@ }: let systemVars = import ./variables.nix; + fleetConfig = import ../../fleet-config.nix; inherit (systemVars.system) hostName; inherit (systemVars.user) username; in @@ -77,7 +78,8 @@ in }; # Set your time zone. 
- time.timeZone = "America/Los_Angeles"; + # Timezone from fleet config + time.timeZone = systemVars.system.timeZone; # Enable hardware video acceleration for media playback hardware.graphics = { @@ -148,9 +150,8 @@ in # Network configuration with Jellyfin server access networking.extraHosts = '' - 192.168.1.7 cortex.home cortex - # Add your Synology NAS IP here - # 192.168.1.X synology.home synology + ${fleetConfig.hosts.cortex.ip} cortex.home cortex + ${fleetConfig.infrastructure.nas.ip} ${fleetConfig.infrastructure.nas.fqdn} ${fleetConfig.infrastructure.nas.hostname} ''; modules = { diff --git a/systems/axon/variables.nix b/systems/axon/variables.nix index 2d4f80c..a9657c3 100644 --- a/systems/axon/variables.nix +++ b/systems/axon/variables.nix @@ -1,12 +1,29 @@ +# System-specific configuration for Axon +# +# This file contains machine-specific settings like user preferences, +# application choices, and local configuration. +# +# Network configuration (IPs, MACs, SSH) is centralized in fleet-config.nix +let + # Import centralized fleet configuration + fleetConfig = import ../../fleet-config.nix; + # Get this host's network settings + thisHost = fleetConfig.hosts.axon; +in { system = { - hostName = "axon"; + hostName = thisHost.hostname; # From fleet-config.nix + timeZone = fleetConfig.global.timeZone; # From fleet-config.nix + # Machine-specific settings }; - + user = { username = "axon"; name = "axon"; email = "axon@example.com"; # Add any other user-specific variables here }; -} \ No newline at end of file + + # Re-export network config for this host (optional, for convenience) + network = thisHost; +} diff --git a/systems/cortex/variables.nix b/systems/cortex/variables.nix index 8afab73..5465787 100644 --- a/systems/cortex/variables.nix +++ b/systems/cortex/variables.nix @@ -1,26 +1,26 @@ - # System-specific configuration for Cortex -# +# # This file contains machine-specific settings like user preferences, # application choices, and local configuration. 
# -# Network configuration (IPs, MACs, SSH) is centralized in network-config.nix +# Network configuration (IPs, MACs, SSH) is centralized in fleet-config.nix let # Import centralized network configuration - networkConfig = import ../../network-config.nix; + networkConfig = import ../../fleet-config.nix; # Get this host's network settings thisHost = networkConfig.hosts.cortex; in { system = { - hostName = thisHost.hostname; # From network-config.nix + hostName = thisHost.hostname; # From fleet-config.nix # Machine-specific settings # Add other system-level configs here }; user = { username = "syg"; - syncPassword = "syncmybattleship"; + # Note: Passwords should be managed via sops-nix secrets, not here + # syncPassword removed for security - use secrets management instead git = { username = "sygint"; @@ -32,7 +32,7 @@ in fileManager = "nemo"; webBrowser = "brave"; menu = "rofi -show drun"; - bar = "hyprpanel"; # or "waybar" + bar = "hyprpanel"; # or "waybar" }; }; diff --git a/systems/nexus/default.nix b/systems/nexus/default.nix index 458282c..e749ac2 100644 --- a/systems/nexus/default.nix +++ b/systems/nexus/default.nix @@ -83,10 +83,9 @@ in # ===== NAS Storage Mounts ===== # Mount Synology NAS media shares via NFS - # NAS IP: 192.168.1.136 - # Nexus will use static IP: 192.168.1.20 (configure in UDM Pro DHCP reservations) + # NAS configuration centralized in fleet-config.nix fileSystems."/mnt/nas/movies" = { - device = "192.168.1.136:/volume1/Media/Movies"; + device = "${networkConfig.infrastructure.nas.ip}:${networkConfig.infrastructure.nas.shares.movies}"; fsType = "nfs"; options = [ "x-systemd.automount" # Auto-mount on access @@ -97,7 +96,7 @@ in }; fileSystems."/mnt/nas/tvshows" = { - device = "192.168.1.136:/volume1/Media/TV Shows"; + device = "${networkConfig.infrastructure.nas.ip}:${networkConfig.infrastructure.nas.shares.tvshows}"; fsType = "nfs"; options = [ "x-systemd.automount" @@ -108,7 +107,7 @@ in }; fileSystems."/mnt/nas/music" = { - device = 
"192.168.1.136:/volume1/Media/Music"; + device = "${networkConfig.infrastructure.nas.ip}:${networkConfig.infrastructure.nas.shares.music}"; fsType = "nfs"; options = [ "x-systemd.automount" diff --git a/systems/nexus/default.nix.bak b/systems/nexus/default.nix.bak deleted file mode 100644 index e9a1d08..0000000 --- a/systems/nexus/default.nix.bak +++ /dev/null @@ -1,530 +0,0 @@ -# NixOS configuration for Nexus (Homelab) -# Purpose: Centralized homelab services including media, monitoring, and automation -{ config, pkgs, lib, hasSecrets, inputs, isTest ? false, ... }: - -let - systemVars = import ./variables.nix; - networkConfig = import ../../fleet-config.nix; - inherit (systemVars.system) hostName; - inherit (systemVars.user) username; -in -{ - imports = [ - ./hardware.nix - ../../modules/system.nix - ] ++ lib.optionals (!isTest) [ - ./disk-config.nix # Disko configuration for nixos-anywhere (not needed in test VMs) - ] ++ lib.optionals hasSecrets [ - (import (inputs.nixos-secrets + "/default.nix") { inherit config lib pkgs inputs hasSecrets; }) - ]; - - # Essential boot configuration - boot = { - loader = { - systemd-boot.enable = true; - efi.canTouchEfiVariables = true; - }; - }; - - networking.hostName = hostName; - time.timeZone = networkConfig.global.timeZone; - - # ===== Secrets Management ===== - # Secrets are mandatory for this system - assertions = [ - { assertion = hasSecrets; message = "Secrets requiredβ€”nixos-secrets submodule missing"; } - ]; - - # The nixos-secrets module already sets defaultSopsFile, just configure what we need here - sops = { - age.keyFile = lib.mkForce "/var/lib/sops-nix/key.txt"; # Override to use dedicated key file - - secrets."nexus/rescue_password_hash" = { - neededForUsers = true; - }; - # Leantime environment files for containers (loaded at runtime by Podman) - # These files contain MYSQL_PASSWORD and MYSQL_ROOT_PASSWORD for DB - # and LEAN_DB_PASSWORD for the app container - secrets."nexus/leantime_db_env" = { - owner = 
"root"; - group = "root"; - mode = "0400"; - }; - secrets."nexus/leantime_app_env" = { - owner = "root"; - group = "root"; - mode = "0400"; - }; - - secrets."nexus/grafana_admin_password" = { - owner = "grafana"; - group = "grafana"; - }; - }; - - # ===== NAS Storage Mounts ===== - # Mount Synology NAS media shares via NFS - # NAS IP: 192.168.1.136 - # Nexus will use static IP: 192.168.1.20 (configure in UDM Pro DHCP reservations) - fileSystems."/mnt/nas/movies" = { - device = "192.168.1.136:/volume1/Media/Movies"; - fsType = "nfs"; - options = [ - "x-systemd.automount" # Auto-mount on access - "noauto" # Don't mount at boot - "x-systemd.idle-timeout=600" # Unmount after 10min idle - "nfsvers=4" # Use NFSv4 - ]; - }; - - fileSystems."/mnt/nas/tvshows" = { - device = "192.168.1.136:/volume1/Media/TV Shows"; - fsType = "nfs"; - options = [ - "x-systemd.automount" - "noauto" - "x-systemd.idle-timeout=600" - "nfsvers=4" - ]; - }; - - fileSystems."/mnt/nas/music" = { - device = "192.168.1.136:/volume1/Media/Music"; - fsType = "nfs"; - options = [ - "x-systemd.automount" - "noauto" - "x-systemd.idle-timeout=600" - "nfsvers=4" - ]; - }; - - # ===== User Configuration ===== - - # Deploy user: SSH-only for remote deployments (key-only, no password) - users.users.${username} = { - isNormalUser = true; - description = "Remote deployment user (SSH key-only)"; - extraGroups = [ "wheel" "networkmanager" "jellyfin" "grafana" ]; - # SSH: Key-only authentication (no password set) - openssh.authorizedKeys.keys = [ - "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIMSdxXvx7Df+/2cPMe7C2TUSqRkYee5slatv7t3MG593 syg@nixos" - ]; - # Password is explicitly locked - this user CANNOT login at console - # This provides clear separation: SSH uses this user, console uses rescue user - # Passwordless sudo via security.sudo.wheelNeedsPassword = false - hashedPassword = "!"; # Locked account - SSH key only - }; - - # Rescue user: Console-only for physical/KVM emergency access (password-only, no SSH) - 
users.users.rescue = { - isNormalUser = true; - description = "Emergency console access (password-only)"; - extraGroups = [ "wheel" ]; # Can sudo for system repairs - # Password for console/KVM access via secrets - hashedPasswordFile = config.sops.secrets."nexus/rescue_password_hash".path; - # No SSH keys - this user CANNOT login remotely - # Provides audit trail: rescue user = physical access only - }; - - # ===== Nix Configuration ===== - # Deploy user needs to be trusted for remote deployments - nix.settings = { - trusted-users = [ "root" "deploy" ]; - experimental-features = [ "nix-command" "flakes" ]; - }; - - # Ensure Jellyfin can read NAS mounts - systemd.tmpfiles.rules = [ - "d /mnt/nas 0755 root root -" - "d /mnt/nas/movies 0755 root root -" - "d /mnt/nas/tvshows 0755 root root -" - "d /mnt/nas/music 0755 root root -" - ]; - - # ===== Security Configuration ===== - - # Passwordless sudo for wheel group (needed for remote deployments) - # Override the security module's default of requiring passwords - security.sudo.wheelNeedsPassword = lib.mkForce false; - - services.openssh = { - enable = true; - settings = { - PasswordAuthentication = false; - PermitRootLogin = "no"; - KbdInteractiveAuthentication = false; - }; - }; - - services.fail2ban = { - enable = true; - ignoreIP = [ - "127.0.0.1" - "192.168.1.0/24" # Local network - ]; - }; - - # ===== Core Services ===== - - # Jellyfin Media Server - Stream media from your NAS - services.jellyfin = { - enable = true; - openFirewall = true; - }; - - # Prometheus - Collect system metrics - services.prometheus = { - enable = true; - port = 9090; - - exporters = { - node = { - enable = true; - enabledCollectors = [ "systemd" ]; - port = 9100; - }; - }; - - scrapeConfigs = [ - { - job_name = "nexus"; - static_configs = [{ - targets = [ - "127.0.0.1:${toString config.services.prometheus.exporters.node.port}" - ]; - }]; - } - ]; - }; - - # Grafana - Visualize metrics with dashboards - services.grafana = { - enable = 
true; - settings = { - server = { - http_addr = "0.0.0.0"; - http_port = 3000; - domain = "nexus.home"; - root_url = "http://nexus.home:3000/"; - }; - security = { - admin_user = "admin"; - admin_password_file = config.sops.secrets."nexus/grafana_admin_password".path; - }; - }; - - provision = { - enable = true; - datasources.settings.datasources = [ - { - name = "Prometheus"; - type = "prometheus"; - url = "http://127.0.0.1:${toString config.services.prometheus.port}"; - isDefault = true; - } - ]; - }; - }; - - - - # Create required directories for Leantime data persistence - # Leantime storage directories - # UID/GID 1000 = www-data inside the leantime container - # These directories MUST exist before the container starts - systemd.tmpfiles.rules = [ - # Root leantime directory - "d /var/lib/leantime 0755 root root -" - - # Database data (MariaDB runs as different user) - "d /var/lib/leantime/db-data 0755 root root -" - - # Application directories (owned by container's www-data = UID 1000) - "d /var/lib/leantime/userfiles 0755 1000 1000 -" - "d /var/lib/leantime/plugins 0755 1000 1000 -" - - # Storage directory structure (required by Laravel/Leantime) - "d /var/lib/leantime/storage 0755 1000 1000 -" - "d /var/lib/leantime/storage/logs 0755 1000 1000 -" - "d /var/lib/leantime/storage/app 0755 1000 1000 -" - "d /var/lib/leantime/storage/debugbar 0755 1000 1000 -" - "d /var/lib/leantime/storage/framework 0755 1000 1000 -" - "d /var/lib/leantime/storage/framework/cache 0755 1000 1000 -" - "d /var/lib/leantime/storage/framework/cache/data 0755 1000 1000 -" - "d /var/lib/leantime/storage/framework/cache/installation 0755 1000 1000 -" - "d /var/lib/leantime/storage/framework/sessions 0755 1000 1000 -" - "d /var/lib/leantime/storage/framework/views 0755 1000 1000 -" - ]; - - virtualisation.oci-containers = { - backend = "podman"; - containers = { - # Leantime - Goals-focused PM tool (https://leantime.io/) - leantime-db = { - image = "mariadb:10.11"; - ports = [ ]; # Not 
exposed outside - volumes = [ - "/var/lib/leantime/db-data:/var/lib/mysql" - ]; - environment = { - MYSQL_DATABASE = "leantime"; - MYSQL_USER = "leantime"; - # Passwords loaded from environmentFiles below - }; - environmentFiles = [ - config.sops.secrets."nexus/leantime_db_env".path - ]; - autoStart = true; - }; - leantime = { - image = "leantime/leantime:latest"; - ports = [ "8080:8080" ]; - volumes = [ - "/var/lib/leantime/userfiles:/var/www/html/userfiles" - "/var/lib/leantime/plugins:/var/www/html/app/Plugins" - "/var/lib/leantime/storage:/var/www/html/storage" - ]; - environment = { - LEAN_DB_HOST = "leantime-db"; - LEAN_DB_USER = "leantime"; - LEAN_DB_DATABASE = "leantime"; - LEAN_EMAIL_RETURN = "no-reply@localhost"; - LEAN_APP_URL = "http://nexus.home:8080"; # Use flake DNS/hostname instead of localhost - # Password loaded from environmentFiles below - }; - environmentFiles = [ - config.sops.secrets."nexus/leantime_app_env".path - ]; - dependsOn = [ "leantime-db" ]; - autoStart = true; - }; - }; - }; - - # ===== Optional Services (disabled for now) ===== - # Uncomment these when you're ready to add them: - - # Home Assistant - Smart home automation - # services.home-assistant = { - # enable = true; - # extraComponents = [ "esphome" "met" "radio_browser" ]; - # config = { - # default_config = {}; - # http = { - # server_host = "0.0.0.0"; - # server_port = 8123; - # }; - # }; - # }; - - # Loki + Promtail - Log aggregation (like grep for all your logs) - # services.loki.enable = true; - # services.promtail.enable = true; - - # ===== Firewall Configuration ===== - networking.firewall = { - enable = true; - allowedTCPPorts = [ - 22 # SSH - 3000 # Grafana - 8080 # Leantime - 8096 # Jellyfin HTTP - 8920 # Jellyfin HTTPS - 9090 # Prometheus (optional - can access via Grafana) - ]; - allowedUDPPorts = [ - 1900 # DLNA/UPnP discovery - 7359 # Jellyfin discovery - ]; - }; - - # ===== Module Configuration ===== - modules = { - hardware = { - bluetooth.enable = false; # 
Headless server - audio.enable = false; # No local audio needed - networking = { - enable = true; - hostName = "${hostName}"; - }; - }; - - services = { - containerization.enable = true; # Podman for OCI containers - - syncthing = { - enable = false; # Enable if needed - }; - - printing = { - enable = false; # Headless server - }; - }; - - system.security = { - enable = true; # Enable security module (sudo, polkit, etc.) - serverHardening.enable = true; # Full server hardening profile (fail2ban, auditd, SSH, kernel, monitoring) - }; - }; - - # ===== Additional System Packages ===== - environment.systemPackages = with pkgs; [ - # Jellyfin packages - jellyfin - jellyfin-web - jellyfin-ffmpeg - - # Monitoring tools - htop - btop - - # Network tools - iftop - nethogs - - # System utilities - tmux - wget - curl - sqlite - - # Jellyfin utilities - libva-utils # Provides vainfo to check hardware video acceleration - - # VM testing utilities - (pkgs.writeScriptBin "vm-health-check" '' - #!${pkgs.bash}/bin/bash - set -euo pipefail - - # Colors - RED='\033[0;31m' - GREEN='\033[0;32m' - YELLOW='\033[1;33m' - BLUE='\033[0;34m' - NC='\033[0m' - - echo -e "''${BLUE}╔════════════════════════════════════════════════╗''${NC}" - echo -e "''${BLUE}β•‘ NixOS VM Health Check β•‘''${NC}" - echo -e "''${BLUE}╔════════════════════════════════════════════════╗''${NC}" - echo - - # System Info - echo -e "''${BLUE}πŸ“Š System Information''${NC}" - echo "Hostname: $(${pkgs.hostname}/bin/hostname)" - echo "Uptime: $(${pkgs.procps}/bin/uptime -p)" - echo "Kernel: $(${pkgs.coreutils}/bin/uname -r)" - echo - - # Disk Space - echo -e "''${BLUE}πŸ’Ύ Disk Space''${NC}" - ${pkgs.coreutils}/bin/df -h / | ${pkgs.coreutils}/bin/tail -n 1 - echo - - # Failed Services - echo -e "''${BLUE}πŸ” Checking for Failed Services''${NC}" - FAILED=$(${pkgs.systemd}/bin/systemctl --failed --no-pager --no-legend | ${pkgs.coreutils}/bin/wc -l) - if [ "$FAILED" -eq 0 ]; then - echo -e "''${GREEN}βœ… No failed 
services''${NC}" - else - echo -e "''${RED}❌ $FAILED failed service(s):''${NC}" - ${pkgs.systemd}/bin/systemctl --failed --no-pager - fi - echo - - # Network - echo -e "''${BLUE}🌐 Network Configuration''${NC}" - IP=$(${pkgs.iproute2}/bin/ip -4 addr show eth0 2>/dev/null | ${pkgs.gnugrep}/bin/grep -oP '(?<=inet\s)\d+(\.\d+){3}' || echo "N/A") - echo "IP Address: $IP" - echo - - # Service Status Check - echo -e "''${BLUE}πŸ”§ Service Status''${NC}" - - check_service() { - local service=$1 - local port=$2 - local name=$3 - - if ${pkgs.systemd}/bin/systemctl is-active --quiet "$service"; then - echo -e "''${GREEN}βœ… $name''${NC} - Running" - if [ -n "$port" ]; then - if ${pkgs.iproute2}/bin/ss -tlnp 2>/dev/null | ${pkgs.gnugrep}/bin/grep -q ":$port "; then - echo -e " ''${GREEN}β”œβ”€ Port $port listening''${NC}" - else - echo -e " ''${YELLOW}β”œβ”€ Port $port NOT listening''${NC}" - fi - fi - else - echo -e "''${RED}❌ $name''${NC} - Failed/Inactive" - if [ -n "$port" ]; then - echo -e " ''${RED}β”œβ”€ Port $port check skipped''${NC}" - fi - fi - } - - # Check services - if ${pkgs.systemd}/bin/systemctl list-unit-files | ${pkgs.gnugrep}/bin/grep -q "jellyfin.service"; then - check_service "jellyfin" "8096" "Jellyfin Media Server" - fi - if ${pkgs.systemd}/bin/systemctl list-unit-files | ${pkgs.gnugrep}/bin/grep -q "grafana.service"; then - check_service "grafana" "3000" "Grafana" - fi - if ${pkgs.systemd}/bin/systemctl list-unit-files | ${pkgs.gnugrep}/bin/grep -q "prometheus.service"; then - check_service "prometheus" "9090" "Prometheus" - fi - if ${pkgs.systemd}/bin/systemctl list-unit-files | ${pkgs.gnugrep}/bin/grep -q "prometheus-node-exporter.service"; then - check_service "prometheus-node-exporter" "9100" "Node Exporter" - fi - - echo - - # HTTP Service Tests - echo -e "''${BLUE}πŸ§ͺ HTTP Service Tests''${NC}" - - test_http() { - local port=$1 - local name=$2 - local path="''${3:-/}" - - if ${pkgs.coreutils}/bin/timeout 2 ${pkgs.curl}/bin/curl -sf 
"http://localhost:$port$path" > /dev/null 2>&1; then - echo -e "''${GREEN}βœ… $name''${NC} - Responding" - elif ${pkgs.coreutils}/bin/timeout 2 ${pkgs.curl}/bin/curl -sI "http://localhost:$port$path" > /dev/null 2>&1; then - echo -e "''${GREEN}βœ… $name''${NC} - Responding (HEAD only)" - else - echo -e "''${RED}❌ $name''${NC} - Not responding" - fi - } - - # Test services if ports are listening - if ${pkgs.iproute2}/bin/ss -tlnp 2>/dev/null | ${pkgs.gnugrep}/bin/grep -q ":3000 "; then - test_http "3000" "Grafana" - fi - if ${pkgs.iproute2}/bin/ss -tlnp 2>/dev/null | ${pkgs.gnugrep}/bin/grep -q ":8080 "; then - test_http "8080" "Leantime" - fi - if ${pkgs.iproute2}/bin/ss -tlnp 2>/dev/null | ${pkgs.gnugrep}/bin/grep -q ":8096 "; then - test_http "8096" "Jellyfin" - fi - if ${pkgs.iproute2}/bin/ss -tlnp 2>/dev/null | ${pkgs.gnugrep}/bin/grep -q ":9090 "; then - test_http "9090" "Prometheus" "/-/healthy" - fi - if ${pkgs.iproute2}/bin/ss -tlnp 2>/dev/null | ${pkgs.gnugrep}/bin/grep -q ":9100 "; then - test_http "9100" "Node Exporter" "/metrics" - fi - - echo - echo -e "''${BLUE}╔════════════════════════════════════════════════╗''${NC}" - echo -e "''${BLUE}β•‘ Health Check Complete β•‘''${NC}" - echo -e "''${BLUE}╔════════════════════════════════════════════════╗''${NC}" - echo - echo "Run 'journalctl -xe' to see recent logs" - echo "Run 'systemctl status <service>' for specific service details" - '') - ]; - - - - # ===== System State Version ===== - system.stateVersion = "24.11"; # Don't change this -} diff --git a/systems/orion/default.nix b/systems/orion/default.nix index 31356c6..4f1a06d 100644 --- a/systems/orion/default.nix +++ b/systems/orion/default.nix @@ -118,7 +118,7 @@ in # Add cortex to local hosts for DNS resolution (temporary until UDM DNS fixed) networking.extraHosts = '' - 192.168.1.7 cortex.home cortex + ${fleetConfig.hosts.cortex.ip} cortex.home cortex ''; modules = { @@ -310,11 +310,6 @@ in "vscode-extension-mhutchie-git-graph" ]; - 
nix.settings.trusted-users = [ - "root" - "syg" - ]; - # WiFi undock fix - passwordless sudo for driver reload security.sudo.extraRules = [ { diff --git a/systems/orion/monitors.json b/systems/orion/monitors.json deleted file mode 120000 index 360fe15..0000000 --- a/systems/orion/monitors.json +++ /dev/null @@ -1 +0,0 @@ -/home/syg/.config/nixos/monitors.json \ No newline at end of file diff --git a/systems/orion/monitors.json b/systems/orion/monitors.json new file mode 100644 index 0000000..1739f26 --- /dev/null +++ b/systems/orion/monitors.json @@ -0,0 +1,5 @@ +{ + "0x0BCA": "2256x1504@60, 0x0, 1", + "Sceptre M24": "1920x1080@165, 3440x1267, 1, transform, 1", + "ED343CUR V": "preferred, 0x1504, 1" +} diff --git a/test-focalboard-home.nix b/tests/test-focalboard-home.nix similarity index 100% rename from test-focalboard-home.nix rename to tests/test-focalboard-home.nix