feat(weller): add hardware diagnostics and MemTest86+ #295

Open
patflynn wants to merge 3 commits into main from feat/hardware-diagnostics

@patflynn
Owner

Summary

  • Add MemTest86+ boot entry to systemd-boot for proper overnight memory testing
  • Remove ineffective memtest=5 kernel parameter (basic kernel memtest that runs silently with no UI)
  • Add lm_sensors and memtester to system packages for temperature monitoring and userspace memory testing
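The three changes above could look roughly like the following NixOS module. This is a minimal sketch, not the PR's actual diff: it assumes systemd-boot is already enabled on the host, and the split between files is not shown.

```nix
# Sketch only; where these settings live in the repo is an assumption.
{ pkgs, ... }:
{
  # Adds a MemTest86+ entry to the systemd-boot menu.
  boot.loader.systemd-boot.memtest86.enable = true;

  # The memtest=5 entry is removed from boot.kernelParams (not shown here).

  environment.systemPackages = with pkgs; [
    lm_sensors # provides `sensors` for temperature and fan readings
    memtester  # userspace memory tester
  ];
}
```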

Context

Investigating system instability (game data corruption, Chrome crashes). NVMe SMART data looks clean — RAM is the primary suspect. MemTest86+ provides a thorough, standalone memory test accessible from the boot menu.

Test plan

  • NixOS rebuild completed successfully
  • Verify MemTest86+ entry appears in systemd-boot menu on next reboot
  • Run overnight MemTest86+ pass to check for RAM errors

- Refactor 'hosts/weller/default.nix' to extract hardware config to 'hosts/weller/hardware.nix'.
- Create 'modules/bootstrap.nix' for initial mutable install with SSH enabled and no secrets.
- Add 'weller-bootstrap' nixosConfiguration to 'flake.nix'.

Fixes #263

- Add smartmontools for S.M.A.R.T. monitoring and testing.
- Add nvme-cli for detailed NVMe status and error reporting.
- These tools are needed to debug recent Btrfs corruption issues on 'weller'.

Add MemTest86+ to systemd-boot menu for proper overnight memory
testing. Add lm_sensors and memtester to system packages for
hardware diagnostics.

@gemini-code-assist

Summary of Changes

Hello @patflynn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances system stability diagnostics by integrating advanced memory testing and hardware monitoring tools into the NixOS configuration. It introduces MemTest86+ as a bootable option for in-depth RAM analysis and adds lm_sensors and memtester for real-time hardware health checks. Additionally, the configuration for the weller host has been refactored for better organization, and a new bootstrap system has been defined to simplify initial installations. These changes aim to provide robust tools for identifying and resolving system instability issues, particularly those related to memory.

Highlights

  • MemTest86+ Integration: Enabled MemTest86+ as a boot entry in systemd-boot for thorough memory diagnostics, replacing the ineffective memtest=5 kernel parameter.
  • Diagnostic Tooling: Added lm_sensors and memtester to system packages for temperature monitoring and userspace memory testing, respectively.
  • Configuration Refactoring: Extracted hardware-specific configurations for the weller host into a dedicated hardware.nix file, improving modularity and organization.
  • Bootstrap System: Introduced a weller-bootstrap system configuration in flake.nix and a corresponding bootstrap.nix module for streamlined initial installations.

Changelog

  • flake.nix
    • Defined a new weller-bootstrap NixOS system configuration for initial install targets.
  • hosts/weller/default.nix
    • Removed the lib import as it was no longer directly used in this file.
    • Replaced the import of installer/scan/not-detected.nix with ./hardware.nix.
    • Moved extensive hardware, bootloader, filesystem, networking, NVIDIA, and Bluetooth configurations to hosts/weller/hardware.nix.
  • hosts/weller/hardware.nix
    • Created this new file to encapsulate all hardware-specific configurations for the weller host.
    • Enabled memtest86 in systemd-boot.
    • Included installer/scan/not-detected.nix for hardware detection.
  • modules/bootstrap.nix
    • Created this new module to provide a minimal configuration for initial system installation.
    • Enabled SSH with root login and password authentication.
    • Configured mutable users and added default user patrick with SSH keys and sudo access without password.
  • modules/common/system.nix
    • Added smartmontools, nvme-cli, lm_sensors, and memtester to the list of system packages for hardware diagnostics.
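
For orientation, a weller-bootstrap output like the one listed under flake.nix might be wired up as below. This is a hedged sketch: the nixpkgs branch, the `system` value, and the exact module list are assumptions, since the real flake.nix is not shown on this page.

```nix
{
  # Assumption: the branch actually pinned by the repository is not shown.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, ... }: {
    nixosConfigurations.weller-bootstrap = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux"; # assumption
      modules = [
        ./hosts/weller/hardware.nix # hardware config extracted in this PR
        ./modules/bootstrap.nix     # minimal install-time module
      ];
    };
  };
}
```
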
Activity

  • NixOS rebuild completed successfully.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully adds hardware diagnostic tools like MemTest86+, lm_sensors, and memtester, improving system stability analysis, and refactors the weller host configuration into a separate hardware.nix file for better modularity. However, the modules/bootstrap.nix file introduces critical security vulnerabilities by enabling SSH root login with password authentication, setting a weak, hardcoded password (nixos) for the default user, and allowing passwordless sudo. These insecure defaults create a high risk of an attacker gaining full root control during the initial installation phase. It is strongly recommended to rely exclusively on SSH key authentication and disable password-based authentication and passwordless sudo. Additionally, there's a minor suggestion to improve separation of concerns by moving a user-space package out of the hardware configuration.

Comment on lines +24 to +25
PermitRootLogin = "yes";
PasswordAuthentication = true;

security-high high

Enabling password authentication and permitting root login over SSH is highly insecure, especially when combined with a hardcoded initial password. Since SSH keys are already configured, password authentication should be disabled to prevent unauthorized access during the bootstrap phase. Permitting root login is also unnecessary when a user with sudo access is available, and it's better to rely solely on SSH key authentication.

PermitRootLogin = "no";
PasswordAuthentication = false;
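
In context, a key-only SSH setup for the bootstrap module might look like the sketch below. It assumes a NixOS release that exposes these options under `services.openssh.settings`; disabling keyboard-interactive authentication is an addition that the suggestion above does not mention.

```nix
# Sketch of a key-only sshd configuration for modules/bootstrap.nix.
{ ... }:
{
  services.openssh = {
    enable = true;
    settings = {
      PermitRootLogin = "no";               # log in as a normal user, then use sudo
      PasswordAuthentication = false;       # SSH keys only
      KbdInteractiveAuthentication = false; # assumption: also close this password path
    };
  };
}
```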

"networkmanager"
"video"
];
initialPassword = "nixos";

security-high high

Hardcoding an initial password (nixos) is a significant security risk. If password authentication is enabled, an attacker could easily gain access to this user account. Since security.sudo.wheelNeedsPassword is false, this could lead to immediate root access. This hardcoded password should be removed, relying instead on SSH key authentication for initial access. If a password is required for local console access, consider using hashedPassword with a pre-computed hash.
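
A sketch of the `hashedPassword` alternative follows. The hash and the SSH key are placeholders, and the `wheel` membership is an assumption inferred from the sudo setting; only the `patrick` user name and the other two groups come from this PR.

```nix
# Sketch only: the hash and key below are placeholders, not real values.
{ ... }:
{
  users.users.patrick = {
    isNormalUser = true;
    extraGroups = [
      "wheel" # assumption: needed for sudo access
      "networkmanager"
      "video"
    ];
    # Placeholder: substitute the output of `mkpasswd -m sha-512`.
    hashedPassword = "<sha-512 crypt hash>";
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 AAAA... user@host" # placeholder key
    ];
  };
}
```

Note that a hash set this way still ends up in the world-readable Nix store, so it only protects against casual disclosure of the plaintext.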

};

# Make it easy to assume root during bootstrap
security.sudo.wheelNeedsPassword = false;

security-medium medium

Allowing members of the 'wheel' group to use sudo without a password reduces the security of the system. In combination with the hardcoded password and enabled SSH password authentication, this allows an attacker to gain root access immediately upon logging in as the default user. It is safer to require a password for sudo even during the bootstrap phase.

  security.sudo.wheelNeedsPassword = true;

};
};
};
environment.systemPackages = with pkgs; [ bluetuith ];

medium

The bluetuith package is a user-space application. While related to Bluetooth, defining environment.systemPackages here mixes hardware configuration with software installation. For better separation of concerns, it's recommended to manage host-specific packages in a different module, like hosts/weller/default.nix. This keeps hardware.nix focused purely on hardware-level settings. Please consider moving this line to hosts/weller/default.nix.
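
The suggested move could look like this in hosts/weller/default.nix. This is a sketch that assumes the file already imports ./hardware.nix, as the changelog describes; the rest of the file is omitted.

```nix
# hosts/weller/default.nix (sketch; other settings omitted)
{ pkgs, ... }:
{
  imports = [ ./hardware.nix ];

  # Host-specific user-space tools live here rather than in hardware.nix.
  environment.systemPackages = with pkgs; [ bluetuith ];
}
```

Because `environment.systemPackages` is a list option that merges across modules, this definition combines with the packages declared in modules/common/system.nix instead of replacing them.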
