@nttg8100 nttg8100 commented Jan 7, 2026

This pull request introduces several important security and reliability improvements to job submission and credential handling, as well as updates to job status tracking and error handling. The main focus is on preventing sensitive storage credentials from being exposed through Redis/Celery, improving error reporting for HPC connection issues, and cleaning up credential files after job completion or failure.

Security and Credential Handling:

  • Storage credentials are no longer passed through Redis/Celery; instead, only the storage_id and user_email are sent to the worker, which then regenerates temporary credentials just-in-time for job execution. This prevents sensitive information from being exposed in transit. [1] [2] [3] [4]
  • The token field in the Github model is now stored using EncryptedTextField, and a migration changes its column type from VARCHAR(200) to TEXT so that the longer encrypted value fits. [1] [2]
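
A minimal sketch of the identifier-only handoff described in the first bullet. The in-memory `BROKER_QUEUE` stands in for Redis/Celery, and `issue_temp_credentials`, `enqueue_job`, and `run_next_job` are hypothetical names used for illustration only; they are not the functions in this PR.

```python
import json
from dataclasses import dataclass

# Hypothetical stand-in for the Redis-backed broker: whatever lands here is
# serialized and readable by anyone with access to the broker.
BROKER_QUEUE = []


@dataclass(frozen=True)
class TempCredentials:
    access_key: str
    secret_key: str


def issue_temp_credentials(storage_id: str, user_email: str) -> TempCredentials:
    """Hypothetical helper; a real one would call the storage provider."""
    return TempCredentials(access_key=f"AK-{storage_id}", secret_key=f"SK-{user_email}")


def enqueue_job(job_id: str, storage_id: str, user_email: str) -> None:
    """Controller side: serialize identifiers only, never secrets."""
    payload = {"job_id": job_id, "storage_id": storage_id, "user_email": user_email}
    BROKER_QUEUE.append(json.dumps(payload))


def run_next_job() -> TempCredentials:
    """Worker side: regenerate credentials just before they are needed."""
    payload = json.loads(BROKER_QUEUE.pop(0))
    return issue_temp_credentials(payload["storage_id"], payload["user_email"])
```

The point of the design is that the serialized payload contains nothing worth stealing: the secret only exists inside the worker process, for the duration of the job.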

Job Status and Error Handling:

  • The job status enum replaces SERVER_ERROR with HPC_DISCONNECTED to more accurately reflect HPC connection failures, and this change is reflected throughout the codebase and migrations. [1] [2] [3] [4] [5]
  • Improved error handling and logging for SSH tunnel deletion, including more robust process checks and clearer log messages if no process is found.
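
As a rough illustration of the status change, the sketch below maps connection-level exceptions to HPC_DISCONNECTED. Only HPC_DISCONNECTED (and the removed SERVER_ERROR) come from this PR; the other enum members and the `status_for_error` helper are hypothetical placeholders.

```python
from enum import Enum


class JobStatus(str, Enum):
    # Only HPC_DISCONNECTED is named in this PR; the other members are
    # hypothetical placeholders for illustration.
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    HPC_DISCONNECTED = "HPC_DISCONNECTED"  # replaces SERVER_ERROR


def status_for_error(exc: Exception) -> JobStatus:
    """Map connection-level failures to HPC_DISCONNECTED, anything else to FAILED."""
    if isinstance(exc, (ConnectionError, TimeoutError)):
        return JobStatus.HPC_DISCONNECTED
    return JobStatus.FAILED
```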

Credential Cleanup:

  • After job monitoring finishes, the worker attempts to clean up credential files from the HPC environment to ensure no sensitive data is left behind, even if the job fails before automatic cleanup.
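
One way to get this "clean up even on failure" behaviour is a `try`/`finally` around the monitoring step. The sketch below uses a local temporary directory in place of the HPC filesystem, and `monitor_job`, `monitor_and_cleanup`, and `demo` are hypothetical names, not the code in tasks.py.

```python
import tempfile
from pathlib import Path


def monitor_job(fail: bool) -> str:
    """Hypothetical monitor step; raises to simulate a job that fails early."""
    if fail:
        raise RuntimeError("job failed before its own cleanup ran")
    return "COMPLETED"


def monitor_and_cleanup(cred_file: Path, fail: bool = False) -> str:
    try:
        return monitor_job(fail)
    except RuntimeError:
        return "FAILED"
    finally:
        # Best-effort cleanup: runs on success and on failure, and a file
        # that is already gone is not treated as an error.
        cred_file.unlink(missing_ok=True)


def demo() -> bool:
    """Return True if the credential file survived a failed job."""
    with tempfile.TemporaryDirectory() as workdir:
        cred_file = Path(workdir) / "credentials.json"
        cred_file.write_text("{}")
        monitor_and_cleanup(cred_file, fail=True)
        return cred_file.exists()
```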

Database and Migration Updates:

  • Database migrations update job status column comments and the github.token column type to align with the new logic and security improvements. [1] [2]

Let me know if you want a deeper walkthrough of any specific change!

@nttg8100 nttg8100 requested a review from Copilot January 7, 2026 15:37

Copilot AI left a comment

Pull request overview

This PR enhances security by preventing sensitive storage credentials from being exposed through Redis/Celery, improves HPC connection error reporting with a new status type, and adds cleanup of credential files after job completion or failure.

Key Changes:

  • Storage credentials are now regenerated just-in-time in the worker instead of being passed through Redis/Celery
  • Job status SERVER_ERROR replaced with HPC_DISCONNECTED for more accurate HPC connection failure reporting
  • GitHub tokens now use EncryptedTextField for better security

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| backend/app/service_job/controller.py | Validates storage credentials exist but defers generation to worker; passes storage_id and user_email instead of credentials |
| backend/app/service_job/tasks.py | Regenerates temporary credentials in worker; adds credential cleanup after job completion; updates error status to HPC_DISCONNECTED |
| backend/app/service_job/models.py | Replaces SERVER_ERROR with HPC_DISCONNECTED status |
| backend/app/utils/executor/ssh.py | Adds HPC_DISCONNECTED to staging statuses; improves tunnel deletion error handling with process tracking |
| backend/app/service_credential/models/personal.py | Changes Github token field to EncryptedTextField |
| backend/migrations/models/5_20260105205918_update.py | Updates database comments to reflect HPC_DISCONNECTED status |
| backend/migrations/models/6_20260105224743_update.py | Migrates github.token column from VARCHAR(200) to TEXT for encrypted storage |

Comments suppressed due to low confidence (1)

backend/app/service_job/tasks.py:1

  • The task call uses positional arguments which makes it fragile and hard to maintain. If parameter order changes in the task signature, this call will break silently. Consider using keyword arguments: submit_job.apply_async(args=[str(job.id), analysis.allow_access, params], kwargs={'storage_id': str(project.storage.pk), 'user_email': user.email, 'duration': job.time})
from celery import shared_task
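
To illustrate the concern in this comment, here is a small sketch using a plain function rather than a Celery task. The `submit_job` signature below is hypothetical and only mirrors the argument names in the comment; the bare `*` makes the last three parameters keyword-only, so a positional call fails loudly instead of silently binding values to the wrong names.

```python
def submit_job(job_id, allow_access, params, *, storage_id, user_email, duration):
    """Hypothetical stand-in for the task body; not the project's real signature."""
    return {
        "job_id": job_id,
        "allow_access": allow_access,
        "params": params,
        "storage_id": storage_id,
        "user_email": user_email,
        "duration": duration,
    }


def call_with_keywords():
    # Keyword arguments stay correct even if the parameter order changes later.
    return submit_job(
        "job-1",
        True,
        {"threads": 4},
        storage_id="storage-9",
        user_email="user@example.com",
        duration=60,
    )
```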

  id = fields.UUIDField(primary_key=True, default=uuid.uuid4)
  username = fields.CharField(max_length=50, null=False)
- token = fields.CharField(max_length=200, null=False)
+ token = EncryptedTextField(max_length=200, null=False)
Copilot AI Jan 7, 2026

The max_length parameter on an EncryptedTextField may not behave as expected since encrypted data will be larger than the original plaintext. Encrypted text typically expands beyond the original length due to encryption overhead. Consider removing max_length or ensuring the field can accommodate the encrypted size.

Suggested change
- token = EncryptedTextField(max_length=200, null=False)
+ token = EncryptedTextField(null=False)
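
For a sense of scale, the sketch below estimates the stored length under the assumption that the field uses a Fernet-style token (version byte, timestamp, IV, AES-CBC ciphertext with PKCS7 padding, HMAC, all base64url-encoded). Whether EncryptedTextField actually uses this layout is not stated in the PR, so treat the numbers as illustrative only; `fernet_like_token_length` is a hypothetical helper.

```python
import base64


def fernet_like_token_length(plaintext_len: int) -> int:
    """Estimate the encoded length of a Fernet-style token (assumed layout)."""
    # PKCS7 always adds between 1 and 16 bytes to reach a 16-byte boundary.
    padded = 16 * (plaintext_len // 16 + 1)
    # 1 version byte + 8 timestamp bytes + 16 IV bytes + ciphertext + 32 HMAC bytes.
    raw = 1 + 8 + 16 + padded + 32
    # Encode a dummy buffer of the right size to get the base64url length.
    return len(base64.urlsafe_b64encode(bytes(raw)))
```

Under this assumption, a 200-character plaintext would be stored as 356 characters, which is why a max_length tuned to the plaintext is not a meaningful bound on the column.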

- if not temp_secrets:
-     return bad_request({"message": "The ARN failed to create"})
+ if not storage_manager.is_valid():
+     return bad_request({"message": "Storage credentials are not valid"})
Copilot AI Jan 7, 2026

The error message 'Storage credentials are not valid' is vague and doesn't help users understand what action to take. Consider providing a more specific message, such as 'Storage credentials are missing or expired. Please reconfigure storage settings.'

Suggested change
- return bad_request({"message": "Storage credentials are not valid"})
+ return bad_request({"message": "Storage credentials are missing, invalid, or expired. Please review and reconfigure the project's storage settings."})

@nttg8100 nttg8100 merged commit 3882aed into dev Jan 7, 2026
3 checks passed
@nttg8100 nttg8100 self-assigned this Jan 7, 2026
@nttg8100 nttg8100 added the enhancement and bug labels Jan 7, 2026