Skip to content

Comments

feature/improve repo registration#154

Merged
bwalsh merged 42 commits intodevelopmentfrom
feature/improve-repo-registration-2
Jan 2, 2026
Merged

feature/improve repo registration#154
bwalsh merged 42 commits intodevelopmentfrom
feature/improve-repo-registration-2

Conversation

@bwalsh
Copy link
Contributor

@bwalsh bwalsh commented Dec 30, 2025

Pull request overview

See #157

This pull request introduces significant improvements to the repository registration workflow by implementing GitOps-based registration management through pull requests. The changes enable automatic creation of RepoRegistration YAML files in a dedicated GitHub repository when users configure their installations.

Key Changes

  • GitOps Registration Workflow: Implements automatic creation of pull requests in a registrations repository containing declarative YAML configuration for each repository installation
  • Vault Integration for Secrets: Adds HashiCorp Vault integration to securely store and manage S3 bucket credentials separately from the registration configuration
  • Enhanced UI for Secret Management: Implements write-only secret handling in the web UI with clear indicators when secrets have been previously saved, following security best practices documented in a new ADR

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 20 comments.

Show a summary per file
File Description
helm/argo-stack/values.yaml Adds configuration for GitHub repository URL, username, token secret, and Vault settings for gitapp-callback service; changes namespace from security to argocd
helm/argo-stack/templates/eso/externalsecret-github-app.yaml Adds new ExternalSecret resource to fetch GitHub PAT token from Vault for PR creation functionality
helm/argo-stack/templates/32-gitapp-callback.yaml Updates deployment with init containers for secret waiting and git credential setup; adds new environment variables and volume mounts for GitHub App authentication and git operations
helm/argo-stack/overlays/ingress-authz-overlay/values.yaml Updates namespace references from security to argocd for gitapp-callback service routing
gitapp-callback/app.py Implements core registration PR workflow with git operations, GitHub API integration via PyJWT and requests, and Vault client for bucket credential storage
gitapp-callback/templates/registration_form.html Enhances UI to hide previously-saved bucket credentials with timestamp indicators and re-entry toggle buttons
gitapp-callback/templates/success.html Updates navigation links to include installation_id parameter and documentation link
gitapp-callback/templates/error.html Updates navigation links consistent with success.html changes
gitapp-callback/tests/test_app.py Removes test assertions for dataBucket fields that are no longer returned from the UI
gitapp-callback/requirements.txt Adds PyJWT, cryptography, requests, and hvac dependencies for GitHub App authentication and Vault integration
gitapp-callback/Dockerfile Upgrades base image to Python 3.12 and adds git package for repository operations
gitapp-callback/.python-version Specifies Python 3.12 as the project version
gitapp-callback/.gitignore Adds venv/ directory to ignore list
gitapp-callback/.dockerignore Adds venv/ directory to Docker ignore list
gitapp-callback/README.md Adds reference to GitHub Apps documentation
docs/adr/0006-ux-secrets-mgt-on-registration.md New ADR documenting best practices for write-only secret handling in UI with sequence diagrams and security rationale
Makefile Adds check for GITHUBHAPP_CALLBACK_FLASK_KEY environment variable and helm install parameters for new gitapp-callback configuration

@calypr calypr deleted a comment from Copilot AI Dec 30, 2025
Copilot AI review requested due to automatic review settings January 1, 2026 17:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces significant improvements to the repository registration workflow by implementing GitOps-based registration management through pull requests. The changes enable automatic creation of RepoRegistration YAML files in a dedicated GitHub repository when users configure their installations.

Key Changes

  • GitOps Registration Workflow: Implements automatic creation of pull requests in a registrations repository containing declarative YAML configuration for each repository installation
  • Vault Integration for Secrets: Adds HashiCorp Vault integration to securely store and manage S3 bucket credentials separately from the registration configuration
  • Enhanced UI for Secret Management: Implements write-only secret handling in the web UI with clear indicators when secrets have been previously saved, following security best practices documented in a new ADR

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 20 comments.

Show a summary per file
File Description
helm/argo-stack/values.yaml Adds configuration for GitHub repository URL, username, token secret, and Vault settings for gitapp-callback service; changes namespace from security to argocd
helm/argo-stack/templates/eso/externalsecret-github-app.yaml Adds new ExternalSecret resource to fetch GitHub PAT token from Vault for PR creation functionality
helm/argo-stack/templates/32-gitapp-callback.yaml Updates deployment with init containers for secret waiting and git credential setup; adds new environment variables and volume mounts for GitHub App authentication and git operations
helm/argo-stack/overlays/ingress-authz-overlay/values.yaml Updates namespace references from security to argocd for gitapp-callback service routing
gitapp-callback/app.py Implements core registration PR workflow with git operations, GitHub API integration via PyJWT and requests, and Vault client for bucket credential storage
gitapp-callback/templates/registration_form.html Enhances UI to hide previously-saved bucket credentials with timestamp indicators and re-entry toggle buttons
gitapp-callback/templates/success.html Updates navigation links to include installation_id parameter and documentation link
gitapp-callback/templates/error.html Updates navigation links consistent with success.html changes
gitapp-callback/tests/test_app.py Removes test assertions for dataBucket fields that are no longer returned from the UI
gitapp-callback/requirements.txt Adds PyJWT, cryptography, requests, and hvac dependencies for GitHub App authentication and Vault integration
gitapp-callback/Dockerfile Upgrades base image to Python 3.12 and adds git package for repository operations
gitapp-callback/.python-version Specifies Python 3.12 as the project version
gitapp-callback/.gitignore Adds venv/ directory to ignore list
gitapp-callback/.dockerignore Adds venv/ directory to Docker ignore list
gitapp-callback/README.md Adds reference to GitHub Apps documentation
docs/adr/0006-ux-secrets-mgt-on-registration.md New ADR documenting best practices for write-only secret handling in UI with sequence diagrams and security rationale
Makefile Adds check for GITHUBHAPP_CALLBACK_FLASK_KEY environment variable and helm install parameters for new gitapp-callback configuration
Comments suppressed due to low confidence (1)

gitapp-callback/app.py:409

  • Missing type hint for return type. According to Python coding guidelines, all public functions should have type hints. Add return type annotation '-> None' to this function.
def init_db():
    """
    Initialize the SQLite database.
    
    Creates the registrations table if it doesn't exist.
    Table schema:
        - installation_id: TEXT PRIMARY KEY
        - data: TEXT (JSON serialized RepoRegistration)
        - created_at: TIMESTAMP
        - updated_at: TIMESTAMP
    """

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 90 to 92
return "FAILED"

return result.stdout.strip()
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unreachable code detected. The function returns "FAILED" inside the except block at line 90, but then has another return statement at line 92 that will never be executed. Remove the unreachable return statement at line 92.

Suggested change
return "FAILED"
return result.stdout.strip()
return "FAILED"

Copilot uses AI. Check for mistakes.
Comment on lines +54 to +61
set -e
if [ -z "$GIT_USERNAME" ] || [ -z "$GIT_PAT" ]; then
echo "GIT_USERNAME or GIT_PAT not set"
exit 1
fi
mkdir -p /gitcreds
CRED_STRING="https://${GIT_USERNAME}:${GIT_PAT}@github.com"
echo "${CRED_STRING}" > /gitcreds/.git-credentials
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security issue: Git credentials are written to a plain text file in the init container. The credentials file at /gitcreds/.git-credentials will contain the GitHub PAT in plain text, which is then mounted to the main container. This exposes the token unnecessarily. Consider using git credential helpers that can fetch tokens from environment variables at runtime instead of storing them in files.

Suggested change
set -e
if [ -z "$GIT_USERNAME" ] || [ -z "$GIT_PAT" ]; then
echo "GIT_USERNAME or GIT_PAT not set"
exit 1
fi
mkdir -p /gitcreds
CRED_STRING="https://${GIT_USERNAME}:${GIT_PAT}@github.com"
echo "${CRED_STRING}" > /gitcreds/.git-credentials
set -euo pipefail
if [ -z "${GIT_USERNAME:-}" ] || [ -z "${GIT_PAT:-}" ]; then
echo "GIT_USERNAME or GIT_PAT not set" >&2
exit 1
fi
mkdir -p /gitcreds
cat > /gitcreds/git-credential-env << 'EOF'
#!/bin/sh
# Git credential helper that sources credentials from environment variables.
# Expects GIT_USERNAME and GIT_PAT to be set in the environment of the git process.
action="$1"
case "${action}" in
get)
if [ -n "${GIT_USERNAME:-}" ] && [ -n "${GIT_PAT:-}" ]; then
printf 'protocol=https\n'
printf 'host=github.com\n'
printf 'username=%s\n' "${GIT_USERNAME}"
printf 'password=%s\n' "${GIT_PAT}"
fi
;;
store|erase)
# Do not persist or erase credentials; they are environment-based only.
;;
*)
# Unsupported action; exit successfully without output.
;;
esac
EOF
chmod 0700 /gitcreds
chmod 0500 /gitcreds/git-credential-env
# Optionally configure git to use this helper; callers may also
# choose to invoke it explicitly via git -c credential.helper.
git config --global credential.helper "/gitcreds/git-credential-env"

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot
create an issue

Args:
args: List of git arguments (without the leading 'git')
repo_dir: Working directory to run git in
use_auth: Whether to attach the PAT to git HTTP requests
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring mentions a parameter 'use_auth' that does not exist in the function signature. Remove this from the docstring since the function doesn't accept this parameter.

Suggested change
use_auth: Whether to attach the PAT to git HTTP requests

Copilot uses AI. Check for mistakes.
return result.stdout.strip()
except subprocess.CalledProcessError as e:
logging.error(
"Git command failed: %s\nReturn code: %s\nStdout: %s\nStderr: %s\ncwd: %s",
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging format string is incorrectly indented. The opening quote on line 82 should align with the rest of the logging.error call parameters. This creates a string formatting issue where the format string arguments don't match the placeholders.

Suggested change
"Git command failed: %s\nReturn code: %s\nStdout: %s\nStderr: %s\ncwd: %s",
"Git command failed: %s\nReturn code: %s\nStdout: %s\nStderr: %s\ncwd: %s",

Copilot uses AI. Check for mistakes.
# Default to /tmp in development/test, /var/registrations in production
DEFAULT_DB_PATH = "/tmp/registrations.sqlite" if os.environ.get("FLASK_ENV") == "development" or os.environ.get("TESTING") else "/var/registrations/registrations.sqlite"
DB_PATH = os.environ.get("DB_PATH", DEFAULT_DB_PATH)
# get current directory
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 55 uses inconsistent capitalization and punctuation. It should read "# Get current directory" to match Python comment conventions with a capital letter at the start.

Suggested change
# get current directory
# Get current directory

Copilot uses AI. Check for mistakes.
return response.json()['token']


def get_installation_repositories(installation_id: str) -> List[Dict[str, any]]:
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using lowercase 'any' instead of proper typing.Any. In type hints, use 'typing.Any' or import 'Any' from typing. The lowercase 'any' is not a valid type annotation.

Copilot uses AI. Check for mistakes.
import jwt
import time
import requests
from typing import List, Dict, Optional, Tuple
Copy link

Copilot AI Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Optional' is not used.

Suggested change
from typing import List, Dict, Optional, Tuple
from typing import List, Dict, Tuple

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 20 changed files in this pull request and generated 13 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +100 to +101
return "FAILED"
return result.stdout.strip()
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The git command execution has unreachable code. After the try-except block returns "FAILED" on line 100, the return statement on line 101 will never be executed. Remove the duplicate return statement.

Suggested change
return "FAILED"
return result.stdout.strip()
return "FAILED"

Copilot uses AI. Check for mistakes.
# Configure author name and email
# TODO make name and email configurable
run_git_command(["config", "--global", "user.name", GITHUB_APP_NAME or "GitHub App"], REPO_ROOT)
run_git_command(["config", "--global", "user.email", f"calyr@example.com"], REPO_ROOT)
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded email address in git configuration. The email "calyr@example.com" on line 210 appears to be a placeholder. This should be made configurable via an environment variable to allow proper git commit attribution in different environments.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +100
def run_git_command(args: List[str], repo_dir: Path) -> str:
"""
Run a git command and return stdout.

Args:
args: List of git arguments (without the leading 'git')
repo_dir: Working directory to run git in
use_auth: Whether to attach the PAT to git HTTP requests
"""
cmd = ["git"]
cmd.extend(args)
try:
logging.info(" ".join(cmd))
result = subprocess.run(
cmd,
cwd=repo_dir,
check=True,
capture_output=True,
text=True,
)
return result.stdout.strip()
except subprocess.CalledProcessError as e:
logging.error(
"Git command failed: %s\nReturn code: %s\nStdout: %s\nStderr: %s\ncwd: %s",
" ".join(cmd),
e.returncode,
e.stdout,
e.stderr,
repo_dir,
)
return "FAILED"
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling in the git command runner swallows the exception and returns "FAILED" as a string. This makes error detection fragile since callers must check for the magic string "FAILED" instead of using proper exception handling. Consider raising the CalledProcessError or a custom exception instead of returning a string sentinel value.

Copilot uses AI. Check for mistakes.

# Git Authentication
GIT_CREDENTIALS_FILE = os.environ.get("GIT_CREDENTIALS_FILE") # Path to git credentials file (e.g., /root/.git-credentials) for git credential.helper

Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation for required environment variables. The code uses VAULT_ADDR, VAULT_TOKEN, GITHUB_PAT, REPO_REGISTRATION, etc. but only checks them at usage time. Consider validating all required environment variables at startup and failing fast with clear error messages rather than encountering errors during request handling.

Suggested change
# Validate required environment variables at startup to fail fast on misconfiguration
_REQUIRED_ENV_VARS = {
"GITHUB_APP_NAME": GITHUB_APP_NAME,
"GITHUB_APP_ID": GITHUB_APP_ID,
"GITHUB_PRIVATE_KEY_PATH": GITHUB_PRIVATE_KEY_PATH,
"GITHUB_PAT": GITHUB_PAT,
"REPO_REGISTRATION": REPO_REGISTRATION,
"VAULT_ADDR": VAULT_ADDR,
"VAULT_TOKEN": VAULT_TOKEN,
"GIT_CREDENTIALS_FILE": GIT_CREDENTIALS_FILE,
}
_missing_env_vars = [name for name, value in _REQUIRED_ENV_VARS.items() if not value]
if _missing_env_vars:
logger.critical(
"Missing required environment variables: %s",
", ".join(sorted(_missing_env_vars)),
)
raise RuntimeError(
"Missing required environment variables: "
+ ", ".join(sorted(_missing_env_vars))
)

Copilot uses AI. Check for mistakes.
Comment on lines +37 to +42
echo "Waiting for secret {{ .Values.githubStatusProxy.privateKeySecret.name }} to be available..."
until [ -f /etc/github/private-key.pem ]; do
echo "Secret not found, waiting..."
sleep 2
done
echo "Secret found, proceeding with startup"
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The init container checks for the wrong secret name. Line 37 waits for 'githubStatusProxy.privateKeySecret.name' but this is the gitapp-callback deployment which should wait for its own secrets, not the githubStatusProxy secrets. This could cause the init container to wait indefinitely or fail to detect missing secrets.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot
Create an issue

Comment on lines +374 to +388
try:
reviewers_response = requests.post(
reviewers_url,
headers=headers,
json={"reviewers": ["Copilot"]},
)
reviewers_response.raise_for_status()
logger.info(f"Successfully added co-pilot as reviewer to PR #{pr_number}")
except requests.HTTPError as e:
# Log warning but don't fail the entire operation
logger.warning(reviewers_url)
logger.warning(
f"Failed to add reviewer to PR #{pr_number}: {e.response.status_code} - {e.response.text}"
)
raise
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded reviewer "Copilot" on line 378 may not exist in all repositories. This will cause the PR creation to fail with the current error handling that re-raises the exception. Consider making the reviewer configurable or handling the 404/422 error gracefully without failing the entire registration operation.

Copilot uses AI. Check for mistakes.
Comment on lines +243 to +250
try:
ref = run_git_command(
["symbolic-ref", "refs/remotes/origin/HEAD"],
repo_dir,
)
return ref.split("/")[-1]
except subprocess.CalledProcessError:
return "main"
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subprocess.CalledProcessError is caught but a bare 'except Exception' is used on line 249 for the fallback case. This will catch the subprocess.CalledProcessError that was already handled, making the second handler unreachable. The exception handling logic appears incorrect.

Copilot uses AI. Check for mistakes.
Comment on lines +732 to +733
client = hvac.Client(url=VAULT_ADDR, token=VAULT_TOKEN)

Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Vault client is created without SSL/TLS verification. While the VAULT_ADDR uses HTTP in the values file, production deployments should use HTTPS with proper certificate validation. Consider adding a configuration option for TLS verification and warning when running with verify=False.

Suggested change
client = hvac.Client(url=VAULT_ADDR, token=VAULT_TOKEN)
vault_addr = VAULT_ADDR
vault_token = VAULT_TOKEN
vault_tls_verify_env = os.environ.get("VAULT_TLS_VERIFY")
if vault_tls_verify_env is None:
parsed_addr = urlparse(vault_addr) if vault_addr else None
vault_verify = parsed_addr.scheme == "https" if parsed_addr else True
else:
vault_verify = vault_tls_verify_env.lower() not in ("false", "0", "no")
if not vault_verify:
logger.warning(
"Vault TLS verification is disabled (VAULT_TLS_VERIFY=%r, VAULT_ADDR=%s). "
"This configuration should not be used in production.",
vault_tls_verify_env,
vault_addr,
)
client = hvac.Client(url=vault_addr, token=vault_token, verify=vault_verify)

Copilot uses AI. Check for mistakes.
Comment on lines +93 to +98
"Git command failed: %s\nReturn code: %s\nStdout: %s\nStderr: %s\ncwd: %s",
" ".join(cmd),
e.returncode,
e.stdout,
e.stderr,
repo_dir,
Copy link

Copilot AI Jan 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incomplete error message formatting. The log message on line 93 uses old-style string formatting with %s placeholders but the actual formatting arguments are on lines 94-98. This creates a multi-line logging call that could be improved for readability. Consider using f-strings or ensuring the formatting call is on a single logical line.

Suggested change
"Git command failed: %s\nReturn code: %s\nStdout: %s\nStderr: %s\ncwd: %s",
" ".join(cmd),
e.returncode,
e.stdout,
e.stderr,
repo_dir,
"Git command failed:\n"
f" Command: {' '.join(cmd)}\n"
f" Return code: {e.returncode}\n"
f" Stdout: {e.stdout}\n"
f" Stderr: {e.stderr}\n"
f" cwd: {repo_dir}"

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI commented Jan 2, 2026

@bwalsh I've opened a new pull request, #158, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Jan 2, 2026

@bwalsh I've opened a new pull request, #159, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Jan 2, 2026

@bwalsh I've opened a new pull request, #160, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Jan 2, 2026

@bwalsh I've opened a new pull request, #161, to work on those changes. Once the pull request is ready, I'll request review from you.

@bwalsh bwalsh merged commit 4372fa6 into development Jan 2, 2026
1 of 3 checks passed
@bwalsh bwalsh deleted the feature/improve-repo-registration-2 branch January 2, 2026 19:54
@bwalsh bwalsh restored the feature/improve-repo-registration-2 branch January 5, 2026 21:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants