@junaway (Contributor) commented Dec 20, 2025

No description provided.

Copilot AI review requested due to automatic review settings December 20, 2025 00:10

vercel bot commented Dec 20, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Review | Updated (UTC) |
| --- | --- | --- | --- |
| agenta-documentation | Ready | Preview, Comment | Jan 2, 2026 10:22am |

Copilot AI (Contributor) left a comment

Pull request overview

This PR implements and tests Daytona-based code evaluation functionality, transitioning from the legacy local sandbox to a new SDK-based approach. It includes improvements to code editor indentation handling for Python/code blocks and adds example evaluators for testing various dependencies and API endpoints.

Key Changes

  • Replaced legacy custom_code_run with new sdk_custom_code_run that uses the SDK's workflow-based evaluator system
  • Enhanced code editor to preserve exact indentation for Python/code (no transformations) while maintaining space-to-tab conversion for JSON/YAML
  • Added example evaluators for testing OpenAI, NumPy, and Agenta API endpoints in Daytona environments

Reviewed changes

Copilot reviewed 20 out of 25 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| api/oss/src/services/evaluators_service.py | Implements new SDK-based custom code runner function that delegates to workflow system |
| api/oss/src/resources/evaluators/evaluators.py | Updates default code template with deprecation note for app_params |
| sdk/agenta/sdk/workflows/runners/daytona.py | Adds environment variables (OPENAI_API_KEY, AGENTA_HOST, AGENTA_CREDENTIALS) to sandbox |
| sdk/agenta/sdk/workflows/runners/local.py | Exposes built-in Python types (dict, list, str, etc.) to restricted environment |
| sdk/agenta/sdk/decorators/running.py | Adds fallback to request.credentials in credential resolution chain |
| web/oss/src/components/Editor/plugins/code/utils/pasteUtils.ts | Preserves exact indentation for Python/code, converts spaces to tabs for JSON/YAML |
| web/oss/src/components/Editor/plugins/code/plugins/IndentationPlugin.tsx | Uses 4 spaces for Python/code tab insertion, 2 spaces for JSON/YAML |
| web/oss/src/components/Editor/plugins/code/plugins/AutoFormatAndValidateOnPastePlugin.tsx | Skips indentation transformation for Python/code, maintains it for JSON/YAML |
| examples/python/evaluators/openai/*.py | Adds OpenAI SDK evaluators for testing API availability and exact match comparisons |
| examples/python/evaluators/numpy/*.py | Adds NumPy evaluators for testing library availability and character counting |
| examples/python/evaluators/basic/*.py | Adds basic evaluators using Python stdlib for string matching, length checks, JSON validation |
| examples/python/evaluators/ag/*.py | Adds Agenta API endpoint evaluators for health, secrets, and config endpoints |
| examples/python/evaluators/*.md | Provides comprehensive documentation (README, QUICKSTART, SUMMARY) for evaluators |

Copilot AI review requested due to automatic review settings December 23, 2025 11:39
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 32 out of 37 changed files in this pull request and generated 8 comments.


Add standard provider keys from vault as env vars
Add templates
Fix credentials (and thus secrets and traces) in evaluator playground
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 182 out of 299 changed files in this pull request and generated 3 comments.


runtime = runtime or "python"

# Select general snapshot
snapshot_id = os.getenv("DAYTONA_SNAPSHOT")
Copilot AI Dec 25, 2025

The environment variable name changed from AGENTA_SERVICES_SANDBOX_SNAPSHOT_PYTHON to DAYTONA_SNAPSHOT, but this is inconsistent with the naming pattern used elsewhere (e.g., AGENTA_HOST, AGENTA_API_URL). Consider using AGENTA_DAYTONA_SNAPSHOT or documenting why the AGENTA_ prefix was dropped for this variable.

Copilot uses AI. Check for mistakes.
Comment on lines +111 to +115

def _run_file(daytona: Daytona, runtime: str, path: Path) -> None:
    code = path.read_text(encoding="utf-8")
    wrapped = _wrap_python(code) if runtime == "python" else _wrap_js(code)

Copilot AI Dec 25, 2025

The sandbox creation doesn't specify a snapshot ID, but the _create_sandbox method in daytona.py requires DAYTONA_SNAPSHOT to be set. This will fail if the environment variable is not configured. Consider adding explicit snapshot configuration or error handling.

Suggested change
-def _run_file(daytona: Daytona, runtime: str, path: Path) -> None:
-    code = path.read_text(encoding="utf-8")
-    wrapped = _wrap_python(code) if runtime == "python" else _wrap_js(code)
+def _require_daytona_snapshot() -> str:
+    """Ensure that DAYTONA_SNAPSHOT is configured before creating sandboxes."""
+    snapshot = os.getenv("DAYTONA_SNAPSHOT")
+    if not snapshot:
+        raise RuntimeError(
+            "DAYTONA_SNAPSHOT is required to create Daytona sandboxes. "
+            "Please set the environment variable to a valid snapshot ID."
+        )
+    return snapshot
+def _run_file(daytona: Daytona, runtime: str, path: Path) -> None:
+    code = path.read_text(encoding="utf-8")
+    wrapped = _wrap_python(code) if runtime == "python" else _wrap_js(code)
+    # Validate that the required snapshot configuration is present before creating a sandbox.
+    _require_daytona_snapshot()

Comment on lines +103 to +109
tracing_ctx = TracingContext.get()
tracing_ctx.credentials = credentials

with running_context_manager(RunningContext.get()):
running_ctx = RunningContext.get()
running_ctx.credentials = f"Secret {secret_token}"
ctx = RunningContext.get()
ctx.credentials = credentials

with tracing_context_manager(tracing_ctx):
Copilot AI Dec 25, 2025

The context objects are retrieved and modified before being passed to context managers. This pattern could lead to issues if the contexts are modified elsewhere between get() and the context manager entry. Consider retrieving fresh contexts inside the managers or ensuring contexts are isolated.

Comment on lines 403 to 405
const response = await axios.post(
`${getAgentaApiUrl()}/testsets/revisions/${revisionId}/archive?project_id=${projectId}`,
)

Check failure

Code scanning / CodeQL

Server-side request forgery Critical test

The URL of this request depends on a user-provided value.

Copilot Autofix

AI 8 days ago

General fix: Ensure that the user-controlled revisionId is validated/normalized on the client before being interpolated into the URL path. Reject values that are not in an expected safe format (e.g., a UUID or a restricted ID pattern), and avoid letting path traversal sequences or reserved URL meta-characters be passed through. If invalid, throw or refuse to make the request.

Best concrete fix in this code: In web/oss/src/services/testsets/api/index.ts, in archiveTestsetRevision, validate revisionId before constructing the URL. A minimal and safe approach is:

  • Introduce a small local validator (e.g., isSafeRevisionId) in this file that enforces a strict pattern (e.g., only letters, digits, hyphen, underscore, and limited length).
  • Call this validator at the top of archiveTestsetRevision. If the ID is invalid, throw an error instead of making the HTTP request.
  • Use encodeURIComponent when interpolating revisionId into the URL path, to prevent any unexpected interpretation of characters.

This keeps the current API shape and behavior for valid IDs, while making it impossible for a malicious query parameter to inject dangerous characters or path segments into the URL used by axios.post. No changes are necessary to the calling code in useTestcaseActions other than benefiting from the safer implementation.

Concretely:

  • In web/oss/src/services/testsets/api/index.ts, add a small helper function isSafeRevisionId near the archiveTestsetRevision function.
  • In archiveTestsetRevision, before using revisionId, check if (!isSafeRevisionId(revisionId)) throw new Error("Invalid revision ID").
  • When building the URL, wrap revisionId with encodeURIComponent(revisionId).

No imports are needed; we only use built-in RegExp and encodeURIComponent.


Suggested changeset 1
web/oss/src/services/testsets/api/index.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/web/oss/src/services/testsets/api/index.ts b/web/oss/src/services/testsets/api/index.ts
--- a/web/oss/src/services/testsets/api/index.ts
+++ b/web/oss/src/services/testsets/api/index.ts
@@ -397,11 +397,23 @@
  * @param revisionId - The ID of the revision to archive
  * @returns The archived revision data
  */
+function isSafeRevisionId(revisionId: string): boolean {
+    // Allow only typical ID characters; adjust pattern if backend uses a stricter format (e.g., UUID)
+    // This prevents path traversal and other special characters from being used in the URL path segment.
+    return /^[A-Za-z0-9_-]{1,128}$/.test(revisionId)
+}
+
 export async function archiveTestsetRevision(revisionId: string) {
+    if (!isSafeRevisionId(revisionId)) {
+        throw new Error("Invalid revision ID")
+    }
+
     const {projectId} = getProjectValues()
 
+    const safeRevisionId = encodeURIComponent(revisionId)
+
     const response = await axios.post(
-        `${getAgentaApiUrl()}/testsets/revisions/${revisionId}/archive?project_id=${projectId}`,
+        `${getAgentaApiUrl()}/testsets/revisions/${safeRevisionId}/archive?project_id=${projectId}`,
     )
 
     return response.data
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
@junaway force-pushed the chore/check-daytona-code-evaluator branch from 1562c7d to 59a6e6b on December 25, 2025 18:06
Copilot AI review requested due to automatic review settings December 25, 2025 18:14
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 183 out of 299 changed files in this pull request and generated 6 comments.


from openai import AsyncOpenAI

# COMMENTED OUT: autoevals dependency removed
# from autoevals.ragas import Faithfulness, ContextRelevancy
Copilot AI Dec 25, 2025

Corrected spelling of 'Relevancy' to 'Relevance' in the comment.

Suggested change
-# from autoevals.ragas import Faithfulness, ContextRelevancy
+# from autoevals.ragas import Faithfulness, ContextRelevancy  # Commented out due to autoevals removal, corrected spelling of 'Relevance'

Comment on lines +108 to +109
// Get the actual language from the CodeBlock node, or default to "code"
const language = $isCodeBlockNode(parentBlock) ? parentBlock.getLanguage() : "code"
Copilot AI Dec 25, 2025

The fallback to 'code' when parentBlock is not a CodeBlockNode may mask errors. Consider logging a warning or throwing an error if the parent is unexpectedly not a CodeBlockNode, as this likely indicates a programming error.

Suggested change
-// Get the actual language from the CodeBlock node, or default to "code"
-const language = $isCodeBlockNode(parentBlock) ? parentBlock.getLanguage() : "code"
+// Get the actual language from the CodeBlock node, or default to "code".
+// If parentBlock is not a CodeBlockNode, log a warning as this likely indicates
+// a structural/editor bug, but still fall back to "code" to preserve behavior.
+let language: string
+if ($isCodeBlockNode(parentBlock)) {
+    language = parentBlock.getLanguage()
+} else {
+    log("Paste: Expected parentBlock to be a CodeBlockNode", {
+        selection,
+        anchorNode,
+        currentLine,
+        parentBlock,
+    })
+    language = "code"
+}

Comment on lines +38 to +41
# Local runner only supports Python
if runtime != "python":
    raise ValueError(
        f"LocalRunner only supports 'python' runtime, got: {runtime}"
Copilot AI Dec 25, 2025

Removing RestrictedPython eliminates sandboxing protections. The local runner now executes arbitrary Python code without restrictions. This is a significant security regression if untrusted code can be executed. Ensure that the local runner is only used in trusted development environments and that production deployments use the Daytona runner.

agenta_credentials = (
    RunningContext.get().credentials
    #
    or ""
Copilot AI Dec 25, 2025

The slice agenta_credentials[7:] hard-codes the length of the 'ApiKey ' prefix (six letters plus one space, 7 characters), so it is correct as written. If the prefix format ever changes (e.g., 'ApiKey' followed by two spaces), the slice and the startswith check can drift apart and fail silently. Consider using agenta_credentials.removeprefix('ApiKey ') for robustness.

Suggested change
-    or ""
+    agenta_credentials.removeprefix("ApiKey ")

// Insert spaces instead of tab character
// Use 4 spaces for Python/code (PEP 8 standard)
// Use 2 spaces for JSON/YAML (typical formatting)
const spaces = language === "json" || language === "yaml" ? "  " : "    "
Copilot AI Dec 25, 2025

Consider extracting these magic numbers (2 spaces for JSON/YAML, 4 spaces for code/Python/JavaScript/TypeScript) into named constants at the module level. This would make the indentation standards more visible and easier to modify consistently across the codebase.

for runtime, folder in BASIC_DIRS.items():
    if not folder.exists():
        continue
    pattern = "*.py" if runtime == "python" else "*.js" if runtime == "javascript" else "*.ts"
Copilot AI Dec 25, 2025

This nested ternary expression is difficult to read. Consider using a dictionary mapping or if-elif-else structure for better clarity.

Suggested change
-pattern = "*.py" if runtime == "python" else "*.js" if runtime == "javascript" else "*.ts"
+if runtime == "python":
+    pattern = "*.py"
+elif runtime == "javascript":
+    pattern = "*.js"
+else:
+    pattern = "*.ts"
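
The dictionary-mapping alternative mentioned in the comment could look like the sketch below. `GLOB_BY_RUNTIME` and `pattern_for` are hypothetical names; the runtime keys mirror the snippet under review. One behavioral difference is assumed here on purpose: the original expression falls through to "*.ts" for any unrecognized runtime, while this version raises.

```python
# Hypothetical mapping; the runtime names mirror the snippet under review.
GLOB_BY_RUNTIME = {
    "python": "*.py",
    "javascript": "*.js",
    "typescript": "*.ts",
}


def pattern_for(runtime: str) -> str:
    """Return the glob pattern for a runtime, rejecting unknown names."""
    try:
        return GLOB_BY_RUNTIME[runtime]
    except KeyError:
        # Raising is an assumption; the reviewed code defaults to "*.ts" instead.
        raise ValueError(f"Unsupported runtime: {runtime!r}") from None
```

If the silent "*.ts" default is intended, `GLOB_BY_RUNTIME.get(runtime, "*.ts")` keeps the original behavior in one line.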

@junaway changed the title from "[feat] Add daytona code evaluators" to "[feat] Add DaytonaRunner for code evaluators" Dec 25, 2025
Copilot AI review requested due to automatic review settings December 26, 2025 16:20
Copilot AI (Contributor) left a comment

Copilot reviewed 168 out of 310 changed files in this pull request and generated 2 comments.


} from "@/oss/state/testsetSelection"

/**
* Testset Queries - Clean atom-based data fetching
Copilot AI Dec 26, 2025

Corrected spelling of 'recieve' to 'receive' in comment.

}
}, [parsed.fullValue])

const isPdf = mimeType === "application/pdf"
Copilot AI Dec 26, 2025

The variable isPdf is declared but never used in the component. Consider removing it or using it in the conditional rendering logic if PDF-specific behavior is intended.

@junaway marked this pull request as draft December 29, 2025 10:27
@junaway changed the base branch from main to frontend-feat/new-testsets-integration December 29, 2025 10:27
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 48 out of 55 changed files in this pull request and generated 20 comments.


Comment on lines 20 to +21
 """
-Execute provided Python code safely using RestrictedPython.
+Execute provided Python code directly.
Copilot AI Jan 2, 2026

The LocalRunner now executes code directly without any sandboxing or restrictions, but the docstring still references "safe execution". The comment on line 8 says "Local code runner using direct Python execution" which is accurate, but the run method docstring should be updated to reflect that this is NOT safe execution and should only be used in trusted environments.

Comment on lines +36 to +38
 Execute the provided code safely.
-Uses the configured runner (local RestrictedPython or remote Daytona)
+Uses the configured runner (local or remote Daytona)
Copilot AI Jan 2, 2026

The docstring for execute_code_safely still says the function executes code "safely", but with the LocalRunner now using direct exec() without restrictions, this is misleading. The function name and docstring should be updated to reflect that safety depends on the runner implementation, and LocalRunner is not actually safe.

Comment on lines +248 to +251
// NO transformation for Python/code - keep indent exactly as-is
// Just add the indent as a plain text node (preserves spaces AND tabs)
if (indent.length > 0) {
codeLine.append($createCodeHighlightNode(indent, "plain", false, null))
Copilot AI Jan 2, 2026

The comment on line 248 says "NO transformation for Python/code - keep indent exactly as-is" which is accurate, but then the comment on line 249 says "Just add the indent as a plain text node (preserves spaces AND tabs)". These could be combined into a single, clearer comment explaining that for Python/JS/TS, indentation is preserved exactly as pasted (both spaces and tabs) by inserting it as a plain text node.

runtime = runtime or "python"

# Select general snapshot
snapshot_id = os.getenv("DAYTONA_SNAPSHOT")
Copilot AI Jan 2, 2026

The environment variable name has changed from AGENTA_SERVICES_SANDBOX_SNAPSHOT_PYTHON to DAYTONA_SNAPSHOT. This appears to be a breaking change that could affect existing deployments. Consider either maintaining backward compatibility by checking both variable names, or documenting this breaking change clearly in migration notes.
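
A minimal sketch of the backward-compatible lookup the comment describes, assuming the legacy variable should still be honored with a deprecation warning. `resolve_snapshot_id` and the `env` parameter are hypothetical; only the two variable names come from the PR.

```python
import os
import warnings
from typing import Mapping, Optional

NEW_VAR = "DAYTONA_SNAPSHOT"
LEGACY_VAR = "AGENTA_SERVICES_SANDBOX_SNAPSHOT_PYTHON"


def resolve_snapshot_id(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Prefer the new variable; fall back to the legacy one with a warning."""
    env = os.environ if env is None else env
    snapshot_id = env.get(NEW_VAR)
    if snapshot_id:
        return snapshot_id
    legacy = env.get(LEGACY_VAR)
    if legacy:
        # Hypothetical policy: keep old deployments working, but say so loudly.
        warnings.warn(
            f"{LEGACY_VAR} is deprecated; set {NEW_VAR} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return legacy
    return None
```

Passing the environment as a mapping keeps the helper testable without touching `os.environ`.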

if response_error:
    log.error(f"Sandbox execution error: {response_error}")
    raise RuntimeError(f"Sandbox execution failed: {response_error}")
if response_exit_code and response_exit_code != 0:
Copilot AI Jan 2, 2026

The code checks if response_exit_code is truthy before checking if it's non-zero. However, if exit_code is 0 (success), the expression response_exit_code and response_exit_code != 0 would be False (correct). But if exit_code is None (when the attribute doesn't exist), this would also be False, potentially masking errors. Consider explicitly checking if response_exit_code is not None and response_exit_code != 0 to distinguish between "no exit code" and "exit code is 0".
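
Inside an `if`, both `response_exit_code and response_exit_code != 0` and the explicit `is not None` form are falsy when the exit code is `None`, so telling "no exit code" apart from "exit code 0" needs its own branch. A minimal sketch with a hypothetical helper, `classify_exit_code`; how the "unknown" case should be handled (log, raise, ignore) is not stated in the PR.

```python
from typing import Optional


def classify_exit_code(exit_code: Optional[int]) -> str:
    """Map an exit code to 'ok', 'failed', or 'unknown' (no code reported)."""
    if exit_code is None:
        # The sandbox response carried no exit code at all.
        return "unknown"
    return "ok" if exit_code == 0 else "failed"
```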

Comment on lines +309 to +332
"code": "from typing import Dict, Union, Any\n\n\ndef evaluate(\n app_params: Dict[str, str], # deprecated; currently receives {}\n inputs: Dict[str, str],\n output: Union[str, Dict[str, Any]],\n correct_answer: str,\n) -> float:\n if output == correct_answer:\n return 1.0\n return 0.0\n",
},
"description": "Exact match evaluator implemented in Python.",
},
{
"key": "javascript_default",
"name": "Exact Match (JavaScript)",
"values": {
"requires_llm_api_keys": False,
"runtime": "javascript",
"correct_answer_key": "correct_answer",
"code": 'function evaluate(appParams, inputs, output, correctAnswer) {\n void appParams\n void inputs\n\n const outputStr =\n typeof output === "string" ? output : JSON.stringify(output)\n\n return outputStr === String(correctAnswer) ? 1.0 : 0.0\n}\n',
},
"description": "Exact match evaluator implemented in JavaScript.",
},
{
"key": "typescript_default",
"name": "Exact Match (TypeScript)",
"values": {
"requires_llm_api_keys": False,
"runtime": "typescript",
"correct_answer_key": "correct_answer",
"code": 'type OutputValue = string | Record<string, unknown>\n\nfunction evaluate(\n app_params: Record<string, string>,\n inputs: Record<string, string>,\n output: OutputValue,\n correct_answer: string\n): number {\n void app_params\n void inputs\n\n const outputStr =\n (typeof output === "string" ? output : JSON.stringify(output)) as string\n\n return outputStr === String(correct_answer) ? 1.0 : 0.0\n}\n',
},
Copilot AI Jan 2, 2026

The preset code values are stored as long single-line strings with embedded newlines (\n). This makes the code difficult to read and maintain in the resource file. Consider using multiline strings or loading these presets from separate files to improve readability and maintainability of the evaluator preset code.
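
A sketch of the multiline-string option, using `textwrap.dedent` so the evaluator body reads as normal code inside the resource file. `PYTHON_EXACT_MATCH_CODE` and `PYTHON_PRESET` are hypothetical names, and the four-space indentation inside the evaluator is an assumption; the dictionary keys mirror those visible in the reviewed presets.

```python
from textwrap import dedent

# Hypothetical constant: the Python exact-match template as a readable block.
PYTHON_EXACT_MATCH_CODE = dedent(
    '''\
    from typing import Any, Dict, Union


    def evaluate(
        app_params: Dict[str, str],  # deprecated; currently receives {}
        inputs: Dict[str, str],
        output: Union[str, Dict[str, Any]],
        correct_answer: str,
    ) -> float:
        if output == correct_answer:
            return 1.0
        return 0.0
    '''
)

# Hypothetical preset entry that references the constant instead of an inline string.
PYTHON_PRESET = {
    "values": {
        "runtime": "python",
        "correct_answer_key": "correct_answer",
        "code": PYTHON_EXACT_MATCH_CODE,
    },
    "description": "Exact match evaluator implemented in Python.",
}
```

Loading each preset from its own file would work similarly, with the constant replaced by a file read at import time.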

Comment on lines +249 to +250
response = sandbox.process.code_run(wrapped_code)
response_stdout = response.result if hasattr(response, "result") else ""
Copilot AI Jan 2, 2026

The response handling uses response.result as the stdout content on line 250, but the production code comment history shows that previously it was response.stdout. The test script on line 119 also uses resp.result. However, there's no clear documentation about the Daytona API version being used. Consider documenting which Daytona SDK version this code is compatible with to avoid confusion about the correct attribute names.

Comment on lines 134 to 138
 if not snapshot_id:
     raise RuntimeError(
-        "AGENTA_SERVICES_SANDBOX_SNAPSHOT_PYTHON environment variable is required. "
-        "Set it to the Daytona sandbox ID or snapshot name you want to use."
+        f"No Daytona snapshot configured for runtime '{runtime}'. "
+        f"Set DAYTONA_SNAPSHOT environment variable."
     )
Copilot AI Jan 2, 2026

The error message references runtime variable but uses a generic message format. When DAYTONA_SNAPSHOT is not set, the error message says "No Daytona snapshot configured for runtime '{runtime}'", but the snapshot selection logic doesn't actually vary by runtime - it uses the same DAYTONA_SNAPSHOT for all runtimes. This could be misleading. Consider clarifying the error message to reflect that a single snapshot is used for all runtimes.

Comment on lines +157 to +161
agenta_api_key = (
    agenta_credentials[7:]
    if agenta_credentials.startswith("ApiKey ")
    else ""
)
Copilot AI Jan 2, 2026

The code extracts API key from credentials by checking if it starts with "ApiKey " and slicing from position 7, but if the credentials string is exactly "ApiKey " (with no actual key following), this would result in an empty string, which would still be added to env vars. Consider adding validation to ensure the extracted API key is non-empty.
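
A minimal sketch of the validation suggested above, assuming an empty or missing key should simply be omitted from the sandbox environment. `extract_api_key` and `API_KEY_PREFIX` are hypothetical names. Note that `str.removeprefix` (Python 3.9+) returns the string unchanged when the prefix is absent, so the `startswith` guard is kept.

```python
from typing import Optional

API_KEY_PREFIX = "ApiKey "


def extract_api_key(credentials: Optional[str]) -> Optional[str]:
    """Return the key after the 'ApiKey ' prefix, or None if absent or empty."""
    if not credentials or not credentials.startswith(API_KEY_PREFIX):
        return None
    # removeprefix avoids hard-coding the prefix length as a slice index.
    api_key = credentials.removeprefix(API_KEY_PREFIX).strip()
    return api_key or None
```

A caller can then add the variable only when a key exists, for example `if (key := extract_api_key(creds)) is not None: env["AGENTA_API_KEY"] = key`, where the variable name is illustrative only.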

Comment on lines +284 to +293
# Fallback: attempt to extract a JSON object containing "result"
for line in reversed(output_lines):
    if "result" not in line:
        continue
    start = line.find("{")
    end = line.rfind("}")
    if start == -1 or end == -1 or end <= start:
        continue
    try:
        result_obj = json.loads(line[start : end + 1])
Copilot AI Jan 2, 2026

The fallback result parsing logic has a potential issue. The code finds the last occurrence of '}' with rfind("}"), but this could match a closing brace that isn't part of the result JSON object. For example, if the output contains nested JSON or code snippets, this could incorrectly identify a brace position. Consider using a more robust JSON extraction approach or validating that the extracted substring is actually valid JSON before attempting to parse it.
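
One more robust extraction approach, sketched with the standard library's `json.JSONDecoder.raw_decode`, which parses a JSON value starting at a given index and ignores whatever follows it. `extract_result_object` is a hypothetical helper, not code from the PR; it tries each `{` in turn and returns the first decoded object that has a "result" key.

```python
import json
from typing import Any, Dict, Optional


def extract_result_object(line: str) -> Optional[Dict[str, Any]]:
    """Return the first JSON object in line that has a "result" key, else None."""
    decoder = json.JSONDecoder()
    start = line.find("{")
    while start != -1:
        try:
            # raw_decode stops at the end of the first complete JSON value,
            # so trailing braces or text after it do not matter.
            obj, _end = decoder.raw_decode(line, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and "result" in obj:
            return obj
        start = line.find("{", start + 1)
    return None
```

For a line such as `log {"result": 1.0} trailing }`, slicing from the first `{` to the last `}` yields invalid JSON, while this helper still recovers the object.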


Labels

example, feature, SDK, size:XXL (this PR changes 1000+ lines, ignoring generated files)


3 participants