New Remote Env / TS Env #703

cdreetz · 2026-01-08T03:43:34Z

Description

New experimental remote_env and ts_env.
Corresponding prime_cli PR

RemoteEnv is the base of a new pattern that lets people create environments without writing Python. It extends SandboxEnv with a couple of extra steps: after a rollout creates a sandbox, RemoteEnv downloads the contents of the environment's sandbox/ folder from the hub and runs setup.sh. That script is expected to contain the install commands for any required dependencies, and its final line should start the server, inside the sandbox, that the user wrote in their chosen language.

TypeScriptEnv extends RemoteEnv and includes the TypeScript-specific files that a TypeScript user should already be familiar with, so they can build an environment by editing index.ts and setup.sh.

The expected user flow is:

prime env init my-ts-env --ts
cd environments/my_ts_env/sandbox/
Edit setup.sh to install any dependencies
Edit src/index.ts to define whatever tools and reward functions the environment needs
Run prime env push so that everything, including sandbox/, is on the hub
uv run vf-eval my-ts-env then creates sandboxes, downloads the sandbox files, and runs the sandbox setup.sh
The rollout "orchestrator" then gets the tools and rewards from the sandbox through discovery endpoints and uses them as it would native tools and reward functions

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Test improvement

Testing

All existing tests pass when running uv run pytest locally.
New tests have been added to cover the changes

Checklist

My code follows the style guidelines of this project as outlined in AGENTS.md
I have performed a self-review of my own code
[] I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Notes

Note

Introduces a remote environment pattern and a TypeScript specialization for tool/reward discovery and execution.

Adds RemoteEnv that extends SandboxEnv to fetch environment tarballs from the Prime API, extract into upload_path, and run sandbox/setup.sh after sandbox startup (supports owner/name@version and optional API key)
Adds TypeScriptEnv that waits for a server (default :3000), discovers tools/rewards via GET /tools and GET /rewards, registers them as oai_tools and a Rubric, and routes tool calls (POST /tools/{name}) and reward scoring (POST /rewards/{name})
Introduces RemoteToolWrapper and RemoteRewardRubric to wrap remote endpoints as local tools and reward functions
Exposes RemoteEnv and TypeScriptEnv via verifiers/envs/experimental/remote_envs/__init__.py
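
As a rough illustration of the routing described above, the following sketch builds the curl command for POST /tools/{name} or POST /rewards/{name} and parses a reward response. The helper names `build_post_command` and `parse_score` and the request payload shape are assumptions made for illustration, not taken from the PR. Only the port default (3000), the endpoint paths, and the "score" response field come from the description and the reviewed diff.

```python
import json
import shlex


def build_post_command(kind: str, name: str, payload: dict, port: int = 3000) -> str:
    """Build a curl command for POST /tools/{name} or POST /rewards/{name}."""
    if kind not in ("tools", "rewards"):
        raise ValueError(f"unknown endpoint kind: {kind}")
    url = f"http://localhost:{port}/{kind}/{name}"
    body = json.dumps(payload)
    # shlex.quote keeps quotes in the JSON body from breaking the shell command.
    return (
        "curl -sf -X POST -H 'Content-Type: application/json' "
        f"-d {shlex.quote(body)} {shlex.quote(url)}"
    )


def parse_score(stdout: str) -> float:
    """Read the reward from the "score" field of the JSON response."""
    return float(json.loads(stdout)["score"])
```

Quoting the body with `shlex.quote` matters because completions routinely contain apostrophes and double quotes, which would otherwise terminate the shell argument early.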

Written by Cursor Bugbot for commit 305e90a. This will update automatically on new commits.

cursor · 2026-01-08T03:51:46Z

verifiers/envs/experimental/remote_envs/ts_env.py

+            reward_specs = await self._discover_rewards(sandbox_id)
+            self._register_rewards(reward_specs)
+
+            self._tools_discovered = True


Race condition causes duplicate tool registration

High Severity

The _tools_discovered check-then-act pattern is not concurrency-safe. When multiple rollouts run concurrently via asyncio.gather, multiple coroutines can pass the if not self._tools_discovered: check before any sets it to True. Each then calls _register_tools, which appends to self.tools and self.oai_tools, causing duplicate tool entries. Other similar patterns in this codebase use asyncio.Lock() to prevent this issue.

Additional Locations (1)

verifiers/envs/experimental/remote_envs/ts_env.py#L127-L139
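
One possible shape for the fix is a double-checked guard around an asyncio.Lock. This is a minimal, self-contained sketch: `ToolRegistry`, `ensure_discovered`, and the placeholder `_discover_tools` are hypothetical stand-ins, not the PR's actual classes.

```python
import asyncio


class ToolRegistry:
    """Toy stand-in for the environment's one-time discovery step."""

    def __init__(self) -> None:
        self.tools: list[str] = []
        self._tools_discovered = False
        self._discovery_lock = asyncio.Lock()

    async def _discover_tools(self) -> list[str]:
        # Placeholder for the remote call; the await is where other
        # coroutines get a chance to run.
        await asyncio.sleep(0)
        return ["add", "search"]

    async def ensure_discovered(self) -> None:
        if self._tools_discovered:  # fast path, no lock needed
            return
        async with self._discovery_lock:
            if self._tools_discovered:  # re-check after acquiring the lock
                return
            self.tools.extend(await self._discover_tools())
            self._tools_discovered = True


async def run_concurrent(n: int) -> list[str]:
    registry = ToolRegistry()
    await asyncio.gather(*(registry.ensure_discovered() for _ in range(n)))
    return registry.tools
```

The second check inside the lock is what prevents duplicates: coroutines that queued on the lock while the first one was discovering see the flag set and return without registering again.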

cursor · 2026-01-08T03:51:46Z

verifiers/envs/experimental/remote_envs/ts_env.py

+            raise RuntimeError(f"Reward {reward_name} failed: {result.stderr}")
+
+        data = json.loads(result.stdout)
+        return float(data["score"])


Remote rewards fail because sandbox destroyed before scoring

High Severity

The RemoteRewardRubric calculates rewards by executing curl commands inside the sandbox via _call_remote_reward. However, the framework's execution order runs @vf.cleanup (which destroys the sandbox) at the end of each rollout, and scoring happens AFTER all rollouts complete. By the time reward functions are invoked during score_group, the sandbox has already been deleted. The parent class SandboxEnv provides a post_rollout hook specifically for extracting reward data before sandbox destruction, but TypeScriptEnv doesn't override it to pre-calculate rewards.

Additional Locations (1)

verifiers/envs/experimental/remote_envs/ts_env.py#L140-L143
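
A sketch of the suggested direction, assuming rewards can be computed while the sandbox is still alive and cached in the rollout state. `FakeSandbox`, the `post_rollout` signature shown here, and the `remote_rewards` state key are all hypothetical; the real SandboxEnv hook may look different.

```python
import asyncio


class FakeSandbox:
    """Hypothetical stand-in for a sandbox that is destroyed at cleanup."""

    def __init__(self) -> None:
        self.alive = True

    async def score(self, reward_name: str, completion: str) -> float:
        if not self.alive:
            raise RuntimeError("sandbox already destroyed")
        # Pretend the remote reward is simply the completion length.
        return float(len(completion))


async def post_rollout(state: dict, sandbox: FakeSandbox, reward_names: list[str]) -> None:
    # Runs while the sandbox still exists: cache every remote score in state.
    state["remote_rewards"] = {
        name: await sandbox.score(name, state["completion"]) for name in reward_names
    }


def cached_reward(state: dict, reward_name: str) -> float:
    # Called later, during scoring; it only reads the cached value.
    return state["remote_rewards"][reward_name]


async def demo() -> float:
    sandbox = FakeSandbox()
    state = {"completion": "hello"}
    await post_rollout(state, sandbox, ["length"])
    sandbox.alive = False  # cleanup destroys the sandbox
    return cached_reward(state, "length")
```

With this ordering, the rubric's reward functions never touch the sandbox, so it no longer matters that scoring happens after cleanup.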

cursor · 2026-01-08T04:36:27Z

verifiers/envs/experimental/remote_envs/remote_env.py

+    tar.extractall("{self.upload_path}")
+os.remove("/tmp/env.tar.gz")
+print("Download and extraction complete")
+"""


Unescaped string interpolation can break download script

Low Severity

The package_url and upload_path values are directly interpolated into Python code using f-strings without escaping. If either value contains quote characters (particularly double quotes), the generated Python script will have invalid syntax and fail to execute. While the default upload_path is safe and API-provided URLs typically don't contain quotes, this could cause hard-to-debug failures in edge cases.
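
One way to avoid the problem is to embed the values as Python string literals with the `!r` conversion instead of wrapping them in hand-written quotes. The sketch below is illustrative only: `build_download_script` is a hypothetical helper, and the use of `urllib.request.urlretrieve` for the download step is an assumption, since the reviewed diff shows only the extraction lines.

```python
def build_download_script(package_url: str, upload_path: str) -> str:
    # !r embeds each value as a Python string literal, so quotes and
    # backslashes in the value are escaped rather than breaking the syntax.
    return (
        "import os, tarfile, urllib.request\n"
        f"urllib.request.urlretrieve({package_url!r}, '/tmp/env.tar.gz')\n"
        "with tarfile.open('/tmp/env.tar.gz') as tar:\n"
        f"    tar.extractall({upload_path!r})\n"
        "os.remove('/tmp/env.tar.gz')\n"
        "print('Download and extraction complete')\n"
    )
```

Passing the values through environment variables or a JSON file would work as well; the point is that the generated source never depends on the values being free of quote characters.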

cursor · 2026-01-08T04:36:27Z

verifiers/envs/experimental/remote_envs/remote_env.py

+        )
+
+        if result.exit_code != 0:
+            raise RuntimeError(f"Failed to download environment: {result.stderr}")


Non-vf.Error exceptions cause sandbox resource leaks

High Severity

The new code raises RuntimeError, TimeoutError, and ValueError which are not vf.Error subclasses. The rollout method in MultiTurnEnv only catches vf.Error exceptions during setup_state. When these exceptions occur after the sandbox is created (by the parent SandboxEnv.setup_state), they escape the error handling and the _cleanup handlers including destroy_sandbox are never called. This leaves orphaned sandboxes that are never deleted. Other sandbox-based environments like PythonEnv define their errors as vf.SandboxError subclasses to ensure proper cleanup.

Additional Locations (2)

verifiers/envs/experimental/remote_envs/ts_env.py#L101-L102

verifiers/envs/experimental/remote_envs/ts_env.py#L111-L112
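
The effect described above can be reproduced with a toy model. Everything here is a hypothetical stand-in: `FrameworkError` and `SandboxError` play the roles of vf.Error and vf.SandboxError, `RemoteEnvDownloadError` is an invented name, and `rollout` only mimics the catch-and-cleanup behavior the review describes.

```python
class FrameworkError(Exception):
    """Hypothetical stand-in for vf.Error."""


class SandboxError(FrameworkError):
    """Hypothetical stand-in for vf.SandboxError."""


class RemoteEnvDownloadError(SandboxError):
    """Raised when the environment download fails inside the sandbox."""


def rollout(setup_state, destroyed: list[str]) -> str:
    sandbox_id = "sbx-1"  # the sandbox exists before setup can fail
    try:
        setup_state()
    except FrameworkError as exc:
        destroyed.append(sandbox_id)  # cleanup runs for framework errors
        return f"error: {exc}"
    destroyed.append(sandbox_id)  # normal cleanup at the end of the rollout
    return "ok"


def failing_setup() -> None:
    raise RemoteEnvDownloadError("Failed to download environment")


def leaking_setup() -> None:
    raise RuntimeError("Failed to download environment")
```

Calling `rollout(failing_setup, destroyed)` records the sandbox as destroyed, while `rollout(leaking_setup, destroyed)` propagates the RuntimeError and leaves the list empty, which is the leak the review points at.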

cdreetz added 2 commits January 5, 2026 20:13

new remote_env and ts_env

4280262

fix pulling sandbox files from hub

5d44275

cdreetz mentioned this pull request Jan 8, 2026

New Remote Env / TS Env PrimeIntellect-ai/prime#287

Draft

cursor bot reviewed Jan 8, 2026

fix ts_env type stuff

305e90a

cursor bot reviewed Jan 8, 2026
