Standards for building agents, better
Updated Dec 15, 2025 - TypeScript
Agentic testing for agentic codebases
Ship agents you can audit.
Agent testing automation by simulating users and agents with a judge (langwatch-scenario)
Qualitative benchmark suite for evaluating AI coding agents and orchestration paradigms on realistic, complex development tasks
Multi-Agent System for Cross-Checking Phishing URLs.
The Logic Firewall for AI Agents. Prevent infinite loops, token bombing, critical vulnerabilities and more before deployment.