Skip to content

feat: improving manipulation demo#747

Merged
jmatejcz merged 37 commits intomainfrom
jm/mani-docker-demo
Jan 13, 2026
Merged

feat: improving manipulation demo#747
jmatejcz merged 37 commits intomainfrom
jm/mani-docker-demo

Conversation

@jmatejcz
Copy link
Contributor

@jmatejcz jmatejcz commented Dec 31, 2025

Purpose

Improve performance of manipulation demo

Proposed Changes

  • examples/manipulation-demo-streamlit.py now launches everything - simulation, robo stack and UI. No need for second terminal.
  • Changes to embodiment prompt to improve performance
  • Added new scene configs
  • Now all configs that are available in the demo contains only - red cubes, carrots and apples
    Yellow cubes, blue cubes, tomatoes and corn were removed because our vision model do not recognize colors so these were easy to confuse
  • Fixed a bug in connector where unique id was created during import time, which resulted in all connector nodes having the same name
  • Fixed description of arguments in MoveObjectFromToToolInput
  • Updated command in docs and added example prompts
  • Using Vacuum Gripper which performs better in our demo (switched branch in demos.repos)

Warning

If you already have rai-manipulaiton-demo cloned, VCS wont pull changes automatically. You need to either pull manually or remove the repo and clone again.

Notes and Doubts

I didn't test it extensively on rai-perception refactor. It was developed and tested mainly when rai-openset-vision existed, so the tools were slightly different. But they are very similar so i' m betting that it works fine.

in experimental setup reset arm was added straight to agent after every tool call, so the arm is not stuck in weird positions. In production we obviously can't leave it like that. Reset arm tool can be added, but it will introduce more complexity to demo.

I tried making adding launch methods to common file here a4f8495, but it seems like python package import do not work in launch files so I removed it. If you know if this can be done, please let me know

If now one streamlit file launches everything, do we need the rest of the example mani files?

Testing

Run manipulation demo, it should perform better

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Streamlit-based interactive interface for the manipulation demo with predefined layout selection and dynamic scene configuration.
    • Integrated camera artifact support in chat interactions for multimodal AI responses.
    • Expanded predefined scene configurations for manipulation tasks.
  • Refactor

    • Consolidated agent creation logic into a shared module for improved maintainability.
    • Centralized scene configuration path management for easier scenario setup.
  • Documentation

    • Updated manipulation demo guide with simplified workflows and expanded usage examples.
  • Tests

    • Added validation tests for connector naming and scene clearing functionality.

✏️ Tip: You can customize this high-level summary in your review settings.

@jmatejcz jmatejcz force-pushed the jm/fix/manipulation-demo branch 3 times, most recently from d5e10ae to 7e9d854 Compare January 2, 2026 10:51
@jmatejcz jmatejcz force-pushed the jm/mani-docker-demo branch 2 times, most recently from d46cf98 to 2576943 Compare January 2, 2026 12:09
@jmatejcz jmatejcz force-pushed the jm/fix/manipulation-demo branch from 7822d45 to a054137 Compare January 3, 2026 15:49
Base automatically changed from jm/fix/manipulation-demo to main January 4, 2026 10:32
@jmatejcz jmatejcz force-pushed the jm/mani-docker-demo branch from 182be41 to 712fa7d Compare January 4, 2026 10:41
@jmatejcz jmatejcz marked this pull request as ready for review January 4, 2026 10:51
@codecov
Copy link

codecov bot commented Jan 4, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.33%. Comparing base (790f1c1) to head (7f9d456).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/rai_core/rai/tools/ros2/manipulation/custom.py 75.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #747      +/-   ##
==========================================
- Coverage   65.34%   65.33%   -0.02%     
==========================================
  Files          78       78              
  Lines        3388     3392       +4     
==========================================
+ Hits         2214     2216       +2     
- Misses       1174     1176       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jmatejcz jmatejcz requested a review from Juliaj January 4, 2026 10:58
@jmatejcz jmatejcz force-pushed the jm/mani-docker-demo branch 3 times, most recently from 0c2fd35 to 7743d6f Compare January 6, 2026 14:03
@Juliaj
Copy link
Collaborator

Juliaj commented Jan 7, 2026

I tried making adding launch methods to common file here a4f8495, but it seems like python package import do not work in launch files so I removed it.

What error were you seeing? My guess is that the launch file was run in a different Python context, so examples/ might not be on sys.path .

in the launch file, you could try following

import sys
from pathlib import Path

# Add examples directory to path
examples_path = Path(__file__).parent.parent / "examples"
sys.path.insert(0, str(examples_path))

from manipulation_common import launch_description  # or whatever function name

@jmatejcz
Copy link
Contributor Author

jmatejcz commented Jan 7, 2026

I tried making adding launch methods to common file here a4f8495, but it seems like python package import do not work in launch files so I removed it.

What error were you seeing? My guess is that the launch file was run in a different Python context, so examples/ might not be on sys.path .

in the launch file, you could try following

import sys
from pathlib import Path

# Add examples directory to path
examples_path = Path(__file__).parent.parent / "examples"
sys.path.insert(0, str(examples_path))

from manipulation_common import launch_description  # or whatever function name

I get import error. I tried these approaches with adding path, but it seems like when you use ros2 launch python modules are not recognized properly

@jmatejcz
Copy link
Contributor Author

jmatejcz commented Jan 7, 2026

PERFORMANCE TESTS

I tried simple prompts that require 1 tool call and some more advanced , for example these (which are in docs now):

  • "Place every apple on top of the cube"
  • "Stack all cubes into a tower"
  • "Arrange objects in a line"
  • "Move cubes to the left side of the table"

Problems that we encountered on old version:

  • Objects were falling from the gripper mid manipulation
  • Positions of cubes were detected wrongly , tomatoes with apples and corn was just hard to identify
  • Arm was bumping into other objects
  • Arm was not behaving properly on the ends of table
  • Agent struggled with understanding objects sizes (bad alignments and stacking, not leaving any margin)

on the above prompts demo had ~50% acc.

We found out that vision model do not recognize colors -> so I left in demo only configs with red cubes, removed tomatoes and corn.

We decided that this should be showcase of agentic Ai, not a robot capability, so we replaced real gripper with vacuum one ( which is not adequate to what you see in simulation). We removed collisions of gripper's fingers. @knicked did all that in https://github.com/RobotecAI/rai-manipulation-demo.

We changed the embodiment prompt so that agent understand the sizes of objects, leaves margins, etc. We also added the arm range in prompt and made configs so that the objects do not exceed the range.

The results:

  • Easy prompts that require 1 tool call -> almost 100% acc
  • Harder ones (these above) -> 80% , after follow app prompts from user even 95%

@jmatejcz jmatejcz force-pushed the jm/mani-docker-demo branch from 414c328 to 099f22d Compare January 12, 2026 11:02
@jmatejcz
Copy link
Contributor Author

jmatejcz commented Jan 12, 2026

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 12, 2026

Walkthrough

The PR refactors manipulation demo infrastructure by extracting agent creation logic into a reusable factory, introducing O3DE-based scene management with Streamlit UI enhancements, adding multimodal camera support to chat interactions, centralizing scene configuration management, and modifying ROS2 connector initialization to accept optional node naming.

Changes

Cohort / File(s) Summary
Demo Application Refactoring
examples/manipulation-demo-streamlit.py, examples/manipulation-demo.py, examples/manipulation_common.py
New factory function in manipulation_common creates ROS2-backed LangChain agent with camera tooling; manipulation-demo.py imports and unpacks agent tuple; manipulation-demo-streamlit.py adds O3DE scene setup, scenario selection UI, dynamic scene reloading, and multimodal HumanMessage support with camera artifacts; public function signatures changed to support scenario paths and camera tool access.
Documentation & Configuration Metadata
docs/demos/manipulation.md, demos.repos, examples/__init__.py, examples/embodiments/manipulation_embodiment.json
Manipulation demo docs reorganized to prioritize Streamlit workflow; repository version updated; license header added to examples package; embodiment JSON expanded with new rules and detailed capability descriptions.
Scene Configuration Data
src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/*.yaml
17 YAML configuration files added or updated with entity poses and rotations; common pattern includes adjusting x/y translation coordinates (often normalized to 0.3) and rotation parameters (y to 0.707, w to 0.707) for corn/vegetable entities; new configs introduce additional entity combinations.
Scenario Management
src/rai_bench/rai_bench/manipulation_o3de/predefined/scenarios.py
Refactored to use centralized SCENE_CONFIG_PATHS and SCENE_CONFIG_ALIASES dictionaries; added get_scene_configs() helper; per-level scenario functions now consume path maps instead of hard-coded lists, reducing duplication and improving maintainability.
ROS2 Framework & Tools
src/rai_core/rai/communication/ros2/connectors/base.py, src/rai_core/rai/tools/ros2/manipulation/custom.py, src/rai_core/pyproject.toml
ROS2BaseConnector.init signature modified to accept optional node_name (defaults to None with runtime generation); MoveObjectFromToToolInput field descriptions clarified to distinguish source vs. destination coordinates; core version bumped to 2.7.0.
Simulation & O3DE Infrastructure
src/rai_sim/rai_sim/o3de/o3de_bridge.py, src/rai_sim/rai_sim/simulation_bridge.py
O3DExROS2Bridge gains new clear_scene() method to despawn all entities; simulation_bridge changed Entity construction to use intermediate pose_stamped variable instead of inline creation.
Test Coverage
tests/communication/ros2/test_connectors.py, tests/rai_sim/test_o3de_bridge.py, tests/rai_sim/test_simulation_bridge.py
Added test for ROS2 connector unique node name generation; test for O3DE bridge clear_scene() despawn behavior; assertion added for pose message type prefix validation in base config loading.

Sequence Diagram(s)

sequenceDiagram
    participant User as User/Streamlit
    participant App as Demo App
    participant Agent as LLM Agent
    participant ROS2 as ROS2 Connector
    participant O3DE as O3DE Engine
    participant Camera as Camera Tool

    User->>App: Select scenario layout
    App->>O3DE: clear_scene()
    O3DE-->>App: scene cleared
    App->>O3DE: setup_new_scene(scenario_path)
    O3DE-->>App: scene initialized
    App->>Agent: initialize with camera_tool
    Agent->>ROS2: await required services/topics
    ROS2-->>Agent: services ready
    Agent-->>App: agent, camera_tool ready
    
    User->>App: Submit prompt with context
    App->>Camera: capture images
    Camera-->>App: image artifacts
    App->>Agent: HumanMultimodalMessage(text, images)
    Agent->>ROS2: query object positions/transforms
    ROS2->>O3DE: fetch entity data
    O3DE-->>ROS2: entity poses
    ROS2-->>Agent: object data
    Agent->>Agent: reason over multimodal input
    Agent-->>App: AIMessage response
    App->>User: render assistant response
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

Suggested Reviewers

  • Juliaj
  • boczekbartek
  • maciejmajek
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: improving manipulation demo' clearly summarizes the main change and accurately reflects the primary objective of enhancing the manipulation demo's performance.
Description check ✅ Passed The pull request description comprehensively covers the template structure with clear Purpose, Proposed Changes, and Testing sections, providing substantial detail about modifications and trade-offs.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2a_1c_2rc.yaml (1)

1-75: Filename mismatch and corn entities should be removed per PR objectives.

Two issues found:

  1. Filename inconsistency: The filename 2a_1c_2rc.yaml implies "2 apples, 1 corn, 2 red cubes", but the file contains 2 corn entities (corn1 and corn2).

  2. Corn should be removed: According to the PR objectives, corn was removed from demo configs because "the vision model confuses colors." However, this file still contains corn1 and corn2 entities.

Please either:

  • Remove corn entities and rename the file to match its contents (e.g., 2a_2rc.yaml), or
  • Clarify if this config is intentionally excluded from the "corn removal" scope.
src/rai_core/rai/tools/ros2/manipulation/custom.py (2)

185-186: Dead code: standalone expression has no effect.

Line 186 accesses self.connector.service_call but doesn't call it or assign the result. This appears to be leftover from a refactoring attempt mentioned in the NOTE comment above.

🧹 Proposed fix
         # NOTE: create_client could be refactored into self.connector.service_call
-        self.connector.service_call
         client = self.connector.node.create_client(

240-244: Bug: f-string with literal braces won't interpolate variables.

The error message uses an f-string but the braces are escaped (double braces become literal), so x, y, z values won't be displayed - the user will see the literal text {x:.2f} instead of the actual coordinates.

🐛 Proposed fix
             self.connector.logger.error(
                 f"Failed to position end effector at coordinates ({x:.2f}, {y:.2f}, {z:.2f})."
             )
-            return "Failed to position end effector at coordinates ({x:.2f}, {y:.2f}, {z:.2f})."
+            return f"Failed to position end effector at coordinates ({x:.2f}, {y:.2f}, {z:.2f})."
src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_1a_2t_1bc_1rc_3yc_stacked.yaml (1)

14-49: Fix collision: apple1 and tomato1 occupy identical positions.

At lines 14–49, apple1 and tomato1 are both placed at (x: 0.3, y: -0.10, z: 0.05). This will cause them to overlap in simulation. Additionally, tomato2 at y: -0.15 is only 0.05 units away, creating a very crowded region. Reposition at least one of these entities to avoid collisions.

examples/manipulation-demo.py (1)

12-12: Minor: Typo in license header.

The license text contains "goveself.rning" which should be "governing".

Suggested fix
-# See the License for the specific language goveself.rning permissions and
+# See the License for the specific language governing permissions and
src/rai_bench/rai_bench/manipulation_o3de/predefined/scenarios.py (1)

119-126: Remove ["tomato"] from object_groups in trivial_scenarios or verify intentional inclusion.

The object_groups list includes ["tomato"] (line 121), but the trivial level's place_obj_types (lines 99-103) does not include tomato. Since all 5 trivial configs are passed to create_scenarios(), the MoveObjectsToLeftTask for tomato will be applied to configs that lack tomato (only 1 of 5 configs contains it), resulting in unreachable scenarios. Either remove tomato from object_groups to align with place_obj_types, or confirm this is intentional and document why. Note: the easy_scenarios function includes tomato in its task objects, suggesting this may be an accidental inclusion at the trivial level.

🤖 Fix all issues with AI agents
In @docs/demos/manipulation.md:
- Around line 59-61: Fix the typo in the sentence "how to use the RAI API for
you own applications:" by replacing "you own" with "your own" so it reads "how
to use the RAI API for your own applications:"; update the markdown line
accordingly.

In @examples/manipulation_common.py:
- Line 12: The license header contains a typo "goveself.rning" in the comment
string; update that header comment in examples/manipulation_common.py to replace
"goveself.rning" with the correct word "governing" so the license line reads
"... See the License for the specific language governing permissions and ...".

In @examples/manipulation-demo-streamlit.py:
- Line 12: Replace the misspelled word in the license header: locate the comment
containing "See the License for the specific language goveself.rning permissions
and" and correct "goveself.rning" to "governing" so the line reads "See the
License for the specific language governing permissions and".
- Line 93: The call to ROS2Connector is passing "multi_threaded" as a positional
argument which will become node_name; update the call that constructs
O3DEngineArmManipulationBridge so ROS2Connector receives executor_type
explicitly (e.g., ROS2Connector(executor_type="multi_threaded")) or provide
node_name first and executor_type as keyword; adjust the line with
O3DEngineArmManipulationBridge(ROS2Connector("multi_threaded")) to use the
correct keyword argument for executor_type.
- Around line 257-259: The st.sidebar.success call after st.rerun() is
unreachable because st.rerun() restarts the app immediately; remove the
st.sidebar.success(...) line (the call that references
st.session_state['current_layout']) or move it to execute before calling
st.rerun() if you need the message shown, ensuring no code remains after
st.rerun() inside the same block.

In
@src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1a_1t_1bc_2corn.yaml:
- Around line 2-3: The entity declaration has a mismatched name and prefab: the
entity is named "apple1" but uses prefab_name "tomato"; fix by either renaming
the entity to match the spawned prefab (e.g., change the entity name from apple1
to tomato2 or similar) or changing prefab_name from "tomato" to the correct
prefab "apple" so the entity name and prefab_name (apple1 ↔ prefab_name) are
consistent; update the single occurrence in the config so the entity name and
prefab_name align.

In
@src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_3a_1bc_1rc_3yc_stacked.yaml:
- Line 1: The filename indicates three apples (3a) but the YAML only defines
apple1; either rename the file to match the actual content (e.g., change
1carrot_3a_1bc_1rc_3yc_stacked.yaml to 1carrot_1a_1bc_1rc_3yc_2t_stacked.yaml)
or update the YAML by adding the missing apple entities (apple2, apple3) to
match the 3a token in the filename; ensure the filename and the top-level
entities list (e.g., apple1, apple2, apple3) stay consistent.
- Around line 14-37: apple1 and tomato1 currently share identical translation
coordinates causing a physical overlap; update the tomato1 pose translation (the
translation block under the tomato1 entry) to a non-colliding position by
changing x, y and/or z (e.g., offset x or y by a small amount or raise z) so
tomato1 no longer occupies the same space as apple1 while keeping the intended
scene layout.
🧹 Nitpick comments (5)
examples/embodiments/manipulation_embodiment.json (1)

3-11: Good additions for operational guidance.

The expanded rules provide clear constraints for the agent, including stacking height calculations, margin requirements, and color detection limitations. These align well with the PR objectives to improve manipulation demo performance.

Minor consistency nits:

  • Line 6: Consider "Do not put objects on top of other objects" for grammatical consistency
  • Line 10: Capitalize "Detection tool" for consistency with other rules
Optional: Minor grammar/formatting tweaks
-        "Remember to not put objects on top of other objects, unless specifically instructed otherwise",
+        "Do not put objects on top of other objects unless specifically instructed otherwise",
         ...
-        "detection tool can't recognize colors, so just ask for categories",
+        "Detection tool can't recognize colors, so just ask for categories",
examples/manipulation_common.py (1)

33-69: Consider adding a docstring for create_agent().

The function lacks documentation. Adding a brief docstring would clarify its purpose, return type, and any prerequisites (e.g., required ROS2 services must be running).

Suggested docstring
 def create_agent():
+    """Create a ROS2-backed manipulation agent with perception tools.
+
+    Waits for required ROS2 services and topics before returning.
+
+    Returns
+    -------
+    tuple[CompiledGraph, GetROS2ImageConfiguredTool]
+        The agent runnable and the camera tool for image capture.
+    """
     connector = ROS2Connector(executor_type="single_threaded")
examples/manipulation-demo-streamlit.py (3)

193-198: Avoid using private method _despawn_entity.

Directly calling _despawn_entity (prefixed with _) couples this code to internal implementation details of the bridge. Consider using the public clear_scene() method consistently, or request a public API if the behavior differs.

Suggested approach
-                try:
-                    # Actually clear the scene by despawning all entities
-                    while st.session_state["o3de"].spawned_entities:
-                        st.session_state["o3de"]._despawn_entity(
-                            st.session_state["o3de"].spawned_entities[0]
-                        )
-                except Exception as clear_error:
-                    st.sidebar.warning(f"Could not clear scene: {str(clear_error)}")
+                try:
+                    st.session_state["o3de"].clear_scene()
+                except Exception as clear_error:
+                    st.sidebar.warning(f"Could not clear scene: {clear_error!r}")

224-228: Duplicate logic: Use get_scenario_path helper instead.

This loop duplicates the logic in get_scenario_path() defined at lines 126-132. Reuse the helper for consistency.

Suggested fix
             # Find the scenario path for the current layout
-            current_scenario_path = None
-            for s in scenarios:
-                if Path(s.scene_config_path).stem == st.session_state["current_layout"]:
-                    current_scenario_path = s.scene_config_path
-                    break
+            current_scenario_path = get_scenario_path(
+                scenarios, st.session_state["current_layout"]
+            )

199-200: Optional: Use f-string conversion flags for exceptions.

Per static analysis, replace str(error) with {error!s} or {error!r} for cleaner exception formatting.

Example fix
-                    st.sidebar.warning(f"Could not clear scene: {str(clear_error)}")
+                    st.sidebar.warning(f"Could not clear scene: {clear_error!s}")

Also applies to: 214-215, 237-238, 260-261

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 790f1c1 and 099f22d.

📒 Files selected for processing (34)
  • demos.repos
  • docs/demos/manipulation.md
  • examples/__init__.py
  • examples/embodiments/manipulation_embodiment.json
  • examples/manipulation-demo-streamlit.py
  • examples/manipulation-demo.py
  • examples/manipulation_common.py
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1a_1t_1bc_2corn.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1bc_1rc_1yc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_1a_2t_1bc_1rc_3yc_stacked.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_1corn.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_3a_1bc_1rc_3yc_stacked.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1rc_2bc_3yc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2a_1c_2rc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2carrots_1a_1t_1bc_1rc_1yc_1corn.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2carrots_2a.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2rc_2a.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2t_3a_1corn_2rc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3a_2corn_2rc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3a_4corn_2bc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3a_4corn_2rc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3carrots_1a_2bc_1rc_1yc_1corn.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3carrots_3a_2rc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3rc.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3rc_2a_1carrot.yaml
  • src/rai_bench/rai_bench/manipulation_o3de/predefined/scenarios.py
  • src/rai_core/pyproject.toml
  • src/rai_core/rai/communication/ros2/connectors/base.py
  • src/rai_core/rai/tools/ros2/manipulation/custom.py
  • src/rai_sim/rai_sim/o3de/o3de_bridge.py
  • src/rai_sim/rai_sim/simulation_bridge.py
  • tests/communication/ros2/test_connectors.py
  • tests/rai_sim/test_o3de_bridge.py
  • tests/rai_sim/test_simulation_bridge.py
🧰 Additional context used
🧠 Learnings (6)
📚 Learning: 2025-12-22T18:09:29.833Z
Learnt from: maciejmajek
Repo: RobotecAI/rai PR: 731
File: tests/pkg_publish/__init__.py:1-1
Timestamp: 2025-12-22T18:09:29.833Z
Learning: In the RobotecAI/rai repository (Apache-2.0 licensed), individual contributors are allowed to assign copyright to themselves in file headers rather than using the organization's standard "Copyright (C) 2024 Robotec.AI" format. Both individual attribution (e.g., "Copyright (C) 2025 Julia Jia") and organization attribution are acceptable.

Applied to files:

  • examples/__init__.py
📚 Learning: 2025-02-10T21:35:25.819Z
Learnt from: maciejmajek
Repo: RobotecAI/rai PR: 416
File: src/rai_core/rai/communication/ros2/connectors.py:241-267
Timestamp: 2025-02-10T21:35:25.819Z
Learning: Avoid circular imports in the ROS2 connector modules by placing utility functions in dedicated utility modules that don't import from the connector modules.

Applied to files:

  • tests/communication/ros2/test_connectors.py
  • examples/manipulation_common.py
  • src/rai_core/rai/communication/ros2/connectors/base.py
📚 Learning: 2025-01-21T11:24:27.687Z
Learnt from: maciejmajek
Repo: RobotecAI/rai PR: 379
File: tests/communication/ros2/test_connectors.py:50-50
Timestamp: 2025-01-21T11:24:27.687Z
Learning: In ROS2 connector implementation, topics are typically static and well-discovered before running any method of the API. The discovery mechanism in tests is an edge case, as it's not representative of the typical usage pattern.

Applied to files:

  • tests/communication/ros2/test_connectors.py
📚 Learning: 2024-11-28T14:37:22.619Z
Learnt from: boczekbartek
Repo: RobotecAI/rai PR: 324
File: examples/rosbot-xl.launch.py:40-44
Timestamp: 2024-11-28T14:37:22.619Z
Learning: In the `rosbot-xl.launch.py` and `turtlebot.launch.py` launch files, the `run-nav.bash` scripts are different for each demo and should remain as separate calls to maintain clarity in the demo launches.

Applied to files:

  • docs/demos/manipulation.md
📚 Learning: 2025-01-25T14:33:56.413Z
Learnt from: maciejmajek
Repo: RobotecAI/rai PR: 386
File: src/rai/rai/tools/ros/tools.py:30-30
Timestamp: 2025-01-25T14:33:56.413Z
Learning: In the RAI package, ROS2-related Python packages (e.g., tf_transformations) are currently installed via rosdep rather than being declared in pyproject.toml.

Applied to files:

  • src/rai_core/pyproject.toml
📚 Learning: 2025-02-10T21:36:05.397Z
Learnt from: maciejmajek
Repo: RobotecAI/rai PR: 416
File: src/rai_core/rai/communication/ros2/connectors.py:273-294
Timestamp: 2025-02-10T21:36:05.397Z
Learning: In ROS2 connectors, while using mutable default arguments (like empty lists) in class constructors is generally safe as they're only used for initialization and each instance gets its own copy through list comprehensions, it's still recommended to follow the best practice of using None as default and initializing within the constructor.

Applied to files:

  • src/rai_core/rai/communication/ros2/connectors/base.py
🧬 Code graph analysis (7)
tests/rai_sim/test_o3de_bridge.py (2)
src/rai_sim/rai_sim/simulation_bridge.py (2)
  • SpawnedEntity (54-61)
  • _despawn_entity (233-251)
src/rai_sim/rai_sim/o3de/o3de_bridge.py (2)
  • _despawn_entity (187-200)
  • clear_scene (301-303)
examples/manipulation-demo.py (1)
examples/manipulation_common.py (1)
  • create_agent (33-69)
src/rai_sim/rai_sim/o3de/o3de_bridge.py (2)
src/rai_sim/rai_sim/simulation_bridge.py (3)
  • _despawn_entity (233-251)
  • setup_scene (195-209)
  • SceneConfig (73-151)
tests/rai_sim/test_simulation_bridge.py (2)
  • _despawn_entity (173-175)
  • setup_scene (158-161)
src/rai_sim/rai_sim/simulation_bridge.py (3)
src/rai_core/rai/types/geometry.py (4)
  • PoseStamped (43-45)
  • Pose (38-40)
  • Point (25-28)
  • Quaternion (31-35)
src/rai_core/rai/types/std.py (1)
  • Header (29-31)
tests/rai_sim/test_simulation_bridge.py (1)
  • pose (81-87)
src/rai_bench/rai_bench/manipulation_o3de/predefined/scenarios.py (3)
src/rai_sim/rai_sim/simulation_bridge.py (2)
  • SceneConfig (73-151)
  • load_base_config (113-151)
src/rai_core/rai/types/geometry.py (1)
  • Path (59-62)
src/rai_bench/rai_bench/manipulation_o3de/benchmark.py (1)
  • Scenario (71-111)
examples/manipulation_common.py (8)
src/rai_core/rai/initialization/model_initialization.py (1)
  • get_llm_model (128-161)
src/rai_core/rai/agents/langchain/core/react_agent.py (1)
  • create_react_runnable (88-133)
src/rai_core/rai/communication/ros2/waiters.py (2)
  • wait_for_ros2_services (66-107)
  • wait_for_ros2_topics (110-149)
src/rai_core/rai/communication/ros2/connectors/ros2_connector.py (1)
  • ROS2Connector (19-20)
src/rai_core/rai/tools/ros2/manipulation/custom.py (2)
  • GetObjectPositionsTool (274-322)
  • MoveObjectFromToTool (151-265)
src/rai_core/rai/tools/ros2/simple.py (1)
  • GetROS2ImageConfiguredTool (33-48)
src/rai_extensions/rai_perception/rai_perception/tools/segmentation_tools.py (1)
  • GetGrabbingPointTool (197-372)
src/rai_whoami/rai_whoami/models/models.py (2)
  • EmbodimentInfo (155-237)
  • from_file (163-166)
examples/manipulation-demo-streamlit.py (6)
examples/manipulation_common.py (1)
  • create_agent (33-69)
src/rai_core/rai/agents/integrations/streamlit.py (1)
  • get_streamlit_cb (33-187)
src/rai_core/rai/communication/ros2/connectors/ros2_connector.py (1)
  • ROS2Connector (19-20)
src/rai_core/rai/messages/multimodal.py (1)
  • HumanMultimodalMessage (67-88)
src/rai_bench/rai_bench/manipulation_o3de/predefined/scenarios.py (1)
  • get_scenarios (434-459)
src/rai_bench/rai_bench/manipulation_o3de/benchmark.py (1)
  • Scenario (71-111)
🪛 LanguageTool
docs/demos/manipulation.md

[grammar] ~60-~60: Ensure spelling is correct
Context: ...ample of how to use the RAI API for you own applications: 1. Run Simulatio...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)
docs/demos/manipulation.md

55-55: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)


64-64: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)


70-70: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)


76-76: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


76-76: Code block style
Expected: indented; Actual: fenced

(MD046, code-block-style)

🪛 Ruff (0.14.10)
tests/communication/ros2/test_connectors.py

350-350: Unused function argument: ros_setup

(ARG001)

examples/manipulation-demo-streamlit.py

199-199: Do not catch blind exception: Exception

(BLE001)


200-200: Use explicit conversion flag

Replace with conversion flag

(RUF010)


214-214: Do not catch blind exception: Exception

(BLE001)


215-215: Use explicit conversion flag

Replace with conversion flag

(RUF010)


237-237: Do not catch blind exception: Exception

(BLE001)


238-238: Use explicit conversion flag

Replace with conversion flag

(RUF010)


260-260: Do not catch blind exception: Exception

(BLE001)


261-261: Use explicit conversion flag

Replace with conversion flag

(RUF010)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-and-test-ros2 (jazzy)
  • GitHub Check: build-and-test-ros2 (humble)
🔇 Additional comments (36)
src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3rc_2a_1carrot.yaml (1)

1-75: LGTM! Scene configuration is well-structured and follows established patterns.

The entity definitions are consistent, quaternions are valid (identity rotation), and the file aligns with the PR objective of using only red cubes, apples, and carrots. The filename accurately reflects the contents (3 red cubes, 2 apples, 1 carrot).

Minor nit: There are inconsistent blank lines between entity groups (lines 38 and 63 have extra spacing while others don't). Consider standardizing for consistency across config files.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2t_3a_1corn_2rc.yaml (1)

69-73: LGTM!

The quaternion rotation change for corn1 is mathematically valid (magnitude = 1) and represents a 90° rotation around the Y-axis, orienting the corn horizontally. This is consistent with similar rotation adjustments made to corn entities in other benchmark configs.

src/rai_core/pyproject.toml (1)

7-7: LGTM!

Minor version bump from 2.6.2 to 2.7.0 is appropriate for the new manipulation demo features and improvements introduced in this PR.

demos.repos (1)

9-9: Verify the stability of the feature branch reference.

Referencing a feature branch (kd/vacuum_gripper) instead of development or a tagged release may be fragile—feature branches can be deleted or merged without notice. Consider whether this branch will be merged to development or if a stable tag/release should be created for long-term reference.

Also noted from PR discussion: existing clones of rai-manipulation-demo won't auto-update via VCS import; users must manually pull or reclone.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2carrots_1a_1t_1bc_1rc_1yc_1corn.yaml (1)

74-85: Rotation change looks correct, but verify config alignment with PR objectives.

The quaternion (0, 0.707, 0, 0.707) represents a valid 90° rotation around the Y-axis, which standardizes corn orientation.

However, PR objectives state that corn, tomatoes, and yellow/blue cubes were removed from demo configs because the vision model confuses colors. This config still contains all of those objects. Please verify whether this config is:

  1. A legacy config not intended for demo use, or
  2. Should be updated to remove the problematic objects
examples/__init__.py (1)

1-13: LGTM!

Adding __init__.py correctly establishes examples/ as a Python package, enabling imports between example modules (e.g., manipulation_common.py). License header follows the project's standard format.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_1corn.yaml (1)

21-25: Rotation standardization looks good.

The quaternion change to (0, 0.707, 0, 0.707) correctly applies a 90° Y-axis rotation, consistent with the standardization applied to corn orientations in other config files.

Same note as the other config: this file contains corn which PR objectives indicate should be removed from demo configs due to vision model confusion. Please confirm whether this config is still in active use for the demo.

src/rai_core/rai/tools/ros2/manipulation/custom.py (1)

130-148: LGTM!

The updated field descriptions now clearly distinguish between source (pick-up) and destination (drop-off) coordinates, which will help the LLM understand the tool's parameters correctly.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3rc.yaml (1)

1-37: LGTM!

Scene configuration with three red cubes is well-structured and aligns with the PR objective to use objects the vision model can reliably detect.

src/rai_sim/rai_sim/simulation_bridge.py (1)

130-149: LGTM!

The explicit loop construction improves readability and properly creates PoseStamped objects with the expected ROS2BaseModel inheritance. The fallback to an empty dict for missing rotation correctly defaults to identity quaternion.

Note: The TODO on line 138 about adapting YAML configs to ROS2 naming convention (translationposition) is worth tracking for future cleanup.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3a_2corn_2rc.yaml (1)

41-65: Config includes corn entities, contradicting PR objectives.

The PR description states that corn was removed because "the vision model confuses colors." However, this configuration includes corn1 and corn2 entities.

Please verify whether:

  1. This config is intentionally kept for benchmarking purposes (not demo use), or
  2. The corn entities should be removed to align with the stated objectives.
docs/demos/manipulation.md (1)

53-82: Documentation improvements look good.

The updated instructions with Streamlit as the primary launch method and the expanded example prompts align well with the PR objectives.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3a_4corn_2bc.yaml (2)

40-87: LGTM! Corn entity rotations and positions look correct.

The quaternion values (0, 0.707, 0, 0.707) represent a valid, normalized 90° rotation around the Y-axis. Position adjustments align with PR objectives to keep objects within arm reach.


88-111: Blue cube position adjustments look reasonable.

The changes move blue_cube1 and blue_cube2 to new positions within the workspace. These adjustments appear consistent with the PR goal of keeping objects within arm range.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3carrots_1a_2bc_1rc_1yc_1corn.yaml (1)

89-100: LGTM! Rotation normalization is consistent with other corn entities.

The quaternion update to (0, 0.707, 0, 0.707) matches the rotation pattern applied to corn entities across other config files in this PR.

Note: corn1 at x: 0.6 may be at the edge of the arm's effective range — verify this position is reliably reachable based on the embodiment constraints mentioned in the PR.

tests/rai_sim/test_simulation_bridge.py (1)

142-149: Good addition to validate pose message type.

The assertion ensures that entities loaded from base configs have properly typed poses with the geometry_msgs/msg prefix, which is important for ROS integration.

Minor note: The test accesses the protected attribute _prefix. This is acceptable for test code verifying internal state, but if a public method exists to check the message type, consider using that for better encapsulation.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1rc_2bc_3yc.yaml (1)

14-37: Position adjustments look good.

The coordinate changes for cube2 and cube3 appear intentional for workspace optimization. No overlapping positions detected among entities.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1a_1t_1bc_2corn.yaml (1)

26-62: Verify: Config includes entities marked for removal in PR objectives.

The PR objectives state that blue/yellow cubes, tomatoes, and corn were removed because the vision model confuses colors. This config still includes blue_cube1, tomato1, corn1, and corn2.

Since this file is under rai_bench/, please confirm whether benchmark configs are intentionally kept separate from demo configs, or if this file should also be updated to align with the new object restrictions.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1bc_1rc_1yc.yaml (1)

1-37: Verify: Config includes blue and yellow cubes marked for removal.

Similar to the previous config, this file contains blue_cube1 and yellow_cube1 which the PR objectives indicate were removed due to vision model color confusion. Please confirm whether benchmark configs should follow the same restrictions as demo configs.

The position adjustments (x: 0.3) align with the arm's reachable range specified in the embodiment config.

examples/embodiments/manipulation_embodiment.json (1)

19-19: Clear coordinate system specification.

The explicit axis descriptions with reachable ranges will help the agent plan movements within safe boundaries. The narrow x-range [0.3, 0.5] m is appropriate for the demo table setup.

src/rai_sim/rai_sim/o3de/o3de_bridge.py (2)

301-303: LGTM! Good extraction of scene clearing logic.

The clear_scene() method correctly handles despawning by repeatedly removing the first entity until the list is empty. Since _despawn_entity removes the entity from spawned_entities (line 196), the while loop will terminate correctly. An empty list is handled safely as the loop simply won't execute.


305-313: Clean refactor of setup_scene.

Delegating to clear_scene() improves readability and allows external callers to clear the scene independently if needed.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3carrots_3a_2rc.yaml (1)

1-99: LGTM! Well-structured scene config aligned with PR objectives.

This new configuration correctly uses only the approved object types (carrots, apples, red cubes) as specified in the PR objectives. All entity positions fall within the arm's reachable ranges defined in the embodiment config:

  • x: [0.30, 0.50] ⊂ [0.3, 0.5] ✓
  • y: [-0.45, 0.45] ⊂ [-0.5, 0.5] ✓
src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2rc_2a.yaml (1)

1-50: LGTM!

The scene configuration correctly defines 2 red cubes and 2 apples with appropriate naming conventions, valid quaternion rotations, and positions that appear to be within the arm's reachable range. This aligns well with the PR objective to use only objects the vision model can reliably identify.

tests/rai_sim/test_o3de_bridge.py (1)

276-299: LGTM!

The test properly validates the clear_scene method by:

  1. Setting up multiple spawned entities
  2. Mocking _despawn_entity with a side effect that mimics the real removal behavior
  3. Verifying the list is empty after clearing
  4. Confirming the despawn method was called for each entity

This correctly tests the new public clear_scene method against the expected behavior shown in the relevant code snippets.

tests/communication/ros2/test_connectors.py (1)

350-361: LGTM!

This test validates the bug fix mentioned in the PR objectives where connector nodes previously shared the same name due to a unique ID being created at import time. The test correctly:

  1. Creates two separate connectors
  2. Verifies their node names are distinct
  3. Confirms the expected naming prefix
  4. Properly cleans up both connectors in the finally block

Regarding the static analysis hint about unused ros_setup: this fixture is intentionally used for its side effects (ROS2 initialization), consistent with other tests in this file.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/1carrot_3a_1bc_1rc_3yc_stacked.yaml (1)

1-110: Config contains objects that should be removed per PR objectives.

According to the PR description, "all demo configs now contain only red cubes, carrots and apples. Yellow/blue cubes, tomatoes and corn were removed because the vision model confuses colors." However, this config includes:

  • 2 tomatoes (lines 26-49)
  • 1 blue cube (lines 50-61)
  • 3 yellow cubes (lines 74-110)

Please verify if this config is intentionally retained for benchmarking purposes (separate from demo configs), or if it should be updated to align with the PR objectives.

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/2carrots_2a.yaml (1)

14-50: LGTM!

The changes correctly rename corn entities to apples, aligning with the PR objective to remove objects that confuse the vision model. The updated positions appear reasonable and the filename now accurately reflects the contents (2 carrots, 2 apples).

src/rai_bench/rai_bench/manipulation_o3de/predefined/configs/3a_4corn_2rc.yaml (2)

40-87: Verify: File contains corn entities despite PR objectives stating corn was removed.

The PR description states that "all demo configs now contain only red cubes, carrots and apples" and that "corn were removed because the vision model confuses colors." However, this new config file contains 4 corn entities. Please verify this is intentional for benchmark purposes only (not demo use) or update as needed.


1-112: LGTM on file structure and entity definitions.

The YAML structure is correct with proper quaternion rotations and entity positions within reasonable arm range.

src/rai_core/rai/communication/ros2/connectors/base.py (1)

103-126: Good fix for the import-time node naming bug.

Moving the UUID generation from module import time to instance initialization time correctly ensures each ROS2BaseConnector instance receives a unique node name. The 12-character UUID suffix provides adequate uniqueness for typical use cases.

examples/manipulation-demo.py (1)

20-26: LGTM on refactoring to use shared agent factory.

The extraction of agent creation logic to manipulation_common.py reduces duplication. Discarding the camera_tool with _ is appropriate since this CLI demo doesn't need multimodal camera integration.

examples/manipulation_common.py (1)

60-62: Verify: Relative path may fail depending on working directory.

The path "examples/embodiments/manipulation_embodiment.json" is relative to the repository root. If the script is executed from a different working directory, this will fail. Consider using Path(__file__).parent to construct the path relative to the module location.

Suggested fix using module-relative path
+from pathlib import Path
+
+EMBODIMENT_PATH = Path(__file__).parent / "embodiments" / "manipulation_embodiment.json"
+
 def create_agent():
     ...
     embodiment_info = EmbodimentInfo.from_file(
-        "examples/embodiments/manipulation_embodiment.json"
+        str(EMBODIMENT_PATH)
     )
src/rai_bench/rai_bench/manipulation_o3de/predefined/scenarios.py (2)

32-92: Good refactoring: Centralized scene configuration management.

The introduction of SCENE_CONFIG_PATHS, SCENE_CONFIG_ALIASES, and get_scene_configs() helper reduces duplication and makes the scenario generation more maintainable.


73-73: Verify: 2rc.yaml appears in both "trivial" and "hard" difficulty levels.

2rc.yaml is listed in both SCENE_CONFIG_PATHS["trivial"] (line 37) and SCENE_CONFIG_PATHS["hard"] (line 73). This seems inconsistent—a 2 red cube scene being both the easiest and among the hardest configurations. Please verify this is intentional.

examples/manipulation-demo-streamlit.py (1)

299-305: Verify: camera_tool._run() return value handling.

The code calls camera_tool._run() and expects it to return (_, artifact) where artifact is a dict with an "images" key. If artifact is None or doesn't have "images", artifact.get("images", []) will fail with AttributeError. Also, _run is a private method—consider if there's a public interface.

Defensive handling
     if prompt:
         if "camera_tool" in st.session_state:
             _, artifact = st.session_state["camera_tool"]._run()
+            images = artifact.get("images", []) if artifact else []
         else:
-            artifact = None
+            images = []
         message = HumanMultimodalMessage(
-            content=prompt, images=artifact.get("images", [])
+            content=prompt, images=images
         )

@jmatejcz jmatejcz force-pushed the jm/mani-docker-demo branch from b4de50c to e7dd28a Compare January 12, 2026 12:41
@Cytrus14
Copy link
Contributor

@jmatejcz
I've noticed that the manipulation demo will not work without installing the simbench dependency group (poetry install --with simbench) due missing inflect module:

Traceback (most recent call last):
  File "/home/user/rai/examples/manipulation-demo-streamlit.py", line 31, in <module>
    from rai_bench.manipulation_o3de import get_scenarios
  File "/home/user/rai/src/rai_bench/rai_bench/__init__.py", line 14, in <module>
    from .test_models import (
  File "/home/user/rai/src/rai_bench/rai_bench/test_models.py", line 23, in <module>
    import rai_bench.tool_calling_agent as tool_calling_agent
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/__init__.py", line 16, in <module>
    from .predefined.tasks import get_tasks
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/predefined/__init__.py", line 17, in <module>
    from .manipulation_tasks import get_manipulation_tasks
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/predefined/manipulation_tasks.py", line 24, in <module>
    from rai_bench.tool_calling_agent.tasks.manipulation import (
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/tasks/manipulation.py", line 18, in <module>
    import inflect
ModuleNotFoundError: No module named 'inflect'

@jmatejcz
Copy link
Contributor Author

@jmatejcz I've noticed that the manipulation demo will not work without installing the simbench dependency group (poetry install --with simbench) due missing inflect module:

Traceback (most recent call last):
  File "/home/user/rai/examples/manipulation-demo-streamlit.py", line 31, in <module>
    from rai_bench.manipulation_o3de import get_scenarios
  File "/home/user/rai/src/rai_bench/rai_bench/__init__.py", line 14, in <module>
    from .test_models import (
  File "/home/user/rai/src/rai_bench/rai_bench/test_models.py", line 23, in <module>
    import rai_bench.tool_calling_agent as tool_calling_agent
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/__init__.py", line 16, in <module>
    from .predefined.tasks import get_tasks
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/predefined/__init__.py", line 17, in <module>
    from .manipulation_tasks import get_manipulation_tasks
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/predefined/manipulation_tasks.py", line 24, in <module>
    from rai_bench.tool_calling_agent.tasks.manipulation import (
  File "/home/user/rai/src/rai_bench/rai_bench/tool_calling_agent/tasks/manipulation.py", line 18, in <module>
    import inflect
ModuleNotFoundError: No module named 'inflect'

I know, i noticed it and the commit is already present on the next PR #748

@jmatejcz jmatejcz requested a review from Cytrus14 January 12, 2026 13:07
@Cytrus14
Copy link
Contributor

Based on my tests with gpt-4o the manipulator demo does indeed perform better. However, there are still some shortcomings which we should keep in mind:

  • The agent does not see objects obscured by the manipulator arm. This is expected, but may nonetheless lead to issues.
  • Sometimes the arm will knock down other objects when moving.
  • If the position of an object changes during a task (e.g. it is knocked-down by the arm), the agent will use the outdated object location.

Additionally, I believe that we should add a tool for resting the position of the arm to the agent. This will allow the agent to move the arm and allow it to see the obscured objects.

@jmatejcz
Copy link
Contributor Author

Based on my tests with gpt-4o the manipulator demo does indeed perform better. However, there are still some shortcomings which we should keep in mind:

* The agent does not see objects obscured by the manipulator arm. This is expected, but may nonetheless lead to issues.

* Sometimes the arm will knock down other objects when moving.

* If the position of an object changes during a task (e.g. it is knocked-down by the arm), the agent will use the outdated object location.

Additionally, I believe that we should add a tool for resting the position of the arm to the agent. This will allow the agent to move the arm and allow it to see the obscured objects.

I added resetting arm tool
and added automatic reset after each move

@jmatejcz jmatejcz requested a review from Juliaj January 12, 2026 16:26
Copy link
Collaborator

@Juliaj Juliaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jmatejcz jmatejcz merged commit 65b80ba into main Jan 13, 2026
9 checks passed
@jmatejcz jmatejcz deleted the jm/mani-docker-demo branch January 13, 2026 09:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants