feat: RAI Semantic Map Memory (rai_semap) by Juliaj · Pull Request #727 · RobotecAI/rai

Juliaj · 2025-12-03T03:09:15Z

Purpose

Supports #225, Design SLAM RAI features

This PR introduces rai_semap, a semantic map memory system that stores object detections with their 3D locations in the SLAM map frame. The system enables spatial-semantic queries like "where did I see a cup?" and provides persistent memory for robot exploration and task planning.

Proposed Changes

design doc & readme.
This is an early-stage implementation. The main goal is to gather feedback on the design and approach before further development. I'm also looking for ideas to deal with following challenges,
- Deduplication: One of the main challenges is handling multiple detections of the same physical object. The system uses spatial merging with class-specific thresholds, point cloud-based matching when available, and confidence filtering to merge duplicate detections. However, distinguishing between a moved object and a new object instance remains a challenge that needs further refinement.
- Object Detection Accuracy: The accuracy of object detection directly impacts the quality of the semantic map. The system includes confidence thresholds and bounding box size filtering, but detection accuracy depends on GroundingDINO.

Testing

New unit tests have been added.
The current design has been validated with the rosbot-xl demo.

codecov · 2025-12-03T03:16:01Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 65.32%. Comparing base (1567ab1) to head (d4c85be).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #727   +/-   ##
=======================================
  Coverage   65.32%   65.32%           
=======================================
  Files          78       78           
  Lines        3386     3386           
=======================================
  Hits         2212     2212           
  Misses       1174     1174

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

src/rai_semap/pyproject.toml

src/rai_semap/README.md

src/rai_semap/design.md

src/rai_semap/rai_semap/ros2/detection_publisher.py

src/rai_semap/design.md

maciejmajek

After pyproject.toml changes I was able to run the semap feature. Amazing work!

The design.md mentions various algorithms for spatial/temporal deduplication. Before I reflect on that, I need to read some literature on this topic.

Since this PR is huge, I will continue the review tomorrow.

Co-authored-by: Maciej Majek <46171033+maciejmajek@users.noreply.github.com>

maciejmajek · 2025-12-09T12:49:43Z

src/rai_semap/rai_semap/core/backend/spatial_db_backend.py

+
+    @abstractmethod
+    def spatial_query(
+        self, center: Point, radius: float, filters: Optional[Dict[str, Any]] = None


Generally, I feel like interwining non ROS 2 code with ROS 2 types creates confusion and limits future integrations. Consider using src/rai_core/rai/types. Such separation validates the ros2 directory you've created. Let me know what you think

Such separation validates the ros2 directory you've created.

Indeed, time to walk the talk.

Nice adapter pattern: rai/types is an adapter between ROS 2 and the Python system.

It turned out to be a bit more work than I anticipated! I ended up using both the Pose from rai.types and the ROS2 message type. See the struggle here

The issue is that ROS2 transform functions (tf2_geometry_msgs) require ROS2 message types, but we also need rai.types.Pose for the Pydantic models. Using PoseType = Any with arbitrary_types_allowed=True lets Pydantic accept both, which works but isn't ideal. If you have suggestions for a cleaner approach, I'm all ears!

maciejmajek · 2025-12-09T13:12:30Z

src/rai_semap/rai_semap/scripts/navigate_collect.py

+        self.wait_tool = WaitForSecondsTool()
+        self._nav_action_ready = False
+
+    def wait_for_nav_action_server(self, timeout_sec: float = 60.0) -> bool:


from rai.communication.ros2 import wait_for_ros2_actions - I think we need to add support for timeout in ros2 waiters. The boilerplate below is what we want to eliminate with ros2connectors & their tooling

But since there is no timeout in waiters, its ok as it is

Tracked as #728

maciejmajek · 2025-12-09T13:16:55Z

src/rai_semap/rai_semap/ros2/node.py

+    def _initialize_tf(self):
+        """Initialize TF buffer and listener.
+
+        Note: ROS2Connector already creates _tf_buffer and _tf_listener internally.
+        We access the connector's TF buffer to avoid duplication.
+        """
+        # Use the connector's internal TF buffer (already created in __init__)
+        self.tf_buffer = self._tf_buffer


That was left over code, removed.

maciejmajek · 2025-12-09T13:18:22Z

src/rai_semap/rai_semap/ros2/node.py

+        map_frame_id = self._get_string_parameter("map_frame_id")
+        map_resolution = self._get_double_parameter("map_resolution")
+
+        backend = SQLiteBackend(database_path)


should most likely be configurable

maciejmajek · 2025-12-09T13:31:02Z

src/rai_semap/rai_semap/tool/query_semantic_map_tool.py

Consider adding a tool that can retrieve object types previously seen in a location.

Example:

The model is asked to bring an object that can appear in different shapes such as a bottle, can, or cup.

Finding the correct item might take three turns when using QuerySemanticMapTool, but only two turns if it can refer to a SeenObjects record.

At some point this may require context compression, but that can be handled by a future general module.

Would this do ?

class GetSeenObjectsTool(BaseTool): """Tool for retrieving object types previously seen in a location.""" name: str = "get_seen_objects" description: str = ( "Get a list of object types (e.g., 'bottle', 'cup', 'table') that have " "been previously seen in a location. Useful for discovering what objects " "exist before querying for specific instances." ) args_schema: Type[GetSeenObjectsToolInput] = GetSeenObjectsToolInput memory: SemanticMapMemory def _run(self, location_id: Optional[str] = None) -> str: """Get list of distinct object classes seen in a location.""" pass

I've added skeleton code to support this ask and actual implementation will come in subsequent PR.

maciejmajek · 2025-12-09T13:37:32Z

src/rai_semap/rai_semap/ros2/node.py

+                    vision_detection_id=vision_detection_id,
+                    metadata=metadata if metadata else existing_ann.metadata,
+                )
+                self.memory.backend.update_annotation(updated)


From architectural (LoD) standpoint, I think that the node should not reach that far (memory.backend). Would it make sense to expose update annotation or similar in memory?

src/rai_semap/rai_semap/core/semantic_map_memory.py

maciejmajek · 2025-12-09T13:42:30Z

src/rai_semap/rai_semap/ros2/node.py

+
+    def _process_detection(
+        self,
+        detection,


important type missing

maciejmajek · 2025-12-09T13:47:27Z

src/rai_semap/rai_semap/ros2/detection_publisher.py

+# It performs object detection and 3D pose computation, which are general perception
+# tasks not specific to semantic mapping. Consider moving to rai_perception when
+# that package has ROS2 node infrastructure in place.
+class DetectionPublisher(ROS2Connector):


We usually use ROS2Connector through dependency injection. I'm not opposed to inheritance, just would love to know your reasoning.

Well, they're in the ros2 folder and conceptually are ROS2 connectors. Time to walk the talk of loose coupling. I went ahead and refactored it. With dependency injection, the call chain is a bit longer than direct calling.

src/rai_semap/rai_semap/ros2/detection_publisher.py

maciejmajek · 2025-12-09T13:52:09Z

@Juliaj I've finished the review. Huge feature!

Juliaj · 2025-12-10T03:11:09Z

I've finished the review.
@maciejmajek

Huge thanks again for all the detailed review feedback! I've deferred a few items for follow-up—one is tracked as an issue, and the other involves moving code to rai_perception. I'll work on these in a subsequent PR to keep this one more focused.

maciejmajek

LGTM

Juliaj added 2 commits December 2, 2025 13:46

Initial commit for rai_semap

20bac29

refactored

4618982

Juliaj force-pushed the jj/feat/rai_slam branch from b79d56e to 4618982 Compare December 3, 2025 03:12

Fix the failing tests

9e4e0ee