Commit 2834e0b

feat(model): Integrate SAM3D as one of image to 3D model option. (#61)
Support the use of [SAM3D](https://github.com/facebookresearch/sam-3d-objects) or [TRELLIS](https://github.com/microsoft/TRELLIS) as 3D generation model.
1 parent 74c3c52 commit 2834e0b

39 files changed: 1192 additions, 474 deletions

.gitmodules

Lines changed: 5 additions & 0 deletions
@@ -8,3 +8,8 @@
 url = https://github.com/TrickyGo/Pano2Room.git
 branch = main
 shallow = true
+[submodule "thirdparty/sam3d"]
+path = thirdparty/sam3d
+url = https://github.com/HochCC/sam-3d-objects.git
+branch = main
+shallow = true

README.md

Lines changed: 14 additions & 6 deletions
@@ -37,11 +37,12 @@
 ```sh
 git clone https://github.com/HorizonRobotics/EmbodiedGen.git
 cd EmbodiedGen
-git checkout v0.1.6
+git checkout v0.1.7
 git submodule update --init --recursive --progress
 conda create -n embodiedgen python=3.10.13 -y # recommended to use a new env.
 conda activate embodiedgen
-bash install.sh basic
+bash install.sh basic # around 20 mins
+# Optional: `bash install.sh extra` for scene3d-cli
 ```
 
 ### ✅ Starting from Docker
@@ -94,12 +95,14 @@ CUDA_VISIBLE_DEVICES=0 nohup python apps/image_to_3d.py > /dev/null 2>&1 &
 ### ⚡ API
 Generate physically plausible 3D assets from image input via the command-line API.
 ```sh
-img3d-cli --image_path apps/assets/example_image/sample_00.jpg apps/assets/example_image/sample_01.jpg apps/assets/example_image/sample_19.jpg \
+img3d-cli --image_path apps/assets/example_image/sample_00.jpg apps/assets/example_image/sample_01.jpg \
 --n_retry 1 --output_root outputs/imageto3d
 
 # See result(.urdf/mesh.obj/mesh.glb/gs.ply) in ${output_root}/sample_xx/result
 ```
 
+Support the use of [SAM3D](https://github.com/facebookresearch/sam-3d-objects) or [TRELLIS](https://github.com/microsoft/TRELLIS) as 3D generation model, modify `IMAGE3D_MODEL` in `embodied_gen/scripts/imageto3d.py` to switch model.
+
 ---
 
 
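The README line added in this hunk says to edit `IMAGE3D_MODEL` in `embodied_gen/scripts/imageto3d.py` to switch between SAM3D and TRELLIS, but the hunk does not show how that constant is consumed. The sketch below illustrates one way a module-level constant could select a backend. The backend functions, the `BACKENDS` registry, the `image_to_3d` helper and the string values `"SAM3D"` and `"TRELLIS"` are all hypothetical placeholders, not the repository's actual code.

```python
from typing import Callable, Dict


# Hypothetical stand-ins for the real pipelines; they only tag their input.
def run_sam3d(image_path: str) -> str:
    return f"sam3d:{image_path}"


def run_trellis(image_path: str) -> str:
    return f"trellis:{image_path}"


# Assumed registry keyed by the value of the switch constant.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "SAM3D": run_sam3d,
    "TRELLIS": run_trellis,
}

# Assumed form of the switch; the real constant's type and values may differ.
IMAGE3D_MODEL = "SAM3D"


def image_to_3d(image_path: str, model: str = IMAGE3D_MODEL) -> str:
    """Dispatch to the selected backend, failing loudly on unknown names."""
    try:
        backend = BACKENDS[model]
    except KeyError:
        raise ValueError(
            f"unknown model {model!r}; expected one of {sorted(BACKENDS)}"
        ) from None
    return backend(image_path)
```

Keeping the choice in one constant, as the README describes, means the command line stays unchanged; a registry like the one assumed here would make an unsupported value fail immediately instead of partway through generation.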
@@ -133,7 +136,7 @@ text3d-cli --prompts "small bronze figurine of a lion" "A globe with wooden base
 Text-to-image model based on the Kolors model.
 ```sh
 bash embodied_gen/scripts/textto3d.sh \
---prompts "small bronze figurine of a lion" "A globe with wooden base and latitude and longitude lines" "橙色电动手钻,有磨损细节" \
+--prompts "A globe with wooden base and latitude and longitude lines" "橙色电动手钻,有磨损细节" \
 --output_root outputs/textto3d_k
 ```
 ps: models with more permissive licenses found in `embodied_gen/models/image_comm_model.py`
@@ -191,7 +194,11 @@ CUDA_VISIBLE_DEVICES=0 scene3d-cli \
 
 <h2 id="articulated-object-generation">⚙️ Articulated Object Generation</h2>
 
-🚧 *Coming Soon*
+See our paper published in NeurIPS 2025.
+[[Arxiv Paper]](https://arxiv.org/abs/2505.20460) |
+[[Gradio Demo]](https://huggingface.co/spaces/HorizonRobotics/DIPO) |
+[[Code]](https://github.com/RQ-Wu/DIPO)
+
 
 <img src="docs/assets/articulate.gif" alt="articulate" style="width: 500px;">
 
@@ -239,6 +246,7 @@ Remove `--insert_robot` if you don't consider the robot pose in layout generatio
 CUDA_VISIBLE_DEVICES=0 nohup layout-cli \
 --task_descs "apps/assets/example_layout/task_list.txt" \
 --bg_list "outputs/bg_scenes/scene_list.txt" \
+--n_image_retry 4 --n_asset_retry 3 --n_pipe_retry 2 \
 --output_root "outputs/layouts_gens" --insert_robot > layouts_gens.log &
 ```
 
@@ -325,7 +333,7 @@ If you use EmbodiedGen in your research or projects, please cite:
 ## 🙌 Acknowledgement
 
 EmbodiedGen builds upon the following amazing projects and models:
-🌟 [Trellis](https://github.com/microsoft/TRELLIS) | 🌟 [Hunyuan-Delight](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-delight-v2-0) | 🌟 [Segment Anything](https://github.com/facebookresearch/segment-anything) | 🌟 [Rembg](https://github.com/danielgatis/rembg) | 🌟 [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4) | 🌟 [Stable Diffusion x4](https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler) | 🌟 [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) | 🌟 [Kolors](https://github.com/Kwai-Kolors/Kolors) | 🌟 [ChatGLM3](https://github.com/THUDM/ChatGLM3) | 🌟 [Aesthetic Score](http://captions.christoph-schuhmann.de/aesthetic_viz_laion_sac+logos+ava1-l14-linearMSE-en-2.37B.html) | 🌟 [Pano2Room](https://github.com/TrickyGo/Pano2Room) | 🌟 [Diffusion360](https://github.com/ArcherFMY/SD-T2I-360PanoImage) | 🌟 [Kaolin](https://github.com/NVIDIAGameWorks/kaolin) | 🌟 [diffusers](https://github.com/huggingface/diffusers) | 🌟 [gsplat](https://github.com/nerfstudio-project/gsplat) | 🌟 [QWEN-2.5VL](https://github.com/QwenLM/Qwen2.5-VL) | 🌟 [GPT4o](https://platform.openai.com/docs/models/gpt-4o) | 🌟 [SD3.5](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) | 🌟 [ManiSkill](https://github.com/haosulab/ManiSkill)
+🌟 [Trellis](https://github.com/microsoft/TRELLIS) | 🌟 [Hunyuan-Delight](https://huggingface.co/tencent/Hunyuan3D-2/tree/main/hunyuan3d-delight-v2-0) | 🌟 [Segment Anything](https://github.com/facebookresearch/segment-anything) | 🌟 [Rembg](https://github.com/danielgatis/rembg) | 🌟 [RMBG-1.4](https://huggingface.co/briaai/RMBG-1.4) | 🌟 [Stable Diffusion x4](https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler) | 🌟 [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) | 🌟 [Kolors](https://github.com/Kwai-Kolors/Kolors) | 🌟 [ChatGLM3](https://github.com/THUDM/ChatGLM3) | 🌟 [Aesthetic Score](http://captions.christoph-schuhmann.de/aesthetic_viz_laion_sac+logos+ava1-l14-linearMSE-en-2.37B.html) | 🌟 [Pano2Room](https://github.com/TrickyGo/Pano2Room) | 🌟 [Diffusion360](https://github.com/ArcherFMY/SD-T2I-360PanoImage) | 🌟 [Kaolin](https://github.com/NVIDIAGameWorks/kaolin) | 🌟 [diffusers](https://github.com/huggingface/diffusers) | 🌟 [gsplat](https://github.com/nerfstudio-project/gsplat) | 🌟 [QWEN-2.5VL](https://github.com/QwenLM/Qwen2.5-VL) | 🌟 [GPT4o](https://platform.openai.com/docs/models/gpt-4o) | 🌟 [SD3.5](https://huggingface.co/stabilityai/stable-diffusion-3.5-medium) | 🌟 [ManiSkill](https://github.com/haosulab/ManiSkill) | 🌟 [SAM3D](https://github.com/facebookresearch/sam-3d-objects)
 
 ---
 
apps/app_style.py

Lines changed: 17 additions & 1 deletion
@@ -1,10 +1,26 @@
+# Project EmbodiedGen
+#
+# Copyright (c) 2025 Horizon Robotics. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+# implied. See the License for the specific language governing
+# permissions and limitations under the License.
+
 from gradio.themes import Soft
 from gradio.themes.utils.colors import gray, neutral, slate, stone, teal, zinc
 
 lighting_css = """
 <style>
 #lighter_mesh canvas {
-filter: brightness(2.0) !important;
+filter: brightness(2.3) !important;
 }
 </style>
 """
