
Commit 8bb5c22

author
muzhancun
committed
[Doc] Update STEVE-1 doc
1 parent 1c1f7fd commit 8bb5c22


4 files changed

+581
-360
lines changed

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
.. _inference-steve:

Tutorial: Inference with STEVE-1
---------------------------------

To run inference with STEVE-1, you first need the pretrained checkpoints; the example below loads them with ``SteveOnePolicy.from_pretrained``.
The example code is provided in ``minestudio/tutorials/inference/evaluate_steve/main.py``.

.. dropdown:: Evaluating STEVE-1

    .. code-block:: python

        from minestudio.simulator.callbacks import MinecraftCallback
        from minestudio.models import SteveOnePolicy
        from minestudio.simulator import MinecraftSim
        from minestudio.simulator.callbacks import SpeedTestCallback, load_callbacks_from_config
        from minestudio.inference import EpisodePipeline, MineGenerator, InfoBaseFilter
        from minestudio.benchmark import prepare_task_configs

        import ray
        from functools import partial
        from rich import print

        class CommandCallback(MinecraftCallback):
            """
            SteveOnePolicy requires the observation to contain a condition.
            This callback adds it after every reset and every step.
            """
            def __init__(self, command, cond_scale = 4.0):
                self.command = command
                self.cond_scale = cond_scale

            def after_reset(self, sim, obs, info):
                self.timestep = 0
                # Attach the text command to the first observation of the episode.
                obs["condition"] = {
                    "cond_scale": self.cond_scale,
                    "text": self.command
                }
                return obs, info

            def after_step(self, sim, obs, reward, terminated, truncated, info):
                # Attach the same command to every subsequent observation.
                obs["condition"] = {
                    "cond_scale": self.cond_scale,
                    "text": self.command
                }
                return obs, reward, terminated, truncated, info


        if __name__ == '__main__':
            ray.init()
            task_configs = prepare_task_configs("simple")
            config_file = task_configs["collect_wood"]
            # you can try: survive_plant, collect_wood, build_pillar, ... ; make sure the config file contains `reference_video` field
            print(config_file)

            env_generator = partial(
                MinecraftSim,
                obs_size = (224, 224),
                preferred_spawn_biome = "forest",
                callbacks = [
                    SpeedTestCallback(50),
                    CommandCallback("mine log", cond_scale=4.0), # Add a command callback for SteveOnePolicy
                ] + load_callbacks_from_config(config_file)
            )

            agent_generator = lambda: SteveOnePolicy.from_pretrained("CraftJarvis/MineStudio_STEVE-1.official")

            worker_kwargs = dict(
                env_generator=env_generator,
                agent_generator=agent_generator,
                num_max_steps=600,
                num_episodes=1,
                tmpdir="./output",
                image_media="h264",
            )

            pipeline = EpisodePipeline(
                episode_generator=MineGenerator(
                    num_workers=1,
                    num_gpus=0.25,
                    max_restarts=3,
                    **worker_kwargs,
                ),
                episode_filter=InfoBaseFilter(
                    key="mine_block",
                    regex=".*log.*",
                    num=1,
                ),
            )
            summary = pipeline.run()
            print(summary)

Since STEVE-1 is a text-conditioned policy, a text command must be provided to guide the agent's behavior.
Supported tasks and their configs can be found in ``minestudio/benchmark/task_configs``; the benchmarking tutorial explains them in detail.

To pass text commands to STEVE-1, the example implements a ``CommandCallback`` for the environment.
The ``CommandCallback`` adds a ``condition`` field to the observation that contains:

- ``cond_scale``: A scaling factor for the conditioning (default: 4.0)
- ``text``: The textual command describing the desired behavior

The callback writes the command into the ``'condition'`` field of the observation after every reset and after every step.
The policy therefore receives the same command in every observation of the episode, which gives the agent consistent guidance.
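
The shape of the ``condition`` field can be checked without launching Minecraft. The snippet below is a minimal sketch: ``attach_condition`` is a hypothetical helper (not part of MineStudio) that mirrors what ``CommandCallback`` does in ``after_reset`` and ``after_step``, and the ``"image"`` entries are placeholder values.

```python
def attach_condition(obs, command, cond_scale=4.0):
    """Return a copy of `obs` carrying the condition dict used in the tutorial."""
    conditioned = dict(obs)  # shallow copy so the caller's dict is not modified
    conditioned["condition"] = {"cond_scale": cond_scale, "text": command}
    return conditioned


# Placeholder observations standing in for what the simulator would return.
reset_obs = attach_condition({"image": "frame-0"}, "mine log")
step_obs = attach_condition({"image": "frame-1"}, "mine log")

# Both observations carry the same command and scale.
print(reset_obs["condition"])
print(step_obs["condition"] == reset_obs["condition"])
```

Because the same dictionary is rebuilt at every step, changing the command mid-episode would only require updating the stored command before the next step.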

For the inference pipeline, we need to specify:

- The task config and the text command for the ``env_generator``.
- The pretrained checkpoint for the ``agent_generator``.
- The rollout steps (``num_max_steps``), number of episodes (``num_episodes``), and output path (``tmpdir``) in ``worker_kwargs``.
- The number of workers and GPUs for ``MineGenerator``.
- An ``episode_filter`` that selects episodes by a key and a regex pattern (here an ``InfoBaseFilter``).

In the example above, STEVE-1 is evaluated on the wood-collection task with the command "mine log", for 1 episode of 600 steps.
One worker is used, with 0.25 GPU allocated to it.
Episodes are filtered on the key ``mine_block`` with the regex pattern ``.*log.*``.
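
To make the filter settings concrete, here is a rough sketch of how a key/regex/count rule such as ``key="mine_block"``, ``regex=".*log.*"``, ``num=1`` could be evaluated. It is an illustration only: the layout of ``info``, the block names, and the "count at least ``num``" rule are assumptions, and ``count_matches`` is a hypothetical helper rather than the actual ``InfoBaseFilter`` implementation.

```python
import re


def count_matches(info, key="mine_block", pattern=".*log.*"):
    """Sum the counters under info[key] whose names fully match `pattern`."""
    counters = info.get(key, {})
    return sum(
        count for name, count in counters.items()
        if re.fullmatch(pattern, name)
    )


# Hypothetical final `info` of an episode; the block names are illustrative.
info = {"mine_block": {"oak_log": 2, "dirt": 5, "birch_log": 1}}

matched = count_matches(info)
success = matched >= 1  # assumed meaning of num=1 in the example
print(matched, success)
```

Under these assumptions, an episode in which the agent mined only dirt would not count as a success, since no block name matches ``.*log.*``.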

For common text commands for different tasks, refer to the original STEVE-1 paper [1]_.

The conditioning scale (``cond_scale``) controls how strongly the text command influences the agent's behavior:

- Higher values (e.g., 6.0-8.0) make the agent follow commands more strictly
- Lower values (e.g., 2.0-4.0) allow more exploration while still following the general command
- The default value of 4.0 provides a good balance for most tasks
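
One way to build intuition for the scale is classifier-free guidance, in which conditional and unconditional action logits are blended as ``(1 + scale) * cond - scale * uncond``. The sketch below uses that common form with made-up logits; ``guided_logits`` is a hypothetical helper, and whether MineStudio applies ``cond_scale`` in exactly this way is an assumption.

```python
def guided_logits(cond_logits, uncond_logits, scale):
    """Blend conditional and unconditional logits, classifier-free-guidance style."""
    return [
        (1.0 + scale) * c - scale * u
        for c, u in zip(cond_logits, uncond_logits)
    ]


# Made-up logits for two actions; the command favors the first action.
cond = [1.0, 0.0]
uncond = [0.5, 0.5]

no_guidance = guided_logits(cond, uncond, 0.0)  # reduces to the conditional logits
with_guidance = guided_logits(cond, uncond, 4.0)

print(no_guidance)
print(with_guidance)
```

With a scale of 0 the blend equals the conditional logits, and larger scales widen the gap between the action the command favors and the rest, which matches the "follows commands more strictly" behavior described above.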

After the pipeline finishes, it prints a summary to the console that shows the number of episodes and the success rate, like the following:

.. code-block:: text

    ...

    (Worker pid=922019) Episode 0 saved at output/episode_0.mp4
    (Worker pid=922019) Speed Test Status:
    (Worker pid=922019) Average Time: 0.04
    (Worker pid=922019) Average FPS: 24.28
    (Worker pid=922019) Total Steps: 600
    {'num_yes': 1, 'num_episodes': 1, 'yes_rate': '100.00%'}

.. [1] Lifshitz S, Paster K, Chan H, et al. STEVE-1: A generative model for text-to-behavior in Minecraft[J]. Advances in Neural Information Processing Systems, 2024, 36.

docs/source/inference/index.md

Lines changed: 2 additions & 1 deletion
@@ -1,7 +1,7 @@
 <!--
  * @Date: 2024-11-29 08:10:04
  * @LastEditors: muzhancun muzhancun@stu.pku.edu.cn
- * @LastEditTime: 2025-05-29 13:28:17
+ * @LastEditTime: 2025-06-02 14:21:12
  * @FilePath: /MineStudio/docs/source/inference/index.md
 -->
 # Inference
@@ -17,6 +17,7 @@ We highly recommend readers to read the ray documentation before using the infer

 baseline-vpt
 baseline-groot
+baseline-steve1
 ```

 ## Quick Start
