Multi-Agent Perceiver Critic for Robotic Warehouse (RWARE) Coordination.

imbulana/multi-agent-perceiver

Multi-Agent Perceiver [1][2] Critic for Robotic Warehouse (RWARE) tasks, implemented in pymarlzooplus [3].
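As background, here is a conceptual sketch (not this repo's code) of the core Perceiver idea behind the critic: a fixed-size latent array cross-attends over a variable-length set of agent observation tokens, so the same critic can digest 2-agent or 4-agent inputs into a fixed-size representation. Projections are omitted for brevity.

```python
import numpy as np

def cross_attention(latents, tokens):
    """softmax(latents @ tokens^T / sqrt(d)) @ tokens, with identity Q/K/V projections."""
    d = latents.shape[1]
    scores = latents @ tokens.T / np.sqrt(d)          # (L, T) attention scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
    return weights @ tokens                            # (L, d) updated latents

rng = np.random.default_rng(0)
latents = rng.normal(size=(8, 16))   # fixed-size latent array
obs_4ag = rng.normal(size=(4, 16))   # observation embeddings for 4 agents
obs_2ag = rng.normal(size=(2, 16))   # observation embeddings for 2 agents

print(cross_attention(latents, obs_4ag).shape)  # (8, 16)
print(cross_attention(latents, obs_2ag).shape)  # (8, 16): same latent size either way
```

The output shape depends only on the latent array, not the number of agents, which is what makes a Perceiver-style critic convenient across RWARE task sizes.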

Setup

See instructions for Docker here. Alternatively, create a conda environment with Python 3.11:

conda create -n map python=3.11
conda activate map

Install the project (see common installation issues for pymarlzooplus here):

pip install -e .

Usage

Train

Launch a training run (override the config args on the command line, or edit the config files in the config directory):

python pymarlzooplus/main.py \
    --config=map_dec \
    --env-config=gymma \
    with \
    env_args.time_limit=500 \
    env_args.key="rware:rware-tiny-4ag-hard-v1" \
    env_args.seed=742

# env_args.key options:
# [rware:rware-small-4ag-hard-v1, rware:rware-tiny-4ag-hard-v1, rware:rware-tiny-2ag-hard-v1]
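To run the same training command over several seeds, a small launcher script can help. This is a hypothetical helper, not part of the repo; it is shown as a dry run that only prints each command (uncomment the `subprocess` line to actually launch, from the repo root):

```python
import shlex
import subprocess

SEEDS = [742, 743, 744]  # illustrative seed choices

def build_cmd(seed: int) -> str:
    """Assemble the training command from the README for a given seed."""
    return (
        "python pymarlzooplus/main.py --config=map_dec --env-config=gymma with "
        "env_args.time_limit=500 "
        'env_args.key="rware:rware-tiny-4ag-hard-v1" '
        f"env_args.seed={seed}"
    )

for seed in SEEDS:
    cmd = build_cmd(seed)
    print(cmd)  # dry run: print only
    # subprocess.run(shlex.split(cmd), check=True)  # uncomment to launch sequentially
```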

Eval

Evaluate a saved checkpoint

python pymarlzooplus/main.py \
    --config=map_dec \
    --env-config=gymma \
    with \
    env_args.key="rware:rware-tiny-4ag-hard-v1" \
    env_args.time_limit=500 \
    checkpoint_path="pymarlzooplus/results/sacred/map_dec/rware:rware-tiny-4ag-hard-v1/1/models" \
    evaluate=True \
    load_step=100000 \
    test_nepisode=100

# load_step: load the model trained for this many timesteps (0 selects the checkpoint with the most timesteps)
# checkpoint_path: Load model from this path

Render

To see a trained policy in action, run

python pymarlzooplus/main.py \
    --config=map_dec \
    --env-config=gymma \
    with \
    env_args.key="rware:rware-tiny-4ag-hard-v1" \
    env_args.time_limit=500 \
    checkpoint_path="pymarlzooplus/results/sacred/map_dec/rware:rware-tiny-4ag-hard-v1/0/models" \
    load_step=0 \
    evaluate=True render=True render_sleep_time=0.4

# render_sleep_time: sleep time between renders (only when render == True)
# load_step: load the model trained for this many timesteps (0 selects the checkpoint with the most timesteps)
# checkpoint_path: Load model from this path

Results (WIP)

Single-seed runs with environment time_limit=500.

| Task | tiny-2ag-hard | tiny-4ag-hard | small-4ag-hard |
| --- | --- | --- | --- |
| Mean Episodic Return | 17.07 ± 4.62 | 41.74 ± 5.00 | 20.06 ± 3.71 |
| Configuration | link | link | link |
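For reference, a mean ± std over evaluation episodes can be computed as in this sketch (the `returns` array is hypothetical, not the logged data behind the table):

```python
import numpy as np

# Hypothetical per-episode returns from one evaluation run (illustrative only)
returns = np.array([15.0, 22.0, 11.0, 20.0])
mean, std = returns.mean(), returns.std()  # population std over the episodes
print(f"{mean:.2f} ± {std:.2f}")  # 17.00 ± 4.30
```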

References

This implementation is largely adapted from the following repos:

perceiver-pytorch: for Perceiver IO implementation

pymarlzooplus: for training/benchmarking

Footnotes

  1. Jaegle, A., Gimeno, F., Brock, A., Zisserman, A., Vinyals, O., & Carreira, J. (2021). Perceiver: General Perception with Iterative Attention. arXiv:2103.03206

  2. Jaegle, A., Borgeaud, S., Alayrac, J.-B., Doersch, C., Ionescu, C., Ding, D., Koppula, S., Zoran, D., Brock, A., Shelhamer, E., Hénaff, O., Botvinick, M. M., Zisserman, A., Vinyals, O., & Carreira, J. (2022). Perceiver IO: A General Architecture for Structured Inputs & Outputs. arXiv:2107.14795

  3. Papadopoulos, G., Kontogiannis, A., Papadopoulou, F., Poulianou, C., Koumentis, I., & Vouros, G. (2025). An Extended Benchmarking of Multi-Agent Reinforcement Learning Algorithms in Complex Fully Cooperative Tasks. arXiv:2502.04773
