Hi, thanks for such an all-around repo for working with 3DSG planning!
I would like to reproduce the benchmarking results in your repo under the benchmark folder to make sure everything runs properly before testing my own planners. However, during my testing, the planners behave quite differently from what is reported.
As of 07/20/2023, I ran all available planners in pddlgym_planners/__init__.py on the PDDL domain taskographyv2tiny1 with the command python scripts/benchmark/plan.py --domain-name $DOMAIN_NAME --planner $PLANNER. The results are as follows:
FF: error while running
gcc -o ff main.o memory.o output.o parse.o inst_pre.o inst_easy.o inst_hard.o inst_final.o orderings.o relax.o search.o scan-fct_pddl.tab.o scan-ops_pddl.tab.o -Wall -g -std=gnu99 -O6 -lm
/usr/bin/ld: search.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/search.c:110: multiple definition of `lcurrent_goals'; relax.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/relax.c:111: first defined here
/usr/bin/ld: scan-fct_pddl.tab.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/lex-fct_pddl.l:9: multiple definition of `gbracket_count'; main.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/main.c:147: first defined here
collect2: error: ld returned 1 exit status
make: *** [makefile:74: ff] Error 1
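In case it helps triage, here is a minimal sketch that pulls the duplicated symbol names out of the linker output above. The sample text is the two ld lines from my log; the function and constant names are mine. My guess (unverified on my side) is that this is related to GCC 10 and later defaulting to -fno-common, which turns globals defined in several object files into "multiple definition" link errors.

```python
import re

# The two linker lines from the FF build log above.
# ld quotes symbol names as `name' (opening backtick, closing apostrophe).
LINKER_OUTPUT = (
    "/usr/bin/ld: search.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/"
    "site-packages/pddlgym_planners/FF-v2.3/search.c:110: multiple definition of "
    "`lcurrent_goals'; relax.o:/home/fjd/miniconda3/envs/taskographypy37/lib/python3.7/"
    "site-packages/pddlgym_planners/FF-v2.3/relax.c:111: first defined here\n"
    "/usr/bin/ld: scan-fct_pddl.tab.o:/home/fjd/miniconda3/envs/taskographypy37/lib/"
    "python3.7/site-packages/pddlgym_planners/FF-v2.3/lex-fct_pddl.l:9: multiple "
    "definition of `gbracket_count'; main.o:/home/fjd/miniconda3/envs/taskographypy37/"
    "lib/python3.7/site-packages/pddlgym_planners/FF-v2.3/main.c:147: first defined here\n"
)

# Capture whatever sits between the backtick and the closing apostrophe.
DUPLICATE_SYMBOL = re.compile(r"multiple definition of `([^']+)'")


def duplicated_symbols(log: str) -> list:
    """Return the symbol names ld reports as defined more than once, in order."""
    return DUPLICATE_SYMBOL.findall(log)


if __name__ == "__main__":
    for name in duplicated_symbols(LINKER_OUTPUT):
        print(name)
```

Both reported symbols are globals defined in more than one object file, which is the pattern I would expect if the -fno-common guess is right.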
FF-X: the same error as FF
FD-lama-first: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
Cerberus-seq-sat: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
Cerberus-seq-agl: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
DecStar-agl-decoupled: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
lapkt-bfws: slightly different behavior than benchmark/taskographyv2tiny1_bfws. My results:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 40/40 [03:21<00:00, 5.04s/it]
{'failure_rate': 0.0,
'num_node_expansions': 468.48387096774195,
'num_node_expansions_std': 192.6469059835003,
'plan_length': 14.709677419354838,
'plan_length_std': 3.828530825661262,
'search_time': 0.4536315483870968,
'search_time_std': 0.3696494008728636,
'success_rate': 0.775,
'timeout_rate': 0.225,
'total_time': 0.4536315483870968,
'total_time_std': 0.3696494008728636}
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 55/55 [05:57<00:00, 6.51s/it]
{'failure_rate': 0.0,
'num_node_expansions': 573.3225806451613,
'num_node_expansions_std': 338.3147405651472,
'plan_length': 15.32258064516129,
'plan_length_std': 4.394917128465223,
'search_time': 0.5754497419354839,
'search_time_std': 0.8765903350261305,
'success_rate': 0.5636363636363636,
'timeout_rate': 0.43636363636363634,
'total_time': 0.5754497419354839,
'total_time_std': 0.8765903350261305}
Reported in benchmark/taskographyv2tiny1_bfws/taskographyv2tiny1_bfws_test.json:
{
"failure_rate": 0.0,
"num_node_expansions": 609.6279069767442,
"num_node_expansions_std": 339.64208406455214,
"plan_length": 15.55813953488372,
"plan_length_std": 4.15570398469826,
"search_time": 0.8969197023255813,
"search_time_std": 1.3382104019851668,
"success_rate": 0.7818181818181819,
"timeout_rate": 0.21818181818181817,
"total_time": 0.8969197023255813,
"total_time_std": 1.3382104019851668
}
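To make the gap concrete, the rates can be converted back into solved-problem counts and relative differences. This is a small sketch, assuming my second run above (55 problems, per the progress bar) covers the same test split as the reported JSON; the variable and function names are mine.

```python
# Metrics copied from my 55-problem run and from the reported JSON above.
mine = {
    "success_rate": 0.5636363636363636,
    "timeout_rate": 0.43636363636363634,
    "plan_length": 15.32258064516129,
    "num_node_expansions": 573.3225806451613,
    "search_time": 0.5754497419354839,
}
reported = {
    "success_rate": 0.7818181818181819,
    "timeout_rate": 0.21818181818181817,
    "plan_length": 15.55813953488372,
    "num_node_expansions": 609.6279069767442,
    "search_time": 0.8969197023255813,
}

# Assumption: the reported test split also contains 55 problems.
NUM_PROBLEMS = 55


def solved_count(metrics: dict, num_problems: int) -> int:
    """Convert a success rate back into a number of solved problems."""
    return round(metrics["success_rate"] * num_problems)


def relative_diff(value: float, reference: float) -> float:
    """Signed difference of value from reference, as a fraction of reference."""
    return (value - reference) / reference


if __name__ == "__main__":
    print("solved (mine):    ", solved_count(mine, NUM_PROBLEMS))
    print("solved (reported):", solved_count(reported, NUM_PROBLEMS))
    for key in mine:
        print(f"{key}: {relative_diff(mine[key], reported[key]):+.1%}")
```

Under that assumption, my run solves 31 of 55 problems while the reported numbers correspond to 43 of 55, so the main difference is in how many problems time out rather than in the quality of the plans that are found.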
FD-seq-opt-lmcut: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
Delfi: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
DecStar-opt-decoupled: plan failure
{'failure_rate': 1.0,
'num_node_expansions': nan,
'num_node_expansions_std': nan,
'plan_length': nan,
'plan_length_std': nan,
'search_time': nan,
'search_time_std': nan,
'success_rate': 0.0,
'timeout_rate': 0.0,
'total_time': nan,
'total_time_std': nan}
I followed the installation instructions at https://github.com/taskography/taskography-api#installation with only a few changes to fix some errors:
0. Ubuntu 22.04.
1. Create an empty conda env with python=3.7.
2. Add a comma `,` at the end of the line (Line 26 in bcb47fc, `"tqdm"`) to separate the two lines.
3. Run `pip install -e .` and `pip install -r requirements.txt`.
4. Downgrade `importlib-metadata` from 6.7.0 to 4.12.0 to avoid the error `'EntryPoints' object has no attribute 'get'`. Source: https://stackoverflow.com/questions/73929564/entrypoints-object-has-no-attribute-get-digital-ocean
5. Move `from __future__ import annotations` to the first line to avoid the error `from __future__ imports must occur at the beginning of the file`. Source: https://stackoverflow.com/questions/38688504/from-future-imports-must-occur-at-the-beginning-of-the-file-what-defines
6. Run `scripts/validate/loader.py` and `scripts/validate/taskography_env.py`; both pass.
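For anyone hitting the same `'EntryPoints' object has no attribute 'get'` error, here is a rough sketch of how the installed importlib-metadata version can be checked before deciding to downgrade. It assumes the dict-style `EntryPoints.get` interface was dropped in importlib-metadata 5.0 (my reading of its changelog); the helper names are mine.

```python
import re

try:
    # Standard library on Python 3.8+.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Python 3.7 only has the importlib-metadata backport.
    from importlib_metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple:
    """Turn a version string such as '6.7.0' into a tuple of ints, e.g. (6, 7, 0)."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def lacks_entrypoints_get(version_text: str) -> bool:
    # Assumption: EntryPoints.get is unavailable from importlib-metadata 5.0 onwards.
    return parse_version(version_text) >= (5, 0, 0)


if __name__ == "__main__":
    try:
        installed = version("importlib-metadata")
    except PackageNotFoundError:
        print("importlib-metadata backport is not installed")
    else:
        if lacks_entrypoints_get(installed):
            print(f"importlib-metadata {installed}: EntryPoints.get is likely missing")
        else:
            print(f"importlib-metadata {installed}: EntryPoints.get should be available")
```

With the versions from my setup, 6.7.0 is flagged and 4.12.0 is not, which matches what I observed.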
I'm happy to provide more details if needed. I would highly appreciate any help, as a solid benchmark is a prerequisite for any future research. Thanks in advance!