How to fix evaluation trajectories #27

@tabz23

Description

I have tried the following change:
def rollout(self):
    """
    Evaluates the performance of the model on a single episode.
    """
    episode_ret, episode_cost, episode_len = 0.0, 0.0, 0

    obs, info = self.env.reset(seed=1)

TypeError: reset() got an unexpected keyword argument 'seed'
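My guess is that this is an API difference between the older Gym interface and Gymnasium. A minimal sketch of what I mean, assuming the wrapped environment follows the older Gym API (the "CartPole-v1" id is only for illustration, not the env I am actually using):

import gym

env = gym.make("CartPole-v1")
env.seed(1)                 # old Gym: seed the environment separately
env.action_space.seed(1)    # also seed action-space sampling
obs = env.reset()           # old Gym reset() takes no seed and returns obs only

# Gymnasium-style equivalent, for comparison:
# import gymnasium
# env = gymnasium.make("CartPole-v1")
# obs, info = env.reset(seed=1)   # Gymnasium: the seed is passed to reset()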

What exactly is the source of randomness when re-running the same evaluation command? How can I always reproduce the same experiment and get the same average reward/cost metrics for evaluation? I want to compare the baseline against a small modification I made, but running the evaluation script each time produces very different results, so I can't use them for comparison.

This is the command I am using:
python "OSRL/examples/research/check/eval_bc" --device="mps" --path "OSRL/logs/OfflineSwimmerVelocityGymnasium-v1-cost-20/BC-safe_bc_modesafe_cost20_seed20-2180/BC-safe_bc_modesafe_cost20_seed20-2180" --eval_episode 1
