Open
Description
In agent_loop in evaluation.py (with the fix for #43 applied):

    [...]
    while episode < EVAL_EPISODES:
        # check if env needs reset
        if env.done:
            print('Episode %d (%.2f)%%' % (episode, (episode / EVAL_EPISODES) * 100.))

            # **** fixes #43
            if type(agent.current_agent) == RandomAgent:
                agent_type = PigChaseEnvironment.AGENT_TYPE_1
            else:
                agent_type = PigChaseEnvironment.AGENT_TYPE_2
            obs = env.reset(agent_type)
            # ****
            while obs is None:
                # this can happen if the episode ended with the first
                # action of the other agent
                print('Warning: received obs == None.')
                obs = env.reset()

            episode += 1

        # select an action
        action = agent.act(obs, reward, agent_done, is_training=True)
        # take a step
        obs, reward, agent_done = env.do(action)

Since agent.act([...], agent_done=True, [...]) resets the current_agent used, it needs to happen before env.reset(agent_type). Otherwise the appearance of the agent is based on the agent type used during the previous episode.
However, simply moving the agent.act([...]) call above the if env.done check causes an action selected in the previous episode to be sent as the first action of the subsequent episode.
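
One possible ordering that avoids both problems is sketched below. It is only an illustration under stated assumptions, not the repository's code: StubAgent, StubEnv, agent_loop_sketch and the two AGENT_TYPE placeholders are hypothetical stand-ins, and the sketch assumes it is acceptable to call agent.act once with agent_done=True purely to let the agent switch current_agent, discarding the returned action. Whether that extra call is harmless for training in the real agent is not verified here.

```python
import itertools

# Placeholder constants standing in for PigChaseEnvironment.AGENT_TYPE_1/2.
AGENT_TYPE_1, AGENT_TYPE_2 = "type_1", "type_2"


class StubAgent:
    """Hypothetical meta-agent: switches sub-agent whenever it sees done=True."""

    def __init__(self):
        self._kinds = itertools.cycle(["random", "focused"])
        self.current_agent = next(self._kinds)

    def act(self, obs, reward, done, is_training=False):
        if done:
            self.current_agent = next(self._kinds)
        # The "action" records which sub-agent chose it and from which obs.
        return (self.current_agent, obs)


class StubEnv:
    """Hypothetical environment: every episode lasts a fixed number of steps."""

    def __init__(self, steps_per_episode=2):
        self.done = True
        self.log = []  # one (appearance, action) pair per step taken
        self._episode = 0
        self._steps = 0
        self._steps_per_episode = steps_per_episode
        self._appearance = None

    def reset(self, agent_type=AGENT_TYPE_1):
        self._episode += 1
        self._steps = 0
        self._appearance = agent_type
        self.done = False
        return "ep%d/obs0" % self._episode

    def do(self, action):
        self.log.append((self._appearance, action))
        self._steps += 1
        self.done = self._steps >= self._steps_per_episode
        return "ep%d/obs%d" % (self._episode, self._steps), 0.0, self.done


def agent_loop_sketch(agent, env, eval_episodes):
    obs, reward, agent_done = None, 0.0, False
    episode = 0
    while episode < eval_episodes:
        if env.done:
            if agent_done:
                # Let the agent see the terminal transition first, so that
                # current_agent is already updated; the action is discarded.
                agent.act(obs, reward, agent_done, is_training=True)
                reward, agent_done = 0.0, False
            if agent.current_agent == "random":
                agent_type = AGENT_TYPE_1
            else:
                agent_type = AGENT_TYPE_2
            obs = env.reset(agent_type)
            episode += 1
        # The first action of an episode is now chosen from the fresh obs.
        action = agent.act(obs, reward, agent_done, is_training=True)
        obs, reward, agent_done = env.do(action)
    return env.log
```

In this sketch the appearance passed to env.reset always matches the sub-agent that acts in that episode, and no action computed from an old observation is carried over. It does not cover the case where the episode is ended by the other agent while agent_done is still False.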