
agent_loop in evaluation.py randomizes agent types #40

@philipjhj


In agent_loop in evaluation.py (with a fix for #43 applied):

[...]
while episode < EVAL_EPISODES:
    # check if env needs reset
    if env.done:
        print('Episode %d (%.2f)%%' % (episode, (episode / EVAL_EPISODES) * 100.))

        # **** fixes #43
        if type(agent.current_agent) == RandomAgent:
            agent_type = PigChaseEnvironment.AGENT_TYPE_1
        else:
            agent_type = PigChaseEnvironment.AGENT_TYPE_2

        obs = env.reset(agent_type)
        # ****
        while obs is None:
            # this can happen if the episode ended with the first
            # action of the other agent
            print('Warning: received obs == None.')
            obs = env.reset()

        episode += 1

    # select an action
    action = agent.act(obs, reward, agent_done, is_training=True)
    # take a step
    obs, reward, agent_done = env.do(action)

Because agent.act([...], agent_done=True, [...]) resets the current_agent, that call needs to happen before env.reset(agent_type). Otherwise the agent's appearance is based on the agent type used during the previous episode, not the one that actually acts in the new episode.

Simply moving the agent.act([...]) call above the if env.done check does not solve this: the action selected from the previous episode's final observation would then be sent as the first action of the subsequent episode.
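The ordering problem can be reproduced without Malmo. The following is a minimal sketch, not the repository's code: ToySwitchingAgent and ToyEnv are hypothetical stand-ins, and the sketch assumes the agent picks a new current_agent whenever act() is called with done=True. For simplicity the sub-agent name is passed straight to reset() in place of the AGENT_TYPE_* constants. With notify_before_reset=True, the loop calls act() once with the terminal transition, discards the returned action, and only then resets the environment.

```python
import itertools


class ToySwitchingAgent:
    """Hypothetical stand-in: picks a new sub-agent whenever act() sees done=True."""

    def __init__(self, kinds):
        self._kinds = itertools.cycle(kinds)
        self.current_agent = next(self._kinds)

    def act(self, obs, reward, done, is_training=False):
        if done:
            self.current_agent = next(self._kinds)
        return 0  # dummy action


class ToyEnv:
    """Hypothetical stand-in: every episode lasts exactly two steps."""

    def __init__(self):
        self.done = True
        self.appearance = None
        self._steps = 0

    def reset(self, agent_type):
        self.appearance = agent_type  # appearance is fixed at reset time
        self.done = False
        self._steps = 0
        return "obs"

    def do(self, action):
        self._steps += 1
        self.done = self._steps >= 2
        return "obs", 0.0, self.done


def run(episodes, notify_before_reset):
    """Return one (appearance, acting sub-agent) pair per environment step."""
    agent, env = ToySwitchingAgent(["random", "focused"]), ToyEnv()
    obs, reward, agent_done = None, 0.0, False
    log, episode = [], 0
    while True:
        if env.done:
            if episode == episodes:
                break
            if notify_before_reset and agent_done:
                # Let the agent see the terminal transition (and switch
                # sub-agent) first; the returned action is discarded.
                agent.act(obs, reward, agent_done)
                reward, agent_done = 0.0, False
            obs = env.reset(agent.current_agent)
            episode += 1
        action = agent.act(obs, reward, agent_done)
        log.append((env.appearance, agent.current_agent))
        obs, reward, agent_done = env.do(action)
    return log


if __name__ == "__main__":
    for fixed in (False, True):
        mismatches = sum(a != b for a, b in run(3, fixed))
        print("notify_before_reset=%s: %d mismatched steps" % (fixed, mismatches))
```

In the original ordering, every episode after the first is rendered with the previous sub-agent's appearance. Whether an extra act() call with a discarded action is acceptable for the real agent (for example, if act() also records statistics) is not something this sketch can answer.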
