A multi-agent Q-learning system with sparse cooperation
Currently, the best joint action is computed by brute force rather than with the elimination algorithm: the payoff of every combination of the agents' actions is computed and the best one is selected.
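The brute-force search described above can be sketched as follows. This is a minimal illustration and not the code in this repository: the names `best_joint_action`, `toy_payoff` and `MOVES` are hypothetical, and the payoff function is a made-up stand-in for the learned Q-values.

```python
from itertools import product


def best_joint_action(actions_per_agent, payoff):
    """Return (joint_action, value) with the highest payoff, by exhaustive search.

    actions_per_agent: one list of available actions per agent.
    payoff: callable mapping a joint action (a tuple) to a number.
    """
    best_action, best_value = None, float("-inf")
    # Enumerate every combination of individual actions (one per agent).
    for joint_action in product(*actions_per_agent):
        value = payoff(joint_action)
        if value > best_value:
            best_action, best_value = joint_action, value
    return best_action, best_value


# Toy example (assumption): two predators, each choosing among five moves.
MOVES = ["stay", "north", "south", "east", "west"]


def toy_payoff(joint_action):
    # Hypothetical payoff: reward the predators for picking different moves,
    # with a small bonus when the first one goes north.
    a, b = joint_action
    return (1.0 if a != b else 0.0) + (0.5 if a == "north" else 0.0)


action, value = best_joint_action([MOVES, MOVES], toy_payoff)
print(action, value)
```

The number of combinations grows exponentially with the number of agents (5^n for n agents with five moves each), which is why this approach is only practical for small teams.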
- learn mode
Let the agents learn a policy over n episodes
python3 main.py learn [directory] -e episode -g grid -v
- directory : directory to store the rules file
- episode : number of episodes
- grid : grid size of the prey-predators game
- play mode
Play the game with a learned policy
python3 main.py play [directory] -g grid
- directory : directory to store the rules file
- grid : grid size of the prey-predators game
- test mode
Test the performance of the learning
python3 main.py test [directory] -e episode -r run -g grid -v
- episode : number of episodes
- run : number of runs
- grid : grid size of the prey-predators game


