Connect-X-AlphaZero

This repository presents a suite of artificial intelligence agents designed to master the game of Connect 4. The project was developed in the context of Kaggle's Connect X competition, with the aim of studying, implementing, and benchmarking different families of algorithms, ranging from classical search methods to state-of-the-art deep reinforcement learning.

All of the agents were submitted to the official Kaggle competition, which yielded revealing insights into the effectiveness of each approach. Most notably, AlphaZero stands out, reaching the global top 10. The classical search algorithms, MCTS and Minimax, also proved surprisingly effective, significantly outperforming modern RL architectures such as PPO and Rainbow and securing solid positions on the global leaderboard.

The classical search agents, MCTS and Minimax, are built on pyplai, a library I previously developed, which is available on my GitHub and installable via pip. Combined with custom heuristics for Connect 4, they demonstrate that in perfect-information games with manageable state spaces, raw computation and explicit strategic planning remain extremely competitive without any neural network.
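To illustrate this classical approach, the sketch below shows a depth-limited negamax search with alpha-beta pruning and a simple center-control heuristic for Connect 4. It is a self-contained toy with invented helper names, not the actual pyplai API or the heuristics used in the repository.

```python
ROWS, COLS = 6, 7

def legal_moves(board):
    # A column is playable while its top cell is empty.
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    # Return a new board with the piece dropped into `col`.
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):
        if new[r][col] == 0:
            new[r][col] = player
            return new

def winner(board, player):
    # Check every cell and direction for four in a row.
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                if all(0 <= r + i * dr < ROWS and 0 <= c + i * dc < COLS
                       and board[r + i * dr][c + i * dc] == player
                       for i in range(4)):
                    return True
    return False

def heuristic(board, player):
    # Toy evaluation: pieces in central columns are worth more.
    weights = [1, 2, 3, 4, 3, 2, 1]
    return sum(weights[c] * (1 if board[r][c] == player
                             else -1 if board[r][c] else 0)
               for r in range(ROWS) for c in range(COLS))

def negamax(board, depth, alpha, beta, player):
    opponent = 3 - player
    if winner(board, opponent):
        return -10_000 - depth, None   # the previous move already won
    moves = legal_moves(board)
    if depth == 0 or not moves:
        return heuristic(board, player), None
    best_score, best_move = -float("inf"), moves[0]
    for col in moves:
        score, _ = negamax(drop(board, col, player),
                           depth - 1, -beta, -alpha, opponent)
        score = -score
        if score > best_score:
            best_score, best_move = score, col
        alpha = max(alpha, score)
        if alpha >= beta:
            break                      # alpha-beta cutoff
    return best_score, best_move

empty = [[0] * COLS for _ in range(ROWS)]
```

Even at modest depth, a search like this spots immediate wins and forced tactics exactly, which is what makes planning-based agents so dangerous against purely reactive ones.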

On the other hand, the exploration of model-free deep reinforcement learning was conducted with the PPO and Rainbow algorithms. Rainbow, an advanced variant of DQN, combines Double Q-Learning, Dueling Networks, Multi-Step Learning, Noisy Nets, Distributional RL, and Prioritized Experience Replay, and was implemented with the Tianshou framework. While the results were acceptable, training times proved excessive relative to the final competitive performance. This highlighted a key limitation: as reactive algorithms that rely on learned "intuition" without explicit lookahead, they remain vulnerable to precise long-term tactics, where a single mistake leads to defeat.
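Two of these Rainbow enhancements are easy to show in isolation. The snippet below sketches the multi-step return target and the prioritized-replay sampling probabilities with their importance weights, in plain Python; the function names and default hyperparameters are illustrative choices, not Tianshou's API.

```python
def n_step_return(rewards, gamma, bootstrap):
    """Multi-step target: sum of discounted rewards over n steps,
    plus the discounted bootstrap value of the n-th next state."""
    g = sum((gamma ** k) * r for k, r in enumerate(rewards))
    return g + (gamma ** len(rewards)) * bootstrap

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Prioritized replay: sample transitions proportionally to
    |TD error| ** alpha (eps keeps zero-error transitions sampleable)."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def importance_weights(probs, buffer_size, beta=0.4):
    """Importance-sampling weights (N * p) ** -beta that correct the
    bias of non-uniform sampling, normalized by the max for stability."""
    w = [(buffer_size * p) ** (-beta) for p in probs]
    w_max = max(w)
    return [x / w_max for x in w]
```

Multi-step targets propagate reward information faster through the value estimates, while prioritized replay focuses updates on the transitions the network currently predicts worst.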

Finally, the AlphaZero implementation was built entirely from scratch to ensure a deep understanding and tight control of its inner workings. It bridges the best of both previous worlds: a deep neural network supplies the "intuition" characteristic of reinforcement learning, while MCTS validates and plans around those suggestions. This synergy makes the agent the undisputed champion of the repository, significantly outperforming the other implementations and competing at the highest level against the strongest opponents in the competition.
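The coupling between network and search happens in the selection step of MCTS, which in AlphaZero uses the PUCT rule: the network's policy prior steers exploration, while the averaged value from simulations backs up through the tree. A minimal sketch, assuming each child is a (mean value, prior, visit count) tuple and an illustrative c_puct constant:

```python
import math

def puct(q, prior, parent_visits, child_visits, c_puct=1.5):
    # PUCT: Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_child(children, c_puct=1.5):
    # children: list of (q, prior, visits) tuples; return index of best child.
    parent_visits = sum(visits for _, _, visits in children)
    scores = [puct(q, p, parent_visits, n, c_puct) for q, p, n in children]
    return scores.index(max(scores))
```

Early in a search the prior term dominates, so the tree follows the network's suggestions; as visit counts grow, the exploration bonus shrinks and the empirical value Q takes over, which is precisely how the search "validates" the network's intuition.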

About

Reinforcement learning agents for Connect 4, featuring a robust AlphaZero implementation from scratch. Includes Rainbow DQN, PPO, and classic search algorithms such as MCTS and Minimax, with the aim of benchmarking different strategies and agent architectures.
