Connect-X-AlphaZero

This repository presents a suite of artificial intelligence agents designed to master the game of Connect 4. The project was developed in the context of Kaggle's Connect X competition, with the aim of studying, implementing, and benchmarking different families of algorithms, ranging from classical search methods to state-of-the-art deep reinforcement learning.

All of the agents were submitted to the official Kaggle competition, which yielded revealing insights into the effectiveness of each approach. Most notably, AlphaZero stands out, reaching the global top 10. The classical search algorithms, MCTS and Minimax, also proved surprisingly effective, significantly outperforming modern RL architectures such as PPO and Rainbow and securing solid positions on the global leaderboard.

The classical search agents, MCTS and Minimax, are built on pyplai, a library I previously developed, which is available on my GitHub and installable via pip. Combined with custom heuristics for Connect 4, they demonstrate that in perfect-information games with manageable state spaces, raw computation and explicit strategic planning remain extremely competitive without any neural network.
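To illustrate this classical approach, the sketch below shows a depth-limited negamax search with alpha-beta pruning and a simple center-control heuristic for Connect 4. It is a self-contained toy with invented helper names, not the actual pyplai API or the heuristics used in the repository.

```python
ROWS, COLS = 6, 7

def legal_moves(board):
    # A column is playable while its top cell is empty.
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    # Return a new board with the piece dropped into `col`.
    new = [row[:] for row in board]
    for r in range(ROWS - 1, -1, -1):
        if new[r][col] == 0:
            new[r][col] = player
            return new

def winner(board, player):
    # Check every cell and direction for four in a row.
    for r in range(ROWS):
        for c in range(COLS):
            for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
                if all(0 <= r + i * dr < ROWS and 0 <= c + i * dc < COLS
                       and board[r + i * dr][c + i * dc] == player
                       for i in range(4)):
                    return True
    return False

def heuristic(board, player):
    # Toy evaluation: pieces in central columns are worth more.
    weights = [1, 2, 3, 4, 3, 2, 1]
    return sum(weights[c] * (1 if board[r][c] == player
                             else -1 if board[r][c] else 0)
               for r in range(ROWS) for c in range(COLS))

def negamax(board, depth, alpha, beta, player):
    opponent = 3 - player
    if winner(board, opponent):
        return -10_000 - depth, None   # the previous move already won
    moves = legal_moves(board)
    if depth == 0 or not moves:
        return heuristic(board, player), None
    best_score, best_move = -float("inf"), moves[0]
    for col in moves:
        score, _ = negamax(drop(board, col, player),
                           depth - 1, -beta, -alpha, opponent)
        score = -score
        if score > best_score:
            best_score, best_move = score, col
        alpha = max(alpha, score)
        if alpha >= beta:
            break                      # alpha-beta cutoff
    return best_score, best_move

empty = [[0] * COLS for _ in range(ROWS)]
```

Even at modest depth, a search like this spots immediate wins and forced tactics exactly, which is what makes planning-based agents so dangerous against purely reactive ones.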

On the other hand, the exploration of model-free deep reinforcement learning was conducted with the PPO and Rainbow algorithms. Rainbow, an advanced variant of DQN, combines Double Q-Learning, Dueling Networks, Multi-Step Learning, Noisy Nets, Distributional RL, and Prioritized Experience Replay, and was implemented with the Tianshou framework. While the results were acceptable, training times proved excessive relative to the final competitive performance. This highlighted a key limitation: as reactive algorithms that rely on learned "intuition" without explicit lookahead, they remain vulnerable to precise long-term tactics, where a single mistake leads to defeat.
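Two of these Rainbow enhancements are easy to show in isolation. The snippet below sketches the multi-step return target and the prioritized-replay sampling probabilities with their importance weights, in plain Python; the function names and default hyperparameters are illustrative choices, not Tianshou's API.

```python
def n_step_return(rewards, gamma, bootstrap):
    """Multi-step target: sum of discounted rewards over n steps,
    plus the discounted bootstrap value of the n-th next state."""
    g = sum((gamma ** k) * r for k, r in enumerate(rewards))
    return g + (gamma ** len(rewards)) * bootstrap

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Prioritized replay: sample transitions proportionally to
    |TD error| ** alpha (eps keeps zero-error transitions sampleable)."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def importance_weights(probs, buffer_size, beta=0.4):
    """Importance-sampling weights (N * p) ** -beta that correct the
    bias of non-uniform sampling, normalized by the max for stability."""
    w = [(buffer_size * p) ** (-beta) for p in probs]
    w_max = max(w)
    return [x / w_max for x in w]
```

Multi-step targets propagate reward information faster through the value estimates, while prioritized replay focuses updates on the transitions the network currently predicts worst.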

Finally, the AlphaZero implementation was built entirely from scratch to ensure a deep understanding and tight control of its inner workings. It bridges the best of both previous worlds: a deep neural network supplies the "intuition" characteristic of reinforcement learning, while MCTS validates and plans around those suggestions. This synergy makes the agent the undisputed champion of the repository, significantly outperforming the other implementations and competing at the highest level against the strongest opponents in the competition.
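The coupling between network and search happens in the selection step of MCTS, which in AlphaZero uses the PUCT rule: the network's policy prior steers exploration, while the averaged value from simulations backs up through the tree. A minimal sketch, assuming each child is a (mean value, prior, visit count) tuple and an illustrative c_puct constant:

```python
import math

def puct(q, prior, parent_visits, child_visits, c_puct=1.5):
    # PUCT: Q(s, a) + c_puct * P(s, a) * sqrt(N(s)) / (1 + N(s, a))
    return q + c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)

def select_child(children, c_puct=1.5):
    # children: list of (q, prior, visits) tuples; return index of best child.
    parent_visits = sum(visits for _, _, visits in children)
    scores = [puct(q, p, parent_visits, n, c_puct) for q, p, n in children]
    return scores.index(max(scores))
```

Early in a search the prior term dominates, so the tree follows the network's suggestions; as visit counts grow, the exploration bonus shrinks and the empirical value Q takes over, which is precisely how the search "validates" the network's intuition.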

About

Reinforcement learning agents for Connect 4, featuring a robust AlphaZero implementation from scratch. Includes Rainbow DQN, PPO, and classic search algorithms such as MCTS and Minimax, with the aim of benchmarking different strategies and agent architectures.
