New Random Tower Defense (SC2 Arcade) – AlphaZero-style Prototype 🤖

Status: Experimental Prototype
This repository is an AI-assisted “vibe coding” experiment, not a finished or polished game AI.

What This Is (and What It Is Not)

This repository contains a small Python prototype that applies an AlphaZero-style search (MCTS + neural network) to a simplified, offline version of the StarCraft II arcade map “New Random Tower Defense”.

To avoid confusion:

- It is not a general StarCraft II reinforcement learning agent.
- It does not connect to the live SC2 client or play ladder games.
- It does not represent my low-level implementation skills in RL or deep learning.

Instead, this code runs a standalone grid-based simulation that imitates some of the decision structure of the arcade tower defense game (e.g. tower placement along fixed mob paths), in order to test an AlphaZero-style loop in a toy setting.

🤖 AI-Assisted “Vibe Coding” Disclaimer

This project was built as an explicit AI-assisted “vibe coding” experiment:

Most of the concrete Python code was written by AI (LLMs), including:
- MCTS scaffolding
- Environment wrappers
- Training loop boilerplate
- Model class structure
My contribution is primarily conceptual and architectural:
- Describing the mechanics and constraints of New Random Tower Defense
- Proposing state and action representations (grid, lanes, tower slots)
- Designing and iterating on reward ideas
- Steering and editing AI-generated code to roughly match those ideas

Please do not treat this repository as a portfolio of “pure hand-written RL code.”
It is closer to a log of how I used AI tools to prototype and explore an idea.

Overview of the Prototype

The main goals of this prototype are:

- to represent tower-defense states on a discrete grid,
- to use Monte Carlo Tree Search (MCTS) guided by a neural network,
- and to see whether AlphaZero-style self-play can discover sensible tower placement policies in a simplified offline environment.
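
As a rough illustration of the first goal, a grid state with tower slots might be represented as below. This is a minimal sketch, not the code in alpha_env.py: the GridState class, its fields, and the (row, col) action encoding are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple

Cell = Tuple[int, int]  # (row, col) on the grid


@dataclass(frozen=True)
class GridState:
    """Hypothetical toy state: where towers may go and where they already are."""

    rows: int
    cols: int
    slots: FrozenSet[Cell]                       # cells where a tower may be built
    towers: FrozenSet[Cell] = field(default_factory=frozenset)

    def legal_actions(self) -> List[Cell]:
        """Every free tower slot is one legal placement action."""
        return sorted(self.slots - self.towers)

    def place(self, cell: Cell) -> "GridState":
        """Return a new state with a tower added; the original is unchanged."""
        if cell not in self.slots or cell in self.towers:
            raise ValueError(f"cannot place a tower at {cell}")
        return GridState(self.rows, self.cols, self.slots, self.towers | {cell})


# Example: a 3x3 grid with two tower slots next to a middle lane.
start = GridState(rows=3, cols=3, slots=frozenset({(0, 1), (2, 1)}))
after_one = start.place((0, 1))
```

Keeping the state immutable means a tree search can branch from the same state many times without copying it defensively.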

It should be viewed as:

- experimental,
- unstable,
- and primarily educational / exploratory.

File Structure

Core files:

alpha_mcts.py
Monte Carlo Tree Search implementation:
- selection
- expansion
- simulation
- backpropagation
alpha_env.py
Custom environment wrapper:
- defines the grid-based state and action space
- encodes a simplified tower-defense reward
- tracks mob progress along lanes
alpha_model.py
Neural network used by MCTS:
- policy head (action probabilities over legal tower placements)
- value head (estimated outcome of a state)
alpha_train.py
AlphaZero-style training loop:
- runs self-play episodes using MCTS + current network
- stores game histories
- periodically updates the network from collected data
alpha_rtd.py
Example script for running the agent in a specific Random Tower Defense scenario.
alpha_common.py
Shared helpers:
- loading grid/waypoint CSVs
- configuration utilities
config.json
Central configuration for hyperparameters (learning rate, number of simulations, etc.).
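
The MCTS phases listed for alpha_mcts.py can be sketched as follows. This is a hedged illustration rather than the repository's implementation: Node, puct_score, run_mcts and the callback signatures are hypothetical names, the selection rule is the PUCT formula commonly used in AlphaZero-style search, and the simulation phase is replaced here by a single evaluation call standing in for the network. The problem is treated as single-agent, so values are not sign-flipped during backpropagation.

```python
import math
from typing import Dict


class Node:
    """Search-tree node holding a prior and running value statistics."""

    def __init__(self, prior: float) -> None:
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0
        self.children: Dict[int, "Node"] = {}

    def value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Mean value plus a prior-weighted exploration bonus."""
    bonus = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.value() + bonus


def run_mcts(root_state, legal_actions, step, evaluate, n_simulations=50):
    """Return the visit count of each root action after the search."""
    root = Node(prior=1.0)
    for _ in range(n_simulations):
        node, state, path = root, root_state, [root]
        # Selection: follow the highest PUCT score down to a leaf.
        while node.children:
            parent = node
            action, node = max(
                parent.children.items(),
                key=lambda item: puct_score(parent, item[1]),
            )
            state = step(state, action)
            path.append(node)
        # Expansion and evaluation: one call returns priors and a value.
        priors, value = evaluate(state)
        for action in legal_actions(state):
            node.children[action] = Node(priors.get(action, 0.0))
        # Backpropagation: update every node on the visited path.
        for visited in path:
            visited.visits += 1
            visited.value_sum += value
    return {action: child.visits for action, child in root.children.items()}


# Toy problem (hypothetical): place up to two towers in slots 0..2;
# the outcome is 1.0 only when the first tower goes into slot 2.
def toy_legal(state):
    return [] if len(state) >= 2 else [0, 1, 2]


def toy_step(state, action):
    return state + (action,)


def toy_evaluate(state):
    value = 1.0 if state[:1] == (2,) else 0.0
    return {a: 1 / 3 for a in (0, 1, 2)}, value


visit_counts = run_mcts((), toy_legal, toy_step, toy_evaluate)
best_action = max(visit_counts, key=visit_counts.get)
```

In this sketch a terminal state simply has no legal actions, so it is never expanded and is re-evaluated each time the search reaches it.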

Data files:

mob_path_waypoints_v2.csv
Approximate creep/mob path waypoints for the offline grid simulation.
grid with lane and slot.csv
Grid layout specification:
- lane paths
- valid tower slots
- blocked cells
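
To make the grid layout idea concrete, the sketch below parses a small CSV grid into lane cells, tower slots, and blocked cells. The cell codes ("L", "S", "X") and the parse_grid helper are assumptions for illustration only; the actual encoding used in grid with lane and slot.csv may differ.

```python
import csv
import io
from typing import Set, Tuple

Cell = Tuple[int, int]  # (row, col)

# Hypothetical sample: one horizontal lane with a tower slot above and below.
SAMPLE_GRID = """X,S,X
L,L,L
X,S,X
"""


def parse_grid(text: str) -> Tuple[Set[Cell], Set[Cell], Set[Cell]]:
    """Split a CSV grid into (lanes, slots, blocked) cell sets."""
    lanes: Set[Cell] = set()
    slots: Set[Cell] = set()
    blocked: Set[Cell] = set()
    for r, row in enumerate(csv.reader(io.StringIO(text))):
        for c, raw in enumerate(row):
            code = raw.strip().upper()
            if code == "L":      # assumed code for a lane cell
                lanes.add((r, c))
            elif code == "S":    # assumed code for a valid tower slot
                slots.add((r, c))
            elif code == "X":    # assumed code for a blocked cell
                blocked.add((r, c))
    return lanes, slots, blocked


lanes, slots, blocked = parse_grid(SAMPLE_GRID)
```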

Installation & Usage

1. Clone the repository

git clone https://github.com/7riangle/sc2-rtd-alphazero.git
cd sc2-rtd-alphazero

2. Install dependencies

Requires Python 3.8+.

If the repository includes a requirements.txt:

pip install -r requirements.txt

Or install core libraries manually:

pip install numpy pandas gymnasium torch

Optionally, for plotting and progress bars:

pip install matplotlib tqdm

3. Data / File Placement

Currently, the code expects the CSV files to be in the repository root directory:

mob_path_waypoints_v2.csv
grid with lane and slot.csv

If you move them into a data/ folder, you will need to update the paths in alpha_common.py (and any other script that loads these files).
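
One possible way to make the loading code tolerant of both layouts is to search a short list of candidate directories. The find_data_file helper below is a hypothetical sketch, not an existing function in alpha_common.py.

```python
from pathlib import Path
from typing import Iterable


def find_data_file(
    name: str,
    base_dir: Path,
    subdirs: Iterable[str] = ("", "data"),
) -> Path:
    """Return the first existing copy of `name` under base_dir.

    The empty string stands for base_dir itself, so the repository root
    is checked before the data/ folder.
    """
    for sub in subdirs:
        candidate = base_dir / sub / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name!r} not found under {base_dir}")


# Possible usage inside a script (assumes the script sits in the repo root):
# repo_root = Path(__file__).resolve().parent
# waypoints_path = find_data_file("mob_path_waypoints_v2.csv", repo_root)
```

Resolving paths relative to the script location rather than the current working directory also lets the scripts be launched from any folder.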

4. Running the Prototype

4.1 Training (self-play)

python alpha_train.py

This starts an AlphaZero-style self-play loop in the offline grid environment:

- the agent generates games using MCTS + the current network,
- game data is stored,
- and the network is updated from these examples.
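
The data-collection part of this loop can be sketched in a few lines. In the usual AlphaZero-style recipe, the MCTS visit counts at each step become the policy target and the final episode outcome becomes the value target. The function names and the tuple layout below are assumptions for illustration; alpha_train.py may store its histories differently.

```python
from typing import Any, Dict, List, Tuple

Policy = Dict[int, float]


def visits_to_policy(visits: Dict[int, int], temperature: float = 1.0) -> Policy:
    """Turn root visit counts into a probability distribution over actions."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    weights = {a: n ** (1.0 / temperature) for a, n in visits.items()}
    total = sum(weights.values())
    if total == 0:
        # No visits recorded: fall back to a uniform distribution.
        return {a: 1.0 / len(visits) for a in visits}
    return {a: w / total for a, w in weights.items()}


def label_episode(
    steps: List[Tuple[Any, Policy]], outcome: float
) -> List[Tuple[Any, Policy, float]]:
    """Attach the final outcome to every (state, policy) pair of one episode."""
    return [(state, policy, outcome) for state, policy in steps]


# Example: one recorded step where action 1 received 3 of 4 visits.
example_policy = visits_to_policy({0: 1, 1: 3})
training_examples = label_episode([("s0", example_policy)], outcome=1.0)
```

A lower temperature sharpens the policy target toward the most-visited action, while a temperature of 1.0 keeps it proportional to the raw visit counts.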

4.2 Running a scenario

python alpha_rtd.py

This runs the agent in a configured tower-defense scenario, using the current model parameters and environment settings.

Current Limitations & To-Do

- MCTS behaviour in terminal / edge states needs further debugging.
- Environment code should be refactored to support flexible relative paths (e.g. data/ directory).
- Reward signals are simple and may need redesign to avoid degenerate behaviour.
- Code structure could be simplified (some functions are longer than necessary).
- No formal evaluation or baselines are implemented yet.

Scope & Intent

To be explicit:

- This project does not connect to the live StarCraft II client.
- It does not control units or play online games.
- It is a standalone Python simulation that borrows the basic idea of the “New Random Tower Defense” arcade map and applies an AlphaZero-style search to a simplified grid version of that idea.
- The code is largely AI-generated, based on my high-level descriptions of the game logic and reward structure.

If you are interested in this kind of experiment, feel free to fork, modify, or strip down the code for your own tower-defense or grid-based toy environments.
