AlphaTriangle is a project implementing an artificial intelligence agent based on AlphaZero principles to learn and play a custom puzzle game involving placing triangular shapes onto a grid. The agent learns through headless self-play reinforcement learning, guided by Monte Carlo Tree Search (MCTS) and a deep neural network (PyTorch).
Key Features:
- Core Game Logic: Uses the `trianglengin>=2.0.7` library for the triangle puzzle game rules and state management, featuring a high-performance C++ core.
- High-Performance MCTS: Integrates the `trimcts>=1.2.1` library, providing a C++ implementation of MCTS for efficient search, callable from Python. MCTS parameters are configurable via `alphatriangle/config/mcts_config.py`.
- Deep Learning Model: Features a PyTorch neural network with policy and distributional value heads, convolutional layers, and optional Transformer Encoder layers.
- Parallel Self-Play: Leverages Ray for distributed self-play data generation across multiple CPU cores. The number of workers automatically adjusts based on detected CPU cores (reserving some for stability), capped by the `NUM_SELF_PLAY_WORKERS` setting in `TrainConfig`.
- Asynchronous Stats & Persistence (NEW): Uses the `trieye>=0.1.2` library, which provides a dedicated Ray actor (`TrieyeActor`) for:
  - Asynchronous collection of raw metric events from workers and the training loop.
  - Configurable processing and aggregation of metrics.
  - Logging processed metrics to MLflow and TensorBoard.
  - Saving/loading training state (checkpoints, buffers) to the filesystem and logging artifacts to MLflow.
  - Handling auto-resumption from previous runs.
  - All persistent data managed by `trieye` is stored within the `.trieye_data/<app_name>` directory (default: `.trieye_data/alphatriangle`).
- Headless Training: Focuses on a command-line interface for running the training pipeline without visual output.
- Enhanced CLI: Uses the `rich` library for improved visual feedback (colors, panels, emojis) in the terminal.
- Centralized Logging: Uses Python's standard `logging` module configured centrally for consistent log formatting (including a `▲` prefix) and level control across the project. Run logs are saved to `.trieye_data/<app_name>/runs/<run_name>/logs/`.
- Optional Profiling: Supports profiling worker 0 using `cProfile` via a command-line flag. Profile data is saved to `.trieye_data/<app_name>/runs/<run_name>/profile_data/`.
- Unit Tests: Includes tests for RL components.
This project trains an agent to play the game defined by the trianglengin library. The rules are detailed in the trianglengin README.
- Python 3.10+
- trianglengin>=2.0.7: Core game engine (state, actions, rules) with C++ optimizations.
- trimcts>=1.2.1: High-performance C++ MCTS implementation with Python bindings.
- trieye>=0.1.2: Asynchronous statistics collection, processing, logging (MLflow/TensorBoard), and data persistence via a Ray actor.
- PyTorch: For the deep learning model (CNNs, optional Transformers, Distributional Value Head) and training, with CUDA/MPS support.
- NumPy: For numerical operations, especially state representation (used by `trianglengin` and features).
- Ray: For parallelizing self-play data generation and hosting the `TrieyeActor`. Dynamically scales worker count based on available cores.
- Numba: (Optional, used in `features.grid_features`) For performance optimization of specific grid calculations.
- Cloudpickle: For serializing the experience replay buffer and training checkpoints (used by `trieye`).
- MLflow: For logging parameters, metrics, and artifacts (checkpoints, buffers) during training runs. Provides the primary web UI dashboard for experiment management. Data stored in `.trieye_data/<app_name>/mlruns/`. Managed by `trieye`.
- TensorBoard: For visualizing metrics during training (e.g., detailed loss curves). Provides a secondary web UI dashboard. Data stored in `.trieye_data/<app_name>/runs/<run_name>/tensorboard/`. Managed by `trieye`.
- Pydantic: For configuration management and data validation (used by `alphatriangle` and `trieye`).
- Typer: For the command-line interface.
- Rich: For enhanced CLI output formatting and styling.
- Pytest: For running unit tests.
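
The PyTorch entry above mentions a distributional value head. As a rough illustration of the idea, the sketch below collapses a vector of logits over evenly spaced value "atoms" into a single expected value. It uses plain Python rather than PyTorch so it runs without dependencies; the function name and the `v_min`/`v_max` range are assumptions and may not match the project's actual model code.

```python
import math


def expected_value(logits: list[float], v_min: float, v_max: float) -> float:
    """Collapse value-head logits over evenly spaced atoms into a scalar."""
    n = len(logits)
    if n < 2:
        raise ValueError("need at least two atoms")
    # Atoms are the fixed support points the distribution is defined over.
    step = (v_max - v_min) / (n - 1)
    atoms = [v_min + i * step for i in range(n)]
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    # Expected value is the probability-weighted sum of the atoms.
    return sum(a * w / total for a, w in zip(atoms, weights))


if __name__ == "__main__":
    print(expected_value([0.0, 0.0, 0.0], v_min=-1.0, v_max=1.0))
```

Uniform logits over a symmetric range give an expected value of zero; a strongly peaked logit pulls the estimate toward that atom.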
.
├── .github/workflows/ # GitHub Actions CI/CD
│   └── ci_cd.yml
├── .trieye_data/ # Root directory for ALL persistent data (GITIGNORED) - Managed by Trieye
│   └── alphatriangle/ # Default app_name
│       ├── mlruns/ # MLflow internal tracking data & artifact store (for MLflow UI)
│       │   └── <experiment_id>/
│       │       └── <mlflow_run_id>/
│       │           ├── artifacts/ # MLflow's copy of logged artifacts (checkpoints, buffers, etc.)
│       │           ├── metrics/
│       │           ├── params/
│       │           └── tags/
│       └── runs/ # Local artifacts per run (source for TensorBoard UI & resume)
│           └── <run_name>/ # e.g., train_YYYYMMDD_HHMMSS
│               ├── checkpoints/ # Saved model weights & optimizer states (*.pkl)
│               ├── buffers/ # Saved experience replay buffers (*.pkl)
│               ├── logs/ # Plain text log files for the run (*.log) - App + Trieye logs
│               ├── tensorboard/ # TensorBoard log files (event files)
│               ├── profile_data/ # cProfile output files (*.prof) if profiling enabled
│               └── configs.json # Copy of run configuration
├── alphatriangle/ # Source code for the AlphaZero agent package
│   ├── __init__.py
│   ├── cli.py # CLI logic (train, ml, tb, ray commands - headless only, uses Rich)
│   ├── config/ # Pydantic configuration models (Model, Train, MCTS) - Stats/Persistence now in Trieye
│   │   └── README.md
│   ├── features/ # Feature extraction logic (operates on trianglengin.GameState)
│   │   └── README.md
│   ├── logging_config.py # Centralized logging setup function (for console)
│   ├── nn/ # Neural network definition and wrapper (implements trimcts.AlphaZeroNetworkInterface)
│   │   └── README.md
│   ├── rl/ # RL components (Trainer, Buffer, Worker using trimcts)
│   │   └── README.md
│   ├── training/ # Training orchestration (Loop, Setup, Runner, WorkerManager) - Interacts with TrieyeActor
│   │   └── README.md
│   └── utils/ # Shared utilities and types (specific to AlphaTriangle)
│       └── README.md
├── tests/ # Unit tests (for alphatriangle components, excluding MCTS, Stats, Data)
│   ├── conftest.py
│   ├── nn/
│   ├── rl/
│   └── training/
├── .gitignore
├── .python-version
├── LICENSE # License file (MIT)
├── MANIFEST.in # Specifies files for source distribution
├── pyproject.toml # Build system & package configuration (depends on trianglengin, trimcts, trieye, rich)
├── README.md # This file
└── requirements.txt # List of dependencies (includes trianglengin, trimcts, trieye, rich)

- `cli`: Defines the command-line interface using Typer (`train`, `ml`, `tb`, `ray` commands; headless). Uses `rich` for styling. (alphatriangle/cli.py)
- `config`: Centralized Pydantic configuration classes (Model, Train, MCTS). Imports `EnvConfig` from `trianglengin`. Uses `TrieyeConfig` from `trieye` for stats/persistence. (alphatriangle/config/README.md)
- `features`: Contains logic to convert `trianglengin.GameState` objects into numerical features (`StateType`). (alphatriangle/features/README.md)
- `logging_config`: Defines the `setup_logging` function for centralized console logger configuration. (alphatriangle/logging_config.py)
- `nn`: Contains the PyTorch `nn.Module` definition (`AlphaTriangleNet`) and a wrapper class (`NeuralNetwork`). The `NeuralNetwork` class implicitly conforms to the `trimcts.AlphaZeroNetworkInterface` protocol. (alphatriangle/nn/README.md)
- `rl`: Contains RL components: `Trainer` (network updates), `ExperienceBuffer` (data storage, supports PER), and `SelfPlayWorker` (Ray actor for parallel self-play using `trimcts.run_mcts`). Workers now send `RawMetricEvent`s to the `TrieyeActor`. (alphatriangle/rl/README.md)
- `training`: Orchestrates the headless training process using `TrainingLoop`, managing workers, data flow, and interaction with the `TrieyeActor` for logging, checkpointing, and state loading. Includes `runner.py` for the callable training function. (alphatriangle/training/README.md)
- `utils`: Provides common helper functions and shared type definitions specific to the AlphaZero implementation. (alphatriangle/utils/README.md)
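
The `nn` entry says the wrapper "implicitly conforms" to a protocol. The sketch below shows what that kind of structural typing looks like in Python: a class satisfies a `typing.Protocol` by having the right methods, without inheriting from it. The protocol name `PolicyValueEvaluator`, the method `evaluate_state`, and its return shape are hypothetical stand-ins, not the actual `trimcts.AlphaZeroNetworkInterface` definition.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PolicyValueEvaluator(Protocol):
    """Hypothetical stand-in for the interface an MCTS library expects."""

    def evaluate_state(self, state: object) -> tuple[dict[int, float], float]:
        ...


class UniformNetwork:
    """Does not inherit from the protocol; it only matches its shape."""

    def __init__(self, num_actions: int) -> None:
        self.num_actions = num_actions

    def evaluate_state(self, state: object) -> tuple[dict[int, float], float]:
        # Return a uniform policy over actions and a neutral value estimate.
        prior = 1.0 / self.num_actions
        return {a: prior for a in range(self.num_actions)}, 0.0


def root_prior(net: PolicyValueEvaluator, state: object) -> dict[int, float]:
    """Accept any object that structurally matches the protocol."""
    policy, _value = net.evaluate_state(state)
    return policy


if __name__ == "__main__":
    net = UniformNetwork(num_actions=4)
    print(isinstance(net, PolicyValueEvaluator), root_prior(net, state=None))
```

The benefit of this design is that the network package does not need to import a base class from the search library; matching method names and signatures is enough for a type checker.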
- Clone the repository (for development): `git clone https://github.com/lguibr/alphatriangle.git`, then `cd alphatriangle`.
- Create a virtual environment (recommended): `python -m venv venv`, then `source venv/bin/activate` (on Windows use `venv\Scripts\activate`).
- Install the package (including `trianglengin`, `trimcts`, `trieye`, and `rich`):
  - For users: `pip install alphatriangle`. This automatically installs `trianglengin`, `trimcts`, `trieye`, and `rich` from PyPI if available. Alternatively, install directly from Git (dependencies still come from PyPI): `pip install git+https://github.com/lguibr/alphatriangle.git`.
  - For developers (editable install):
    - First, ensure `trianglengin`, `trimcts`, and `trieye` are installed (ideally in editable mode from their own directories if developing all): run `pip install -e .` from the `trianglengin` directory (requires C++ build tools), from the `trimcts` directory (requires C++ build tools), and from the `trieye` directory.
    - Then, install `alphatriangle` in editable mode: from the `alphatriangle` directory, run `pip install -e .`. Optionally, run `pip install -e .[dev]` to install the dev dependencies from `pyproject.toml` (for running tests/linting).
  - `trianglengin` and `trimcts` require a C++ compiler (like GCC, Clang, or MSVC) and CMake.
- (Optional but Recommended) Add data directory to `.gitignore`: Ensure the `.gitignore` file in your project root contains the line `.trieye_data/`.
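
After installing, it can be handy to confirm that the three companion libraries meet the minimum versions listed earlier. The sketch below does this with the standard library's `importlib.metadata`. The helper names are made up for illustration, and the version comparison is deliberately simplified (it stops at the first non-numeric component, so pre-release suffixes are not handled precisely).

```python
from importlib import metadata

# Minimum versions as stated in this README.
MINIMUMS = {"trianglengin": "2.0.7", "trimcts": "1.2.1", "trieye": "0.1.2"}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.0.7' into (2, 0, 7); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_minimums(minimums: dict[str, str]) -> dict[str, str]:
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= parse_version(minimum)
        report[name] = "ok" if ok else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for name, status in check_minimums(MINIMUMS).items():
        print(f"{name}: {status}")
```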
Use the alphatriangle command for training and monitoring. The CLI uses rich for enhanced output.
- Show Help: `alphatriangle --help`
- Run Training (Headless Only): `alphatriangle train [--seed 42] [--log-level INFO] [--profile] [--run-name my_custom_run]`
  - `--log-level`: Set console logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL). Default: INFO. Logs are saved to `.trieye_data/alphatriangle/runs/<run_name>/logs/`.
  - `--seed`: Set the random seed for reproducibility. Default: 42.
  - `--profile`: Enable cProfile for worker 0. Generates `.prof` files in `.trieye_data/alphatriangle/runs/<run_name>/profile_data/`.
  - `--run-name`: Specify a custom name for the run. Default: `train_YYYYMMDD_HHMMSS`.
- Launch MLflow UI: `alphatriangle ml [--host 127.0.0.1] [--port 5000]`. Launches the MLflow web interface, automatically pointing to the `.trieye_data/alphatriangle/mlruns` directory. Access via `http://localhost:5000` (or the specified host/port).
- Launch TensorBoard UI: `alphatriangle tb [--host 127.0.0.1] [--port 6006]`. Launches the TensorBoard web interface, automatically pointing to the `.trieye_data/alphatriangle/runs` directory (which contains the individual run subdirectories with `tensorboard` logs). Access via `http://localhost:6006` (or the specified host/port).
- Launch Ray Dashboard UI: `alphatriangle ray [--host 127.0.0.1] [--port 8265]`. Launches the Ray Dashboard web interface. Note: This typically requires a Ray cluster to be running (e.g., started by `alphatriangle train` or manually). Access via `http://localhost:8265` (or the specified host/port).
- Interactive Play/Debug (use the `trianglengin` CLI): Interactive modes are part of the `trianglengin` library, not this `alphatriangle` package. With `trianglengin` installed, run `trianglengin play [--seed 42] [--log-level INFO]` or `trianglengin debug [--seed 42] [--log-level DEBUG]`.
- Running Unit Tests (Development): `pytest tests/`
- Analyzing Profile Data (if `--profile` was used): Use the provided `analyze_profiles.py` script (requires `typer`): `python analyze_profiles.py .trieye_data/alphatriangle/runs/<run_name>/profile_data/worker_0_ep_<ep_seed>.prof [-n <num_lines>]`
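
For readers who want to inspect a `.prof` file without the helper script, the standard library's `pstats` module can read cProfile output directly. The sketch below is not the contents of `analyze_profiles.py` (which are not shown here); `top_functions` is a hypothetical helper, and the demo profiles a throwaway workload so the snippet runs on its own.

```python
import cProfile
import pstats
import tempfile
from pathlib import Path


def top_functions(prof_path: str, num_lines: int = 20) -> pstats.Stats:
    """Print the most expensive calls by cumulative time and return the stats."""
    stats = pstats.Stats(prof_path)
    # strip_dirs() shortens file paths; "cumulative" sorts by time incl. callees.
    stats.strip_dirs().sort_stats("cumulative").print_stats(num_lines)
    return stats


if __name__ == "__main__":
    # Self-contained demo: profile a small workload, then inspect the dump.
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = str(Path(tmp) / "demo.prof")
        profiler = cProfile.Profile()
        profiler.enable()
        sum(i * i for i in range(100_000))
        profiler.disable()
        profiler.dump_stats(demo_path)
        top_functions(demo_path, num_lines=5)
```

To use it on a real run, pass the path of a file under the run's `profile_data/` directory instead of the demo file.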
- AlphaTriangle Specific: Parameters for the Model (`ModelConfig`), Training (`TrainConfig`), and MCTS (`AlphaTriangleMCTSConfig`) are defined in the Pydantic classes within the `alphatriangle/config/` directory.
- Environment: Environment configuration (`EnvConfig`) is defined within the `trianglengin` library.
- Stats & Persistence: Statistics logging and data persistence are configured via `TrieyeConfig` (which includes `StatsConfig` and `PersistenceConfig`) from the `trieye` library. These are typically instantiated in `alphatriangle/training/runner.py` or `alphatriangle/cli.py`.
All persistent data is now managed by the trieye library and stored within the .trieye_data/<app_name>/ directory (default: .trieye_data/alphatriangle/) in the project root. This directory should be added to your .gitignore.
- `.trieye_data/<app_name>/mlruns/`: Managed by MLflow (via `trieye`). Contains MLflow's internal tracking data (parameters, metrics) and its own copy of logged artifacts. This is the source for the MLflow UI (`alphatriangle ml`).
- `.trieye_data/<app_name>/runs/`: Managed by `trieye`. Contains locally saved artifacts for each run (checkpoints, buffers, TensorBoard logs, configs, profile data) before/during logging to MLflow. This directory is used for auto-resuming and is the source for the TensorBoard UI (`alphatriangle tb`).
- Replay Buffer Content: The saved buffer file (`buffer.pkl`) contains `Experience` tuples: `(StateType, PolicyTargetMapping, n_step_return)`. The `StateType` includes:
  - `grid`: Numerical features representing grid occupancy.
  - `other_features`: Numerical features derived from game state and available shapes.
- Visualization: This stored data allows offline analysis and visualization of grid occupancy and the available shapes for each recorded step. It does not contain the full sequence of actions or raw `GameState` objects needed for a complete, interactive game replay.
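
When writing analysis scripts against this layout, it helps to build the per-run paths in one place. The sketch below derives them with `pathlib` from the directory structure described in this README. The function `run_paths` is a hypothetical helper, not part of `trieye` or `alphatriangle`.

```python
from pathlib import Path


def run_paths(
    run_name: str,
    app_name: str = "alphatriangle",
    root: Path = Path(".trieye_data"),
) -> dict[str, Path]:
    """Return the expected locations of a run's artifacts (nothing is created)."""
    base = root / app_name
    run_dir = base / "runs" / run_name
    return {
        "mlruns": base / "mlruns",  # shared across runs, read by the MLflow UI
        "run": run_dir,
        "checkpoints": run_dir / "checkpoints",
        "buffers": run_dir / "buffers",
        "logs": run_dir / "logs",
        "tensorboard": run_dir / "tensorboard",
        "profile_data": run_dir / "profile_data",
        "configs": run_dir / "configs.json",
    }


if __name__ == "__main__":
    for label, path in run_paths("train_YYYYMMDD_HHMMSS").items():
        print(f"{label}: {path}")
```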
This project includes README files within each major alphatriangle submodule. Please keep these READMEs updated when making changes to the code's structure, interfaces, or core logic, especially regarding the interaction with the trieye library.
