diff --git a/.github/TESTING_RELEASES.md b/.github/TESTING_RELEASES.md new file mode 100644 index 000000000..9d0e77a62 --- /dev/null +++ b/.github/TESTING_RELEASES.md @@ -0,0 +1,160 @@ +# Testing the Release Workflow + +This guide explains how to test the automated crate publishing workflow before actually publishing to crates.io. + +## Quick Start + +### Option 1: Manual Dry-Run (Recommended) + +The safest way to test the release workflow: + +1. **Go to GitHub Actions** + - Navigate to: `https://github.com/microsoft/DiskANN/actions/workflows/publish.yml` + - Click the "Run workflow" dropdown button + +2. **Configure the test run** + - **Branch**: Select your branch (e.g., `main` or a release branch) + - **Run in dry-run mode**: Keep as `true` (default) + - Click the green "Run workflow" button + +3. **Monitor the test** + - Watch the workflow execution in real-time + - All steps will run, but nothing will be published + - Look for: "✅ Dry-run completed successfully!" + +### Option 2: Local Testing with cargo + +Test individual crates locally: + +```bash +# Test a single crate +cargo publish --dry-run --package diskann-wide + +# Test all crates in order +cargo publish --dry-run --package diskann-wide +cargo publish --dry-run --package diskann-vector +cargo publish --dry-run --package diskann-platform +# ... etc +``` + +## What Gets Tested in Dry-Run Mode + +✅ **Tested:** +- Rust installation and caching +- Full test suite execution +- Crate metadata validation +- Packaging verification +- Dependency resolution +- Publish order correctness + +❌ **NOT Tested:** +- Actual publishing to crates.io +- Crate availability timing +- Registry token authentication +- Network upload reliability + +## When to Run a Dry-Run Test + +**Always test before:** +- Your first release +- Modifying the publish workflow +- Major version bumps +- Significant dependency changes + +**Optional but recommended:** +- Minor/patch releases +- After long periods between releases + +## Interpreting Results + +### Success ✅ + +You'll see: +``` +======================================== +✅ Dry-run completed successfully! +All crates passed validation. +======================================== +``` + +**Next step**: You're ready to create a release tag and publish for real. + +### Failure ❌ + +Common issues and solutions: + +1. **Test failures** + - Fix failing tests before releasing + - Check the test output in the workflow logs + +2. **Invalid metadata** + - Review `Cargo.toml` files for issues + - Ensure all required fields are present + +3. **Dependency issues** + - Check for circular dependencies + - Verify version constraints are correct + +4. **Packaging errors** + - Look for missing files or incorrect paths + - Check `.gitignore` doesn't exclude required files + +## Comparing Dry-Run vs Live Release + +| Aspect | Dry-Run | Live Release | +|--------|---------|--------------| +| Trigger | Manual dispatch | Version tag push | +| Tests | ✓ Full suite | ✓ Full suite | +| Validation | ✓ All checks | ✓ All checks | +| Publishes | ✗ Simulated | ✓ Actual | +| Waits for availability | ✗ Skipped | ✓ 5min timeout | +| Version tag required | ✗ No | ✓ Yes | +| CRATES_IO_TOKEN used | ✗ No* | ✓ Yes | +| Creates release notes | ✗ No | ✓ Yes | + +\* Token not required but harmless if present + +## Troubleshooting + +### "Workflow not found" + +Make sure your branch has the updated workflow file. The dry-run feature was added in the latest version. + +### "DRY_RUN variable not set" + +This is expected for tag-triggered releases. The workflow automatically sets `DRY_RUN=false` for real releases. + +### "Tests pass locally but fail in workflow" + +- Check rust version matches: `1.92` (defined in workflow) +- Ensure all dependencies are in `Cargo.lock` +- Review LFS files if using Git LFS + +## Example: Complete Pre-Release Test + +```bash +# 1. Update version locally +vim Cargo.toml # Change version to 0.46.0 + +# 2. Commit to a branch (don't tag yet!) +git checkout -b release-0.46.0 +git commit -am "Bump version to 0.46.0" +git push origin release-0.46.0 + +# 3. Run dry-run test via GitHub UI +# - Go to Actions → Publish to crates.io +# - Run workflow on release-0.46.0 branch +# - Keep dry_run=true + +# 4. If successful, merge to main and tag +git checkout main +git merge release-0.46.0 +git tag v0.46.0 +git push origin main --tags # This triggers real publish +``` + +## Need Help? + +- Check the [RELEASING.md](../../RELEASING.md) guide +- Review workflow logs in GitHub Actions +- Open an issue if you find a bug in the workflow diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml new file mode 100644 index 000000000..7526de0f1 --- /dev/null +++ b/.github/workflows/publish.yml @@ -0,0 +1,258 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +# Workflow for publishing crates to crates.io +# +# This workflow is triggered when a version tag (e.g., v0.46.0) is pushed. +# It will automatically publish all workspace crates to crates.io in the correct dependency order. +# +# Prerequisites: +# 1. CRATES_IO_TOKEN secret must be set in repository settings +# 2. Version numbers must be updated in Cargo.toml before tagging +# 3. Tag format: v{major}.{minor}.{patch} (e.g., v0.46.0) + +name: Publish to crates.io + +on: + push: + tags: + - 'v[0-9]+.[0-9]+.[0-9]+' + workflow_dispatch: + inputs: + dry_run: + description: 'Run in dry-run mode (test without actually publishing)' + required: false + default: 'true' + type: choice + options: + - 'true' + - 'false' + +env: + RUST_BACKTRACE: 1 + rust_stable: "1.92" + +defaults: + run: + shell: bash + +permissions: + contents: read + +jobs: + publish: + name: ${{ github.event.inputs.dry_run == 'true' && 'Dry-run publish test' || 'Publish crates to crates.io' }} + runs-on: ubuntu-latest + + steps: + - name: Checkout repository + uses: actions/checkout@v4 + with: + lfs: true + + - name: Install Rust ${{ env.rust_stable }} + uses: dtolnay/rust-toolchain@stable + with: + toolchain: ${{ env.rust_stable }} + + - uses: Swatinem/rust-cache@v2 + + - name: Verify version matches tag + if: github.event_name == 'push' + run: | + TAG_VERSION="${GITHUB_REF#refs/tags/v}" + CARGO_VERSION=$(grep '^version = ' Cargo.toml | head -n1 | sed 's/.*"\(.*\)".*/\1/') + echo "Tag version: $TAG_VERSION" + echo "Cargo version: $CARGO_VERSION" + if [ "$TAG_VERSION" != "$CARGO_VERSION" ]; then + echo "Error: Tag version ($TAG_VERSION) does not match Cargo.toml version ($CARGO_VERSION)" + exit 1 + fi + + - name: Run tests + run: | + cargo test --locked --workspace --profile ci + + - name: Publish crates in dependency order + env: + CARGO_REGISTRY_TOKEN: ${{ secrets.CRATES_IO_TOKEN }} + DRY_RUN: ${{ github.event.inputs.dry_run || 'false' }} + run: | + set -euo pipefail + + # Determine dry-run flag + DRY_RUN_FLAG="" + if [ "$DRY_RUN" = "true" ]; then + DRY_RUN_FLAG="--dry-run" + echo "==========================================" + echo "🧪 DRY-RUN MODE: No actual publishing" + echo "==========================================" + else + echo "==========================================" + echo "📦 LIVE MODE: Publishing to crates.io" + echo "==========================================" + fi + + # Function to publish a crate with retries + publish_crate() { + local crate=$1 + local max_attempts=3 + local attempt=1 + + if [ "$DRY_RUN" = "true" ]; then + echo "Publishing $crate (dry-run)..." + else + echo "Publishing $crate..." + fi + + while [ $attempt -le $max_attempts ]; do + if cargo publish --locked --package "$crate" $DRY_RUN_FLAG; then + if [ "$DRY_RUN" = "true" ]; then + echo "✓ Dry-run successful for $crate" + else + echo "✓ Successfully published $crate" + fi + return 0 + else + echo "✗ Attempt $attempt failed for $crate" + if [ $attempt -lt $max_attempts ]; then + echo "Retrying in 10 seconds..." + sleep 10 + fi + attempt=$((attempt + 1)) + fi + done + + echo "Failed to publish $crate after $max_attempts attempts" + return 1 + } + + # Wait for crate to be available on crates.io (skip in dry-run) + wait_for_crate() { + local crate=$1 + local version=$2 + + # Skip waiting in dry-run mode + if [ "$DRY_RUN" = "true" ]; then + echo "⏭️ Skipping availability check in dry-run mode" + return 0 + fi + + local max_wait=300 # 5 minutes + local elapsed=0 + + echo "Waiting for $crate@$version to be available on crates.io..." + + while [ $elapsed -lt $max_wait ]; do + if cargo search "$crate" --limit 1 | grep -q "$crate = \"$version\""; then + echo "✓ $crate@$version is now available" + return 0 + fi + sleep 10 + elapsed=$((elapsed + 10)) + done + + echo "Warning: $crate@$version not found on crates.io after ${max_wait}s" + return 1 + } + + VERSION=$(grep '^version = ' Cargo.toml | head -n1 | sed 's/.*"\(.*\)".*/\1/') + + # Publish in dependency order (base -> algorithm -> providers -> infrastructure) + + # Layer 1: Base libraries (no internal dependencies) + echo "=== Publishing Layer 1: Base libraries ===" + publish_crate "diskann-wide" + wait_for_crate "diskann-wide" "$VERSION" + + # Layer 2: Libraries depending only on diskann-wide + echo "=== Publishing Layer 2: Vector and platform libraries ===" + publish_crate "diskann-vector" + wait_for_crate "diskann-vector" "$VERSION" + + publish_crate "diskann-platform" + wait_for_crate "diskann-platform" "$VERSION" + + # Layer 3: Libraries depending on vector/wide + echo "=== Publishing Layer 3: Linalg and utils ===" + publish_crate "diskann-linalg" + wait_for_crate "diskann-linalg" "$VERSION" + + publish_crate "diskann-utils" + wait_for_crate "diskann-utils" "$VERSION" + + # Layer 4: Quantization (depends on utils, vector, linalg) + echo "=== Publishing Layer 4: Quantization ===" + publish_crate "diskann-quantization" + wait_for_crate "diskann-quantization" "$VERSION" + + # Layer 5: Core algorithm (depends on quantization, utils, vector, wide) + echo "=== Publishing Layer 5: Core algorithm ===" + publish_crate "diskann" + wait_for_crate "diskann" "$VERSION" + + # Layer 6: Providers (depends on diskann and others) + echo "=== Publishing Layer 6: Providers ===" + publish_crate "diskann-providers" + wait_for_crate "diskann-providers" "$VERSION" + + publish_crate "diskann-disk" + wait_for_crate "diskann-disk" "$VERSION" + + publish_crate "diskann-label-filter" + wait_for_crate "diskann-label-filter" "$VERSION" + + # Layer 7: Infrastructure (benchmarks and tools) + echo "=== Publishing Layer 7: Infrastructure - Benchmark Runner ===" + publish_crate "diskann-benchmark-runner" + wait_for_crate "diskann-benchmark-runner" "$VERSION" + + echo "=== Publishing Layer 8: Infrastructure - Benchmark SIMD and Core ===" + publish_crate "diskann-benchmark-simd" + wait_for_crate "diskann-benchmark-simd" "$VERSION" + + publish_crate "diskann-benchmark-core" + wait_for_crate "diskann-benchmark-core" "$VERSION" + + echo "=== Publishing Layer 9: Infrastructure - Tools ===" + publish_crate "diskann-tools" + wait_for_crate "diskann-tools" "$VERSION" + + echo "=== Publishing Layer 10: Infrastructure - Benchmark ===" + publish_crate "diskann-benchmark" + wait_for_crate "diskann-benchmark" "$VERSION" + + if [ "$DRY_RUN" = "true" ]; then + echo "==========================================" + echo "✅ Dry-run completed successfully!" + echo "All crates passed validation." + echo "==========================================" + else + echo "==========================================" + echo "✅ All crates published successfully!" + echo "==========================================" + fi + + - name: Create release notes + if: github.event_name == 'push' + run: | + TAG_VERSION="${GITHUB_REF#refs/tags/}" + echo "Release $TAG_VERSION completed successfully" > release-notes.txt + echo "" >> release-notes.txt + echo "Published crates:" >> release-notes.txt + echo "- diskann-wide" >> release-notes.txt + echo "- diskann-vector" >> release-notes.txt + echo "- diskann-linalg" >> release-notes.txt + echo "- diskann-utils" >> release-notes.txt + echo "- diskann-quantization" >> release-notes.txt + echo "- diskann-platform" >> release-notes.txt + echo "- diskann" >> release-notes.txt + echo "- diskann-providers" >> release-notes.txt + echo "- diskann-disk" >> release-notes.txt + echo "- diskann-label-filter" >> release-notes.txt + echo "- diskann-benchmark-core" >> release-notes.txt + echo "- diskann-benchmark-runner" >> release-notes.txt + echo "- diskann-benchmark-simd" >> release-notes.txt + echo "- diskann-benchmark" >> release-notes.txt + echo "- diskann-tools" >> release-notes.txt + cat release-notes.txt diff --git a/README.md b/README.md index 4dc168d2b..644f07cb8 100644 --- a/README.md +++ b/README.md @@ -22,6 +22,12 @@ contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additio See [guidelines](CONTRIBUTING.md) for contributing to this project. +## Releases + +The Rust crates are automatically published to [crates.io](https://crates.io) when a version tag is pushed. See [RELEASING.md](RELEASING.md) for detailed release process documentation. + +Latest version: ![Crates.io Version](https://img.shields.io/crates/v/diskann) + ## Legacy C++ Code Older C++ code is retained on the `cpp_main` branch, but is not actively developed or maintained. diff --git a/RELEASING.md b/RELEASING.md new file mode 100644 index 000000000..5de0f6426 --- /dev/null +++ b/RELEASING.md @@ -0,0 +1,274 @@ +# Release Process + +This document describes the automated release process for publishing DiskANN crates to [crates.io](https://crates.io). + +## Overview + +The DiskANN workspace consists of multiple crates that are published together with synchronized version numbers. The release process is automated through GitHub Actions and is triggered by pushing a version tag. + +## Testing the Release Process + +**Before publishing a real release**, you should test the workflow in dry-run mode to ensure everything works correctly. + +### Dry-Run Test + +The publish workflow can be manually triggered with a dry-run mode that validates everything without actually publishing to crates.io: + +1. **Navigate to GitHub Actions** + - Go to the repository on GitHub + - Click on the "Actions" tab + - Select "Publish to crates.io" from the left sidebar + +2. **Run Workflow Manually** + - Click "Run workflow" button on the right + - Select the branch (usually `main` or your release branch) + - Keep "Run in dry-run mode" set to **true** (default) + - Click "Run workflow" + +3. **Monitor the Dry-Run** + - Watch the workflow execution + - It will: + - ✓ Install Rust and dependencies + - ✓ Run the full test suite + - ✓ Validate all 15 crates with `cargo publish --dry-run` + - ✓ Check the dependency order + - ✗ NOT actually publish anything to crates.io + +4. **Verify Results** + - All steps should pass with green checkmarks + - Look for the message: "✅ Dry-run completed successfully!" + - Review any warnings or errors + +**Tip**: Run the dry-run test before creating your release tag to catch issues early. + +## Automated Publishing + +### Prerequisites + +1. **CRATES_IO_TOKEN Secret**: A crates.io API token must be configured as a GitHub repository secret named `CRATES_IO_TOKEN`. This token should have publish permissions for all DiskANN crates. + - To create a token: Visit [crates.io/settings/tokens](https://crates.io/settings/tokens) + - To add the secret: Go to repository Settings → Secrets and variables → Actions → New repository secret + +2. **Maintainer Access**: You must have write access to the repository and be an owner/maintainer of all the crates on crates.io. + +### Release Steps + +1. **Update Version Numbers** + + Update the version in the root `Cargo.toml` workspace configuration: + + ```toml + [workspace.package] + version = "0.46.0" # Update this version + ``` + + All workspace crates use `version.workspace = true`, so this single change updates all crates. + +2. **Update CHANGELOG** (if one exists) + + Document the changes, new features, bug fixes, and breaking changes in the release. + +3. **Test with Dry-Run** (Recommended) + + Before creating the tag, test the release process: + - Commit your version update to a branch + - Run the manual workflow in dry-run mode (see "Testing the Release Process" above) + - Verify all crates pass validation + +4. **Create and Push a Version Tag** + + ```bash + # Create a tag matching the version + git tag v0.46.0 + + # Push the tag to trigger the release workflow + git push origin v0.46.0 + ``` + + **Important**: The tag format must be `v{major}.{minor}.{patch}` (e.g., `v0.46.0`). + +5. **Monitor the Release Workflow** + + - Navigate to the Actions tab in the GitHub repository + - Find the "Publish to crates.io" workflow run + - Monitor the progress and check for any errors + + The workflow will: + - Verify the tag version matches `Cargo.toml` + - Run the test suite + - Publish crates in dependency order + - Wait for each crate to be available on crates.io before publishing dependents + +5. **Verify Publication** + + After the workflow completes, verify that all crates are published: + + ```bash + cargo search diskann --limit 20 + ``` + + Check that the new version appears for all crates. + +## Crate Dependency Order + +The crates are published in the following order to respect dependencies: + +1. **Layer 1 - Base**: `diskann-wide` +2. **Layer 2 - Vector and Platform**: `diskann-vector`, `diskann-platform` +3. **Layer 3 - Linalg and Utils**: `diskann-linalg`, `diskann-utils` +4. **Layer 4 - Quantization**: `diskann-quantization` +5. **Layer 5 - Core Algorithm**: `diskann` +6. **Layer 6 - Providers**: `diskann-providers`, `diskann-disk`, `diskann-label-filter` +7. **Layer 7 - Benchmark Runner**: `diskann-benchmark-runner` +8. **Layer 8 - Benchmark Support**: `diskann-benchmark-simd`, `diskann-benchmark-core` +9. **Layer 9 - Tools**: `diskann-tools` +10. **Layer 10 - Benchmark**: `diskann-benchmark` + +## Troubleshooting + +### Workflow Fails on Version Mismatch + +**Error**: "Tag version does not match Cargo.toml version" + +**Solution**: Ensure the tag version (without the 'v' prefix) exactly matches the version in `Cargo.toml`. + +### Publication Fails for a Crate + +**Error**: "failed to publish crate" + +**Possible Causes**: +- Network issues (retry automatically handled) +- Dependency not yet available on crates.io (wait time automatically handled) +- Authentication issues (check CRATES_IO_TOKEN secret) +- Version already published (you cannot republish the same version) + +**Solution**: +- Check the workflow logs for specific error messages +- Verify the CRATES_IO_TOKEN secret is valid and has the correct permissions +- If a crate failed midway, you may need to manually publish remaining crates + +### Manual Publishing + +If the automated workflow fails and you need to publish manually: + +```bash +# Set your crates.io token +export CARGO_REGISTRY_TOKEN="your-token-here" + +# Publish in dependency order +cargo publish --package diskann-wide +cargo publish --package diskann-vector +cargo publish --package diskann-platform +cargo publish --package diskann-linalg +cargo publish --package diskann-utils +cargo publish --package diskann-quantization +cargo publish --package diskann +cargo publish --package diskann-providers +cargo publish --package diskann-disk +cargo publish --package diskann-label-filter +cargo publish --package diskann-benchmark-runner +cargo publish --package diskann-benchmark-simd +cargo publish --package diskann-benchmark-core +cargo publish --package diskann-tools +cargo publish --package diskann-benchmark +``` + +**Note**: Wait 30-60 seconds between each publish to ensure dependencies are available on crates.io. + +## Pre-release Checklist + +Before creating a release tag: + +- [ ] All CI checks pass on the main branch +- [ ] Version number is updated in `Cargo.toml` +- [ ] CHANGELOG is updated (if applicable) +- [ ] Documentation is up to date +- [ ] Breaking changes are clearly documented +- [ ] All tests pass locally: `cargo test --workspace` +- [ ] Code builds without warnings: `cargo build --workspace --release` +- [ ] **Dry-run workflow test passes successfully** + +## Dry-Run vs Live Mode + +### Dry-Run Mode (Testing) + +- **Purpose**: Validate the release process without actually publishing +- **How to trigger**: Manual workflow dispatch with `dry_run: true` +- **What it does**: + - ✓ Runs all tests + - ✓ Validates crate metadata and dependencies + - ✓ Checks that all crates can be packaged + - ✓ Verifies publish order + - ✗ Does NOT publish to crates.io + - ✗ Does NOT wait for crate availability + - ✗ Does NOT require version tag +- **Use case**: Testing changes to the workflow, validating a new release before tagging + +### Live Mode (Production) + +- **Purpose**: Actually publish crates to crates.io +- **How to trigger**: Push a version tag (e.g., `v0.46.0`) +- **What it does**: + - ✓ Validates tag matches version + - ✓ Runs all tests + - ✓ Publishes all 15 crates to crates.io + - ✓ Waits for each crate to be available before publishing dependents + - ✓ Creates release notes +- **Use case**: Official releases + +**Recommendation**: Always run a dry-run test before pushing a release tag, especially if: +- You've modified the workflow +- It's your first time releasing +- The version number changed significantly +- Dependencies have been updated + +## Security Considerations + +- **Token Security**: The CRATES_IO_TOKEN should be kept secure and rotated periodically +- **Version Control**: Once a version is published to crates.io, it cannot be unpublished (only yanked) +- **Dependency Updates**: Ensure all dependencies are up to date and free of known vulnerabilities + +## Rollback + +If a release needs to be rolled back: + +1. **Yank the Version** (if critical bug or security issue): + + You'll need to yank all 15 crates that were published. Use the following commands: + + ```bash + # Replace 0.46.0 with the version to yank + VERSION="0.46.0" + + # Yank all crates in reverse dependency order + cargo yank --vers $VERSION diskann-benchmark + cargo yank --vers $VERSION diskann-tools + cargo yank --vers $VERSION diskann-benchmark-core + cargo yank --vers $VERSION diskann-benchmark-simd + cargo yank --vers $VERSION diskann-benchmark-runner + cargo yank --vers $VERSION diskann-label-filter + cargo yank --vers $VERSION diskann-disk + cargo yank --vers $VERSION diskann-providers + cargo yank --vers $VERSION diskann + cargo yank --vers $VERSION diskann-quantization + cargo yank --vers $VERSION diskann-utils + cargo yank --vers $VERSION diskann-linalg + cargo yank --vers $VERSION diskann-platform + cargo yank --vers $VERSION diskann-vector + cargo yank --vers $VERSION diskann-wide + ``` + +2. **Publish a Patch Release** with the fix as soon as possible + +**Note**: Yanking prevents new projects from using the version, but doesn't affect existing users who have already downloaded it. + +## Questions or Issues + +If you encounter issues with the release process: + +1. Check the GitHub Actions workflow logs +2. Review this documentation +3. Open an issue in the repository with: + - The tag you were trying to release + - Relevant error messages from the workflow + - Steps you've already tried