Streaming spool rewards, V2 stake index, epoch boundary optimizations #200

Merged
smcio merged 1 commit into dev from fix/stake-pubkey-index-path
Feb 5, 2026


@7layermagik commented Jan 31, 2026

Streaming spool rewards, stake index update, epoch boundary optimizations

17 files changed, +1835 / -940

This PR redesigns reward calculation and epoch boundary processing to reduce memory use and redundant I/O. The core changes replace in-memory data structures with streaming I/O and consolidate redundant AccountsDB scans into fewer passes.

The manifest seed migration (section 5) is the enabling prerequisite — by seeding all replay state into the state file at snapshot build time, the runtime no longer needs the manifest, which unlocks streaming directly from AccountsDB via the updated stake index.


1. Streaming Spool Reward Architecture

Files: pkg/rewards/spool.go, pkg/rewards/rewards.go, pkg/rewards/points.go

Replaces in-memory reward calculation with disk-backed streaming using temporary binary files ("spools"). Previously, all stake accounts and their computed rewards were held in memory simultaneously (~2GB+ on mainnet). Now, data flows through three sequential spool stages:

  • Points spool (104 bytes/record) — Phase 1 streams stake accounts from AccountsDB, computes points = effective_stake × earned_credits using 128-bit math, and writes results to disk. A single-writer goroutine pattern with a 10k buffered channel handles concurrent stake processing.
  • Temp spool (88 bytes/record) — Phase 2 replays the points spool sequentially, computes each staker's reward share from total points, splits by validator commission, and writes final rewards.
  • Partitioned spools (88 bytes/record) — Phase 3 reads the temp spool, assigns partition indices via SipHash13, and writes per-partition files for distribution.

All spools use 1MB buffered readers/writers. Memory usage is now O(workers) instead of O(total_stakes).


2. Updated Stake Pubkey Index

Files: pkg/accountsdb/index.go, pkg/global/global_ctx.go

Extends the stake pubkey index from a flat list of 32-byte pubkeys to 48-byte records with appendvec location hints:

Before: [pubkey:32]
After:  [header:8]["STKI" + version] [pubkey:32][fileId:8][offset:8] ...

The index is sorted by (FileId, Offset), enabling sequential appendvec reads during epoch boundary scans instead of random Pebble lookups. Hints are advisory — reads still go through Pebble for canonical location, so stale hints degrade gracefully to random I/O without correctness issues.
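
A minimal sketch of the record layout and sort order follows. It assumes the 8-byte header is the 4-byte magic "STKI" followed by a 4-byte version, and that integers are little-endian; neither detail is confirmed by the description above. The names `indexEntry`, `encodeIndex`, and `decodeIndex` are hypothetical and are not taken from pkg/accountsdb/index.go.

```go
package main

import (
	"encoding/binary"
	"errors"
	"sort"
)

const (
	magic      = "STKI"
	headerSize = 8  // assumed: 4-byte magic + 4-byte version
	entrySize  = 48 // pubkey:32 + fileId:8 + offset:8
)

// indexEntry is a hypothetical in-memory form of one 48-byte record.
type indexEntry struct {
	Pubkey [32]byte
	FileID uint64
	Offset uint64
}

// encodeIndex sorts a copy of entries by (FileID, Offset) and serializes
// them after the header, so a reader visits appendvecs sequentially.
func encodeIndex(version uint32, entries []indexEntry) []byte {
	sorted := make([]indexEntry, len(entries))
	copy(sorted, entries)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].FileID != sorted[j].FileID {
			return sorted[i].FileID < sorted[j].FileID
		}
		return sorted[i].Offset < sorted[j].Offset
	})

	buf := make([]byte, headerSize+entrySize*len(sorted))
	copy(buf[0:4], magic)
	binary.LittleEndian.PutUint32(buf[4:8], version)
	for i, e := range sorted {
		rec := buf[headerSize+i*entrySize:]
		copy(rec[0:32], e.Pubkey[:])
		binary.LittleEndian.PutUint64(rec[32:40], e.FileID)
		binary.LittleEndian.PutUint64(rec[40:48], e.Offset)
	}
	return buf
}

// decodeIndex validates the header and returns the version and entries.
func decodeIndex(buf []byte) (uint32, []indexEntry, error) {
	if len(buf) < headerSize || string(buf[0:4]) != magic {
		return 0, nil, errors.New("stake index: bad header")
	}
	body := buf[headerSize:]
	if len(body)%entrySize != 0 {
		return 0, nil, errors.New("stake index: truncated entry")
	}
	version := binary.LittleEndian.Uint32(buf[4:8])
	entries := make([]indexEntry, len(body)/entrySize)
	for i := range entries {
		rec := body[i*entrySize : (i+1)*entrySize]
		copy(entries[i].Pubkey[:], rec[0:32])
		entries[i].FileID = binary.LittleEndian.Uint64(rec[32:40])
		entries[i].Offset = binary.LittleEndian.Uint64(rec[40:48])
	}
	return version, entries, nil
}
```

Because the hints are advisory, a decoder like this only determines the order in which pubkeys are visited; the authoritative location still comes from the Pebble lookup.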

This eliminates the ~500MB global stake cache. Stake data is now streamed directly from the sorted index via StreamStakeAccounts.

Compaction: Periodic compaction at epoch boundaries deduplicates appended entries (triggered when entriesFlushedSinceCompact >= 1000), keeping the index file clean without rewriting on every boundary.
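
The compaction step might look roughly like the sketch below. It assumes that when a pubkey has been appended more than once, the most recently appended entry wins; the description does not state the tie-breaking rule. `indexEntry`, `shouldCompact`, and `compact` are hypothetical names, and only the 1000-entry threshold comes from the text above.

```go
package main

// indexEntry is a hypothetical in-memory form of one index record.
type indexEntry struct {
	Pubkey [32]byte
	FileID uint64
	Offset uint64
}

// compactThreshold mirrors the trigger quoted in the description.
const compactThreshold = 1000

// shouldCompact reports whether enough entries have been appended since
// the last compaction to justify rewriting the index file.
func shouldCompact(entriesFlushedSinceCompact int) bool {
	return entriesFlushedSinceCompact >= compactThreshold
}

// compact drops all but the last occurrence of each pubkey, preserving the
// relative order of the surviving entries. "Last wins" is an assumption.
func compact(entries []indexEntry) []indexEntry {
	last := make(map[[32]byte]int, len(entries))
	for i, e := range entries {
		last[e.Pubkey] = i
	}
	out := make([]indexEntry, 0, len(last))
	for i, e := range entries {
		if last[e.Pubkey] == i {
			out = append(out, e)
		}
	}
	return out
}
```

A real compaction would presumably re-sort the surviving entries by (FileId, Offset) before rewriting the file, since appended entries arrive out of order.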


3. Single-Pass Epoch Boundary Scan

Files: pkg/replay/epoch.go

Consolidates three separate AccountsDB scans into one streaming pass:

  • Before: (1) stake history update, (2) epoch stakes calculation, (3) vote cache refresh — each scanning all ~1M stake accounts independently
  • After: Single scanStakesForEpochBoundary that accumulates all data in one pass, returning a BoundaryStakeScanResult with atomic counters for stake history and mutex-guarded maps for epoch stakes

The downstream functions (updateStakeHistorySysvar, updateEpochStakesAndRefreshVoteCache) now apply pre-computed results instead of scanning.
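
The accumulation pattern can be sketched as follows: several workers consume one stream of stake accounts, bump atomic counters for the stake-history totals, and update a mutex-guarded map for per-vote-account stake. The types and the function `scanOnce` are hypothetical stand-ins for `BoundaryStakeScanResult` and `scanStakesForEpochBoundary`, whose real fields are not shown in this PR description.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// stakeAccount is a hypothetical, already-decoded stake account.
type stakeAccount struct {
	VoteAccount  [32]byte
	Effective    uint64
	Activating   uint64
	Deactivating uint64
}

// scanResult is a simplified stand-in for BoundaryStakeScanResult.
type scanResult struct {
	effective    atomic.Uint64 // stake-history totals
	activating   atomic.Uint64
	deactivating atomic.Uint64

	mu         sync.Mutex
	voteStakes map[[32]byte]uint64 // epoch stakes per vote account
}

// scanOnce makes a single pass over accounts and gathers everything the
// downstream updates need, instead of scanning once per consumer.
func scanOnce(accounts []stakeAccount, workers int) *scanResult {
	if workers < 1 {
		workers = 1
	}
	res := &scanResult{voteStakes: make(map[[32]byte]uint64)}
	ch := make(chan stakeAccount, 1024)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for a := range ch {
				res.effective.Add(a.Effective)
				res.activating.Add(a.Activating)
				res.deactivating.Add(a.Deactivating)

				res.mu.Lock()
				res.voteStakes[a.VoteAccount] += a.Effective
				res.mu.Unlock()
			}
		}()
	}
	for _, a := range accounts {
		ch <- a
	}
	close(ch)
	wg.Wait()
	return res
}
```

Atomic counters suit the scalar totals because additions commute; the map needs a lock because concurrent map writes are not safe in Go.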


4. Reward Phase Collapse (3 → 2)

Files: pkg/rewards/rewards.go, pkg/replay/epoch.go

The points spool from Phase 1 eliminates the need to re-scan AccountsDB in Phase 2:

  • Before: 3 AccountsDB scans — (1) calculate points + assign partitions, (2) calculate rewards, (3) distribute
  • After: 1 AccountsDB scan + sequential file replay — (1) calculate points → write points spool, (2) replay points spool → compute rewards → write partitioned spools

This also enables correct numPartitions calculation from actual reward count after filtering, instead of estimating from total stakes.


5. Manifest Removal from Runtime

Files: pkg/snapshot/manifest_seed.go, pkg/state/state.go

Replay state is now fully self-contained in the state file. A new PopulateManifestSeed function copies all necessary manifest data (capitalization, inflation, blockhashes, epoch stakes, etc.) into the state file at snapshot build time.

  • Runtime replay no longer reads or parses the manifest file
  • Resume uses fresh Last* values instead of stale manifest data
  • State file schema updated with strict validation
  • ManifestEpochStakes cleared after first replayed slot to save memory

🤖 Generated with Claude Code

@7layermagik force-pushed the fix/stake-pubkey-index-path branch from 883aa84 to b448181 on January 31, 2026 19:53
- Remove runtime manifest dependency — seed replay state from state file
- Replace in-memory reward calculation with streaming spool architecture
- Remove global stake cache; stream directly from AccountsDB via stake index
- Stake index with appendvec FileId/Offset hints for sequential I/O
- Merge stake history + epoch stakes boundary scans into single pass
- Collapse rewards from 3 AccountsDB scan phases to 2 via points spool

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@7layermagik force-pushed the fix/stake-pubkey-index-path branch from b448181 to b00dca8 on February 1, 2026 05:42
@smcio merged commit 1780e12 into dev on Feb 5, 2026
1 check passed
@palmerlao deleted the fix/stake-pubkey-index-path branch on February 7, 2026 05:02