@qj0r9j0vc2
Member

Summary

This PR implements structured task supervision for the CipherBFT node, which previously had no coordinated shutdown for its background workers.

Changes

  • New supervisor.rs module: Implements NodeSupervisor using tokio-util's TaskTracker and CancellationToken for:

    • spawn() - Regular task spawning with cancellation support
    • spawn_cancellable() - Tasks that handle their own cancellation
    • spawn_critical() - Tasks that trigger full shutdown on failure
    • shutdown() - Graceful shutdown with configurable timeout
  • Refactored node.rs:

    • Added run_with_supervisor() method for external supervisor control
    • Peer-connector task now runs under supervisor
    • Event loop respects cancellation token with biased select
  • Updated main.rs:

    • cmd_start() creates supervisor and sets up Ctrl+C signal handler
    • cmd_testnet_start() uses shared supervisor for all validators
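
The supervisor pattern described above can be sketched without any external crates. This is not the code in supervisor.rs: it swaps tokio-util's TaskTracker and CancellationToken for OS threads and a shared AtomicBool, and the names CancelFlag and Supervisor, along with every signature shown, are assumptions made for illustration. It only shows the three ideas from the list: workers share one cancellation signal, a failing critical worker cancels everyone, and shutdown waits with a timeout.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Hypothetical std-only stand-in for a cancellation token.
#[derive(Clone, Default)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Hypothetical supervisor: tracks workers and shares one cancel flag.
#[derive(Default)]
struct Supervisor {
    cancel: CancelFlag,
    workers: Vec<JoinHandle<()>>,
}

impl Supervisor {
    /// Spawn a worker that receives the shared flag and is expected to poll it.
    fn spawn<F>(&mut self, work: F)
    where
        F: FnOnce(CancelFlag) + Send + 'static,
    {
        let flag = self.cancel.clone();
        self.workers.push(thread::spawn(move || work(flag)));
    }

    /// Spawn a worker whose error cancels every other worker.
    fn spawn_critical<F>(&mut self, work: F)
    where
        F: FnOnce(CancelFlag) -> Result<(), String> + Send + 'static,
    {
        let flag = self.cancel.clone();
        self.workers.push(thread::spawn(move || {
            if work(flag.clone()).is_err() {
                flag.cancel();
            }
        }));
    }

    /// Signal cancellation, then wait up to `timeout` for workers to exit.
    /// Returns true if every worker finished in time.
    fn shutdown(self, timeout: Duration) -> bool {
        self.cancel.cancel();
        let deadline = Instant::now() + timeout;
        while Instant::now() < deadline {
            if self.workers.iter().all(|w| w.is_finished()) {
                return true;
            }
            thread::sleep(Duration::from_millis(5));
        }
        self.workers.iter().all(|w| w.is_finished())
    }
}
```

In this sketch a worker that never polls the flag simply makes shutdown() return false once the timeout elapses, and worker panics are not handled. The async version in the PR can instead race each task against the token, which is presumably what distinguishes spawn() from spawn_cancellable().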

Shutdown Order

  1. Stop accepting new network connections (Primary shutdown)
  2. Drain in-flight consensus rounds (via cancellation token)
  3. Flush pending storage writes (supervisor waits for tasks)
  4. Close database connections
  5. Exit
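
The ordering above can be made explicit in code so that it cannot drift. The following is a minimal sketch, not taken from the PR: the Stage enum, the run_shutdown function, and the choice to continue past a failed stage are all assumptions made for illustration.

```rust
/// Hypothetical names for the five shutdown steps listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Stage {
    StopAccepting,
    DrainConsensus,
    FlushStorage,
    CloseDatabase,
    Exit,
}

impl Stage {
    /// The single source of truth for shutdown ordering.
    const ORDER: [Stage; 5] = [
        Stage::StopAccepting,
        Stage::DrainConsensus,
        Stage::FlushStorage,
        Stage::CloseDatabase,
        Stage::Exit,
    ];
}

/// Run every stage in order, recording whether each one succeeded.
/// Assumption: a failed stage is logged and later stages still run,
/// so that the database is closed even if a flush fails.
fn run_shutdown<F>(mut step: F) -> Vec<(Stage, bool)>
where
    F: FnMut(Stage) -> Result<(), String>,
{
    let mut report = Vec::with_capacity(Stage::ORDER.len());
    for stage in Stage::ORDER {
        let ok = match step(stage) {
            Ok(()) => true,
            Err(err) => {
                eprintln!("shutdown stage {:?} failed: {}", stage, err);
                false
            }
        };
        report.push((stage, ok));
    }
    report
}
```

A caller would pass a closure that maps each Stage to the real action, for example cancelling the token for DrainConsensus and waiting on the supervisor for FlushStorage.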

Test Plan

  • All 66 existing tests pass
  • 3 new supervisor unit tests added and passing:
    • test_supervisor_spawn_and_shutdown
    • test_supervisor_cancellation
    • test_critical_task_failure_triggers_shutdown

Closes #56

@qj0r9j0vc2 qj0r9j0vc2 self-assigned this Jan 20, 2026
@qj0r9j0vc2 qj0r9j0vc2 force-pushed the qj0r9j0vc2/fix-issue-56 branch from 977a23a to 381752a Compare January 20, 2026 01:19
@qj0r9j0vc2 qj0r9j0vc2 merged commit b63ac33 into main Jan 20, 2026
9 checks passed
Linked issue: Missing structured task supervision for background workers