Skip to content

[FEATURE] Graph - Interrupt - From Agent Node #1526

@pgrayy

Description

@pgrayy

Problem Statement

As part of #1478, we added support for raising interrupts from hooks inside a Graph execution. As a follow up, we now want to add support for raising interrupts from an Agent node (we will address multi-agent node interrupt support separately).

Proposed Solution

No response

Use Case

Users may have their Agent instances configured to raise interrupts (docs).

Alternatives Solutions

No response

Additional Context

No response


Implementation Requirements

Based on analysis of existing implementations (Swarm agent interrupts, Graph hook interrupts):

Technical Approach

Reference Implementations:

  • src/strands/multiagent/swarm.py - Working agent interrupt handling (lines 871-887, 678-704)
  • src/strands/multiagent/graph.py - Current hook-based interrupt handling (lines 606-626)

Files to Modify:

  • src/strands/multiagent/graph.py - Main implementation (~100-150 lines)
  • tests_integ/interrupts/multiagent/test_agent.py - Add Graph tests (~100-200 lines)
  • Possibly tests/strands/multiagent/test_graph.py for unit tests

Implementation Details

  1. Replace NotImplementedError (graph.py lines 923-931):
    Replace the current NotImplementedError with proper interrupt handling when agent_response.stop_reason == "interrupt"

  2. Preserve Agent State on Interrupt (following Swarm pattern):
    Store in _interrupt_state.context[node_id]:

    • activated: node.executor._interrupt_state.activated
    • interrupt_state: node.executor._interrupt_state.to_dict()
    • state: node.executor.state.get()
    • messages: node.executor.messages
  3. Handle Parallel Execution:

    • When multiple nodes execute in parallel and one raises interrupt, other nodes should complete (consistent with hook-based interrupt behavior)
    • Interrupted nodes should be tracked in state.interrupted_nodes
    • Completed nodes in same batch tracked in _interrupt_state.context["completed_nodes"]
  4. Resume from Interrupt:

    • Restore agent executor state from interrupt context
    • Resume with interrupt responses as input
    • Follow existing _execute_graph resume logic for interrupted nodes
  5. Update GraphNode.reset_executor_state() if needed:

    • Similar to SwarmNode.reset_executor_state(), restore from interrupt context when resuming

Acceptance Criteria

  • Agent nodes in Graph can raise interrupts via tool_context.interrupt() or hook-based interrupts on tools
  • When agent node raises interrupt, Graph returns with status=Status.INTERRUPTED and interrupts populated
  • User can respond to interrupts and resume Graph execution
  • Parallel execution: non-interrupted nodes complete while interrupted nodes wait for response
  • Session management: interrupt state is properly serialized/deserialized
  • Unit tests added in tests/strands/multiagent/test_graph.py
  • Integration tests added in tests_integ/interrupts/multiagent/test_agent.py
  • All existing tests continue to pass

Example Usage (Expected Behavior)

from strands import Agent, tool
from strands.multiagent import GraphBuilder
from strands.multiagent.base import Status
from strands.types.tools import ToolContext

@tool(name="weather_tool", context=True)
def weather_tool(tool_context: ToolContext) -> str:
    response = tool_context.interrupt("weather_approval", reason="need location")
    return f"Weather in {response}: sunny"

weather_agent = Agent(name="weather", tools=[weather_tool])

builder = GraphBuilder()
builder.add_node(weather_agent, "weather_agent")
graph = builder.build()

# First invocation - will be interrupted
result = graph("What is the weather?")
assert result.status == Status.INTERRUPTED
assert len(result.interrupts) == 1
assert result.interrupts[0].name == "weather_approval"

# Resume with response
responses = [
    {
        "interruptResponse": {
            "interruptId": result.interrupts[0].id,
            "response": "Seattle",
        },
    },
]
result = graph(responses)
assert result.status == Status.COMPLETED

Testing Strategy

  1. Basic agent interrupt: Single agent node raises interrupt, user responds, execution completes
  2. Parallel execution: Multiple nodes in parallel, one raises interrupt, others complete
  3. Multiple interrupts: Multiple agent nodes raise interrupts simultaneously
  4. Reject scenario: User rejects interrupt (e.g., cancels tool)
  5. Session persistence: Interrupt state survives session save/restore

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions