Description
Problem Statement
Problem Description
When using SummarizingConversationManager with stream_async(), there is no way to notify users in real time that context compression is happening.
Current Behavior
# In agent.py (lines 716-726)
except ContextWindowOverflowException as e:
    self.conversation_manager.reduce_context(self, e=e)  # Blocks for 30+ seconds
    # ... retry event loop
The reduce_context() call is synchronous and blocking. During this time:
- No events are yielded to the async iterator
- The event loop is blocked
- Users see a frozen UI with no feedback
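The freeze can be reproduced without the SDK. The sketch below is a minimal illustration, not the SDK's code: slow_reduce() is a hypothetical stand-in for reduce_context() (shortened to 0.2 seconds), and count_ticks() is a helper task that counts how often the event loop gets to run while the reduction is in progress.

```python
import asyncio
import time


def slow_reduce(seconds: float = 0.2) -> None:
    """Hypothetical stand-in for a synchronous reduce_context() call."""
    time.sleep(seconds)


async def count_ticks(stop: asyncio.Event) -> int:
    """Count how many times the event loop schedules this task."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def measure(blocking: bool) -> int:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    if blocking:
        slow_reduce()  # runs on the loop thread, so no other task is scheduled
    else:
        await asyncio.to_thread(slow_reduce)  # runs in a worker thread, loop stays free
    stop.set()
    return await ticker


def run_demo() -> tuple[int, int]:
    """Return (ticks while blocked, ticks while offloaded to a thread)."""
    blocked = asyncio.run(measure(blocking=True))
    offloaded = asyncio.run(measure(blocking=False))
    return blocked, offloaded
```

In the blocking case the ticker barely advances, which is the same starvation a streaming consumer experiences while summarization runs.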
Expected Behavior
Users should receive streaming events during context compression:
- conversation_compacting: when compression starts
- conversation_compacted: when compression completes
This allows frontend applications to show "Compacting conversation..." feedback to users during the 30+ second wait.
Proposed Solutions
Option 1: Hook-based Events (Minimal Change)
Add new hook events for context reduction:
# New hook events
class BeforeContextReductionEvent:
    agent: Agent
    exception: ContextWindowOverflowException

class AfterContextReductionEvent:
    agent: Agent
    original_message_count: int
    compressed_message_count: int
    removed_count: int
This allows users to subscribe to these events via HookProvider.
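As a rough sketch of how these events might be fired and consumed, the example below uses a hypothetical SimpleHookRegistry and a hypothetical reduce_with_hooks() helper. Neither is part of the SDK, and the real HookProvider wiring may differ. It also assumes removed_count is the net decrease in message count, and types agent as Any so the block is self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class BeforeContextReductionEvent:
    agent: Any
    exception: Exception


@dataclass
class AfterContextReductionEvent:
    agent: Any
    original_message_count: int
    compressed_message_count: int
    removed_count: int


class SimpleHookRegistry:
    """Hypothetical stand-in for a hook registry; not the SDK's implementation."""

    def __init__(self) -> None:
        self._callbacks: dict[type, list[Callable[[Any], None]]] = {}

    def add_callback(self, event_type: type, callback: Callable[[Any], None]) -> None:
        self._callbacks.setdefault(event_type, []).append(callback)

    def invoke(self, event: Any) -> None:
        # Call every callback registered for this exact event type.
        for callback in self._callbacks.get(type(event), []):
            callback(event)


def reduce_with_hooks(
    agent: Any, manager: Any, registry: SimpleHookRegistry, exc: Exception
) -> None:
    """Fire the proposed events around a (still synchronous) reduce_context call."""
    registry.invoke(BeforeContextReductionEvent(agent=agent, exception=exc))
    before = len(agent.messages)
    manager.reduce_context(agent, e=exc)
    after = len(agent.messages)
    registry.invoke(
        AfterContextReductionEvent(
            agent=agent,
            original_message_count=before,
            compressed_message_count=after,
            removed_count=before - after,  # assumption: net decrease in messages
        )
    )
```

Callbacks here run synchronously on the calling thread, so a callback that forwards to an async consumer still depends on the event loop being free to deliver the message.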
Option 2: Yield Events During Exception Handling
Modify _execute_event_loop_cycle to yield events during context reduction:
except ContextWindowOverflowException as e:
    # Yield event before reduction
    yield {"context_reduction": "starting", "message_count": len(self.messages)}
    self.conversation_manager.reduce_context(self, e=e)
    # Yield event after reduction
    yield {"context_reduction": "completed", "new_message_count": len(self.messages)}
    # Retry
    async for event in self._execute_event_loop_cycle(...):
        yield event
Option 3: Async reduce_context
Make reduce_context async and run summarization in a thread pool:
async def reduce_context_async(self, agent: Agent, e: Exception | None = None):
    # Run the blocking summarization in the default thread pool executor
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, super().reduce_context, agent, e)
This would unblock the event loop, but requires significant API changes.
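Options 2 and 3 can be combined: yield a "starting" event, run the blocking reduction in a worker thread, then yield a "completed" event. The following is a minimal sketch under stated assumptions, not the SDK's event loop. reduce_with_events() and fake_reduce() are hypothetical names, the reduction is assumed to mutate the message list in place, and nothing else is assumed to touch that list while the worker thread runs.

```python
import asyncio
import time
from typing import Any, AsyncIterator, Callable


async def reduce_with_events(
    messages: list[Any],
    reduce_fn: Callable[[], None],
) -> AsyncIterator[dict[str, Any]]:
    """Yield progress events around a blocking reduction run in a worker thread."""
    yield {"context_reduction": "starting", "message_count": len(messages)}
    # asyncio.to_thread keeps the event loop free while reduce_fn blocks.
    await asyncio.to_thread(reduce_fn)
    yield {"context_reduction": "completed", "new_message_count": len(messages)}


async def collect_demo() -> list[dict[str, Any]]:
    messages = list(range(47))

    def fake_reduce() -> None:
        # Hypothetical stand-in for summarization: block briefly, then trim in place.
        time.sleep(0.05)
        del messages[25:]

    return [event async for event in reduce_with_events(messages, fake_reduce)]
```

Because the generator suspends at each yield, the consumer receives the "starting" event before the reduction begins rather than after it finishes.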
Use Case
We're building a financial research assistant that handles long conversations. When context overflow occurs, users wait 30+ seconds without any feedback. Showing "Compacting conversation..." would significantly improve UX.
Environment
- Python 3.13
- strands-agents SDK 1.22.0
- strands-agents-tools 0.2.19
Related Code
class NotifyingSummarizingConversationManager(SummarizingConversationManager):
    """Custom manager that emits events - but events can't be yielded during blocking."""

    def __init__(self, event_queue: asyncio.Queue, ...):
        self._event_queue = event_queue

    def reduce_context(self, agent, e=None, **kwargs):
        self._event_queue.put_nowait({"type": "conversation_compacting"})
        super().reduce_context(agent, e=e, **kwargs)  # Blocks 30s+
        self._event_queue.put_nowait({"type": "conversation_compacted"})
        # Problem: Both events are consumed AFTER blocking ends
Proposed Solution
No response
Use Case
Scenario: Financial Research Assistant with Long Conversations
We are building a financial research assistant that helps users analyze stocks, retrieve financial data, and generate investment insights. The conversations often become very long because:
- Multiple tool calls: Each query may trigger 5-10 tool calls (fetching income statements, analyst estimates, news, etc.)
- Rich context: Users ask follow-up questions that require understanding previous analysis
- Extended sessions: A single research session can last 20+ minutes with dozens of messages
The Problem:
When context overflow occurs and SummarizingConversationManager.reduce_context() is triggered:
- The summarization process takes 30+ seconds (calling the LLM to generate a summary)
- During this time, the UI appears frozen with no feedback
- Users may think the application crashed and refresh the page
What We Need:
A way to emit streaming events during context reduction so we can:
- Show "Compacting conversation..." message when reduction starts
- Display "Compaction complete (47 → 25 messages)" when it finishes
Impact:
This would significantly improve UX for any application using SummarizingConversationManager with stream_async(), especially chatbot interfaces where users expect real-time feedback.
Alternative Solutions
No response
Additional Context
No response