Skip to content

Conversation

@roomote
Copy link
Contributor

@roomote roomote bot commented Jan 21, 2026

This PR attempts to address Issue #10872.

Problem

MCP tools that return image content (like Figma's get_screenshot tool) were returning (No response) because the processToolContent method in UseMcpToolTool.ts only handled text and resource content types, not image.

Solution

  • Modified processToolContent to return both text content and images separately
  • Updated executeToolAndProcessResult to:
    • Extract images from the tool result content
    • Convert image data to data URLs (handling both raw base64 and pre-formatted data URLs)
    • Pass images to task.say() and pushToolResult() following the same pattern as accessMcpResourceTool.ts

Testing

  • Added 4 new test cases for image handling scenarios:
    • Tool response with only image content
    • Tool response with both text and image content
    • Image data that's already formatted as a data URL
    • Multiple images in response

All 16 tests pass.

Feedback and guidance are welcome!


Important

Add image content support to MCP tool responses in UseMcpToolTool.ts and update tests for image handling.

  • Behavior:
    • processToolContent in UseMcpToolTool.ts now returns both text and images separately.
    • executeToolAndProcessResult extracts images, converts them to data URLs, and passes them to task.say() and pushToolResult().
  • Testing:
    • Added 4 test cases in useMcpToolTool.spec.ts for image handling scenarios: only image content, both text and image content, pre-formatted data URLs, and multiple images.
  • Misc:
    • Updated formatResponse.toolResult mock in useMcpToolTool.spec.ts to handle images.

This description was created by Ellipsis for bb488fe. You can customize this summary. It will automatically update as commits are pushed.

Fixes #10872

The processToolContent method in UseMcpToolTool.ts now handles image
content types from MCP protocol responses (e.g., Figma's get_screenshot).

Changes:
- Modified processToolContent to return both text and images
- Updated executeToolAndProcessResult to pass images to task.say() and
  pushToolResult()
- Added tests for image handling scenarios
@roomote
Copy link
Contributor Author

roomote bot commented Jan 21, 2026

Rooviewer Clock   See task on Roo Cloud

✅ PR Review Complete

Status: Looks good - Ready for merge

Tests Verified

All 16 tests pass, including 4 new image-handling test cases.

Key Findings

Area Status
Code correctness ✅ Passes
Pattern consistency ✅ Follows existing accessMcpResourceTool.ts patterns
Test coverage ✅ Comprehensive (4 new test cases)
Edge case handling ✅ Handles raw base64 and pre-formatted data URLs

Changes Summary

  • Modified processToolContent() to return text and images separately
  • Updated executeToolAndProcessResult() to pass images to task.say() and pushToolResult()
  • Added tests for: image-only responses, mixed content, pre-formatted URLs, multiple images

Mention @roomote in a comment to request specific changes to this pull request or fix all unresolved issues.

Copy link
Contributor Author

@roomote roomote bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Review Complete - Looks Good!

This PR correctly addresses Issue #10872 by adding image content support to MCP tool responses. The implementation is well-structured and follows the existing patterns in the codebase.

Summary of Changes

  1. Modified processToolContent() - Now returns both text and images separately instead of just concatenated text
  2. Updated executeToolAndProcessResult() - Properly handles extracted images and passes them to task.say() and pushToolResult()
  3. Added 4 comprehensive test cases for image handling scenarios

Code Quality

Follows existing patterns - The implementation mirrors accessMcpResourceTool.ts which handles images similarly
Good edge case handling - Handles both raw base64 and pre-formatted data URLs
Test coverage - All 16 tests pass, including 4 new image-specific tests
Clean implementation - Changes are well-contained and focused

Minor Observations (Non-blocking)

  1. MimeType validation: In accessMcpResourceTool.ts, image handling validates with item.mimeType?.startsWith("image"). The PR trusts item.type === "image" which is correct per the MCP spec, but servers could potentially send non-image data with type "image". This is a theoretical edge case.

  2. Empty mimeType/data check: The condition at line 271 (if (item.mimeType && item.data)) is good defensive coding - it silently skips malformed image content rather than crashing.

Overall, this is a clean, well-tested fix that solves the reported issue. Nice work! 🎉

@Sniper199999
Copy link

I cloned the PR, and its working as intended!
Image is getting fetched successfully (I tried fetching 1 image only, haven't tried fetching multiple images)
Text fetching is also working properly...
image

@Sniper199999
Copy link

Another quality of life improvement will be to show image thumbnails, instead of only showing the text: "[1 image(s) received ]".
When clicked on a Thumbnail, it should open in a new tab inside vscode to show the image.
Also currently there is no way for the chat agent to save the images automatically that it has received in the chat...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Triage

Development

Successfully merging this pull request may close these issues.

3 participants