feat: Add AI-driven testing and improve report generation #12

devload · 2025-10-24T05:48:58Z

🤖 AI Testing Features

This PR introduces AI-driven testing capabilities and fixes critical bugs in report generation.

New Features

AI Testing Strategy

AI Strategy for Goal-Oriented Testing: New AIStrategy class that uses Claude API for intelligent test exploration
Workspace-based Communication: File-based communication system via ai_workspace/ directory
Auto-responder Integration: Real-time AI decision making with watchdog file monitoring
Test Credentials Support: Pass login credentials and test scenarios via CLI

CLI Enhancements

--strategy ai: Enable AI-driven testing
--ai-goal: Natural language test goal
--ai-credentials: JSON-formatted test credentials
--ai-scenario: Predefined scenario types (login, checkout, settings)
--runs: Multiple test run support with app restart
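
The flags above could be wired up roughly as follows. This is a minimal sketch using the standard-library `argparse`; the real `smartmonkey/cli/main.py` may use a different CLI library, and the defaults shown (`weighted`, one run) as well as the `build_parser` name are assumptions made for illustration.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the flags listed in this PR."""
    parser = argparse.ArgumentParser(prog="smartmonkey-sketch")
    parser.add_argument("--strategy", choices=["random", "weighted", "ai"],
                        default="weighted")  # default is an assumption
    parser.add_argument("--ai-goal", help="natural language test goal")
    # json.loads raises ValueError on bad input, which argparse reports
    # as a normal usage error instead of a traceback.
    parser.add_argument("--ai-credentials", type=json.loads, default=None,
                        help="JSON-formatted test credentials")
    parser.add_argument("--ai-scenario",
                        choices=["login", "checkout", "settings"])
    parser.add_argument("--runs", type=int, default=1,
                        help="number of test runs (app restarts in between)")
    return parser


args = build_parser().parse_args([
    "--strategy", "ai",
    "--ai-goal", "Log in with the test account",
    "--ai-credentials", '{"email": "test@example.com", "password": "test123"}',
    "--runs", "3",
])
print(args.strategy, args.ai_credentials["email"], args.runs)
```

Parsing the credentials at the CLI boundary means malformed JSON is rejected before any device interaction starts.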

🐛 Bug Fixes

Critical: Rect Subscript Error (action.py)

Problem: 'Rect' object is not subscriptable error prevented report.json from being saved

Root Cause: Attempted to access bounds[0] on a Rect object that doesn't support subscript notation

Solution:

Changed to property access: bounds.left, bounds.top, bounds.right, bounds.bottom
Added both formats for compatibility:
- bounds: {left, top, right, bottom} (Rect properties)
- rect: {x1, y1, x2, y2} (legacy format)

Files: smartmonkey/exploration/action.py:103-115
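
A minimal sketch of the fix, assuming a `Rect` type that exposes `left`, `top`, `right` and `bottom` attributes. The `Rect` dataclass and the `element_bounds_to_dict` helper below are stand-ins written for this example, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Stand-in for the project's Rect: attribute access only, no indexing."""
    left: int
    top: int
    right: int
    bottom: int


def element_bounds_to_dict(bounds: Rect) -> dict:
    """Serialize bounds in both the new and the legacy report format."""
    return {
        # New format: mirrors the Rect properties.
        "bounds": {"left": bounds.left, "top": bounds.top,
                   "right": bounds.right, "bottom": bounds.bottom},
        # Legacy format kept for existing report consumers.
        "rect": {"x1": bounds.left, "y1": bounds.top,
                 "x2": bounds.right, "y2": bounds.bottom},
    }


r = Rect(left=0, top=100, right=1080, bottom=300)
try:
    r[0]  # the old access pattern
except TypeError as exc:
    print("old access fails:", exc)
print(element_bounds_to_dict(r))
```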

Enhanced Error Handling (report_generator.py)

File existence and size verification after save
Separate try/except block for index.json updates
Traceback printing for all exceptions
Detailed logging with file sizes

Files: smartmonkey/reporting/report_generator.py:182-208
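
The error-handling pattern described above might look like the sketch below. The function names, the location of `index.json` next to the run directory, and its list-of-paths layout are assumptions for illustration; only the overall shape (verify after save, isolate the index update, print tracebacks) comes from this PR.

```python
import json
import traceback
from pathlib import Path


def update_index(index_path: Path, report_path: Path) -> None:
    """Append the report path to index.json (layout is hypothetical)."""
    entries = json.loads(index_path.read_text()) if index_path.exists() else []
    entries.append(str(report_path))
    index_path.write_text(json.dumps(entries, indent=2))


def save_report(report: dict, output_dir: Path) -> bool:
    """Save report.json, verify it, then update the index separately."""
    report_path = output_dir / "report.json"
    try:
        output_dir.mkdir(parents=True, exist_ok=True)
        report_path.write_text(json.dumps(report, indent=2), encoding="utf-8")
        # Verify that the file exists and is non-empty after the save.
        if not report_path.exists() or report_path.stat().st_size == 0:
            print(f"report missing or empty: {report_path}")
            return False
        print(f"saved {report_path} ({report_path.stat().st_size} bytes)")
    except Exception:
        traceback.print_exc()
        return False

    # Separate block: an index failure must not mask a successful save.
    try:
        update_index(output_dir.parent / "index.json", report_path)
    except Exception:
        traceback.print_exc()
    return True
```

Keeping the index update in its own block means the report itself is still reported as saved even when the index file is unwritable or corrupt.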

✨ Enhancements

App Restart Between Runs

Dynamic launcher activity detection using dumpsys package
Automatic app restart for fresh test state
Configurable with --runs parameter

Files: smartmonkey/device/app_manager.py:88-117
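
One way to detect the launcher activity and restart the app is sketched below. It assumes the `adb shell dumpsys package <package>` output lists the component on a line following an `android.intent.action.MAIN:` header, which can vary between Android versions. The function names are hypothetical and the parser does not check the `LAUNCHER` category, so treat it as a starting point only.

```python
import re
import subprocess
from typing import Optional


def parse_launcher_activity(dumpsys_output: str, package: str) -> Optional[str]:
    """Return the first 'package/activity' component listed after the
    MAIN action header in dumpsys output, or None if nothing matches."""
    component = re.compile(re.escape(package) + r"/[\w.$]+")
    in_main_section = False
    for line in dumpsys_output.splitlines():
        stripped = line.strip()
        if stripped == "android.intent.action.MAIN:":
            in_main_section = True
            continue
        if in_main_section:
            match = component.search(stripped)
            if match:
                return match.group(0)
    return None


def restart_app(device: str, package: str) -> None:
    """Force-stop the app and relaunch its detected launcher activity."""
    adb = ["adb", "-s", device, "shell"]
    dump = subprocess.run(adb + ["dumpsys", "package", package],
                          capture_output=True, text=True, check=True).stdout
    activity = parse_launcher_activity(dump, package)
    if activity is None:
        raise RuntimeError(f"no launcher activity found for {package}")
    subprocess.run(adb + ["am", "force-stop", package], check=True)
    subprocess.run(adb + ["am", "start", "-n", activity], check=True)
```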

Optimized AI Performance

Reduced timeout from 300s to 30s
Auto-responder responds in 1-2 seconds
Fast file-based communication
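
The file-based exchange with a 30-second timeout can be sketched as below. The file names `request.json` and `response.json` and the `wait_for_response` helper are hypothetical. This sketch polls the directory with the standard library rather than using watchdog, which the PR uses on the responder side.

```python
import json
import time
from pathlib import Path
from typing import Optional


def wait_for_response(workspace: Path, request: dict,
                      timeout_s: float = 30.0,
                      poll_s: float = 0.1) -> Optional[dict]:
    """Write a request file, then wait for a response file until timeout."""
    workspace.mkdir(parents=True, exist_ok=True)
    request_path = workspace / "request.json"    # hypothetical file name
    response_path = workspace / "response.json"  # hypothetical file name
    response_path.unlink(missing_ok=True)        # drop any stale answer
    request_path.write_text(json.dumps(request))

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if response_path.exists():
            try:
                return json.loads(response_path.read_text())
            except json.JSONDecodeError:
                pass  # file is still being written; retry on the next poll
        time.sleep(poll_s)
    return None  # caller decides how to fall back after a timeout
```

Returning `None` on timeout lets the strategy fall back to a non-AI action instead of stalling the whole run.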

Enhanced Report Data

Detailed action information in JSON reports
Both bounds and rect formats for element data
Auto-update index.json for Grafana integration

📚 Documentation

Added comprehensive docs/AI_TESTING_GUIDE.md:

AI testing workflow explanation
Usage examples and scenarios
Strategy comparison (ai vs weighted vs random)
Troubleshooting guide

📊 Test Results

All changes tested successfully:

✅ report.json generation works (20.9 KB)
✅ No more Rect subscript errors
✅ Grafana integration verified
✅ AI testing with auto-responder validated
✅ Multi-run tests with app restart confirmed

🔄 Modified Files

Core Changes:

smartmonkey/cli/main.py: AI strategy integration, multi-run support
smartmonkey/device/app_manager.py: Dynamic launcher activity detection
smartmonkey/exploration/action.py: Fix Rect bug, add to_dict()
smartmonkey/reporting/report_generator.py: Enhanced error handling

New Files:

smartmonkey/ai/workspace_provider.py: AI workspace management
smartmonkey/exploration/strategies/ai_strategy.py: AI exploration strategy
auto_responder.py: File watcher and Claude API integration
docs/AI_TESTING_GUIDE.md: Complete AI testing documentation

🎯 Usage Examples

AI Testing:

python3 -m smartmonkey.cli.main run \
  --device emulator-5556 \
  --package your.app.package \
  --strategy ai \
  --ai-goal "Log in with test@example.com on the login screen" \
  --ai-credentials '{"email":"test@example.com","password":"test123"}' \
  --steps 20 \
  --output ./reports/login_test

Non-AI Testing (recommended for most cases):

python3 -m smartmonkey.cli.main run \
  --device emulator-5556 \
  --package your.app.package \
  --strategy weighted \
  --steps 100 \
  --runs 3 \
  --output ./reports/exploration_test

⚠️ Breaking Changes

None - all changes are backward compatible.

📝 Notes

AI strategy requires auto_responder.py to be running in a separate terminal
Weighted strategy is recommended for general exploration (no AI needed)
AI strategy is best suited to specific, goal-oriented scenarios

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

## 🤖 AI Testing Features

- Add AI strategy for goal-oriented testing
- Implement workspace-based communication with Claude API
- Add auto_responder for real-time AI decision making
- Support for test credentials and scenario types

## 🔧 Bug Fixes

- Fix 'Rect object is not subscriptable' error in action.py
  - Change from bounds[0] to bounds.left/top/right/bottom
  - Add both 'bounds' and 'rect' formats for compatibility
- Improve report_generator.py error handling
  - Add file existence and size verification after save
  - Separate try-catch for index.json updates
  - Add traceback printing for all exceptions

## ✨ Enhancements

- Add app restart functionality between test runs
- Improve CLI with AI-specific options (--ai-goal, --ai-credentials)
- Optimize AI timeout from 300s to 30s
- Add detailed action data to JSON reports
- Enhanced index.json auto-update for Grafana integration

## 📚 Documentation

- Add comprehensive AI_TESTING_GUIDE.md

## 🔄 Modified Files

- smartmonkey/cli/main.py: AI strategy integration, multi-run support
- smartmonkey/device/app_manager.py: Dynamic launcher activity detection
- smartmonkey/exploration/action.py: Fix Rect subscript bug, add to_dict()
- smartmonkey/reporting/report_generator.py: Enhanced error handling

## ➕ New Files

- smartmonkey/ai/workspace_provider.py: AI workspace management
- smartmonkey/exploration/strategies/ai_strategy.py: AI exploration strategy
- auto_responder.py: File watcher and Claude API integration
- docs/AI_TESTING_GUIDE.md: Complete AI testing documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## Key Improvements

### 1. Natural Scroll Distance

- Changed scroll distance from 75% (1800px) to 30% (720px) of screen height
- Start position: 65% from top (1560px) - proper bottom margin
- End position: 35% from top (840px) - sufficient top margin
- Mimics human-like single swipe gesture

**Before**: Aggressive scroll from y=2100 to y=300 (1800px)
**After**: Natural scroll from y=1560 to y=840 (720px)

### 2. SwipeAction Compatibility Enhancement

- Added parameter aliases (start_x/start_y/end_x/end_y) alongside existing (x1/y1/x2/y2)
- Prevents TypeError when using different naming conventions
- Maintains backward compatibility

## Files Changed

- `run_web_navigation_safe.py`: Implemented natural scroll parameters
- `smartmonkey/exploration/action.py`: Added SwipeAction aliases

## Test Results

- ✅ Natural scroll behavior verified (720px distance)
- ✅ Proper margins maintained (35% top and bottom)
- ✅ Human-like scrolling achieved
- ✅ Two successful test runs completed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
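
The scroll figures follow from a 2400px screen height (30% of 2400 is 720, and 75% is 1800). The sketch below reproduces that arithmetic and shows one possible way to accept both parameter naming conventions. `natural_scroll` and this `SwipeAction` constructor are illustrative assumptions, not the code in `action.py`.

```python
def natural_scroll(screen_height: int,
                   start_ratio: float = 0.65,
                   end_ratio: float = 0.35) -> tuple:
    """Return (start_y, end_y, distance) for an upward swipe."""
    start_y = round(screen_height * start_ratio)
    end_y = round(screen_height * end_ratio)
    return start_y, end_y, start_y - end_y


class SwipeAction:
    """Accepts x1/y1/x2/y2 or the start_*/end_* aliases (hypothetical)."""

    def __init__(self, x1=None, y1=None, x2=None, y2=None, *,
                 start_x=None, start_y=None, end_x=None, end_y=None):
        # Prefer the original names; fall back to the aliases.
        self.x1 = x1 if x1 is not None else start_x
        self.y1 = y1 if y1 is not None else start_y
        self.x2 = x2 if x2 is not None else end_x
        self.y2 = y2 if y2 is not None else end_y
        if None in (self.x1, self.y1, self.x2, self.y2):
            raise TypeError("all four swipe coordinates are required")


start_y, end_y, distance = natural_scroll(2400)
print(start_y, end_y, distance)  # 1560 840 720
swipe = SwipeAction(start_x=540, start_y=start_y, end_x=540, end_y=end_y)
```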

This commit introduces a complete web testing framework for SmartMonkey, enabling intelligent testing of mobile web applications using Chrome DevTools Protocol.

🌐 New Features:

- Chrome DevTools Protocol integration for direct DOM inspection
- Web navigation testing via 'smartmonkey web' command
- Visual markers on screenshots (red crosshair for clicks, green→blue arrow for swipes)
- Smart scrolling with automatic detection of off-screen elements
- Overlay/modal detection and auto-close before scrolling
- Initial page screenshot capture before any actions
- Independent step counting (swipes count as separate steps)

📦 New Modules:

- smartmonkey/device/chrome/ - ChromeDevice and ChromeManager for CDP communication
- smartmonkey/exploration/html/ - HTML element parsing and state management
- smartmonkey/cli/commands/web.py - Web navigation command implementation
- bin/smartmonkey - Convenience CLI wrapper script

🔧 Key Improvements:

- Conservative overlay detection using specific CSS selectors (prevents false positives)
- Dual step counter system (current_step vs action_count) for proper counting
- Screenshot annotation with PIL/Pillow for visual gesture tracking
- Automatic URL bar height detection and element filtering
- Retry logic for CDP connections with exponential backoff

📚 Documentation:

- Updated README.md with web testing examples and parameters
- Added 8 comprehensive docs in docs/ directory
- Chrome integration guides and quick reference

🧪 Test Files:

- test_web_integration.py, test_web_naver.py for validation
- run_web_navigation_safe.py for safe testing workflow

✅ Testing:

- Successfully tested on emulator-5554 with https://m.naver.com
- 5+ test runs completed without CDP disconnection
- Overlay detection working with conservative selectors

🎉 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
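
Retry with exponential backoff, as mentioned for CDP connections, can be sketched generically as follows. The `connect_with_backoff` helper, the attempt count and the base delay are assumptions; the commit does not state the actual values or which exception types are retried.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_backoff(connect: Callable[[], T],
                         attempts: int = 5,
                         base_delay_s: float = 0.5,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Call connect() until it succeeds, doubling the delay after each
    ConnectionError. Re-raises the last error when attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.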

…een bounds validation

This release brings significant improvements to AI-driven web navigation testing, focusing on coordinate accuracy and intelligent element filtering.

✨ New Features:

- AI metadata in JSON reports (reason, expected_effect, confidence)
- Screen bounds validation to prevent off-screen clicks
- Enhanced visual markers (50px radius, 8px line width)
- Claude Code CLI integration for intelligent element selection

🐛 Bug Fixes:

- Fixed browser chrome height calculation (now correctly ~202px)
- Fixed coordinate transformation (CSS pixels → Physical pixels)
- Fixed logo click hitting address bar issue
- Prevented clicks on carousel and off-screen elements

📊 Performance Improvements:

- Filter 30-60 off-screen elements per step
- Accurate DPR (Device Pixel Ratio) detection
- Proper viewport coordinate transformation

🔧 Technical Changes:

- Refactored html_parser.py coordinate calculation logic
- Added screen bounds checking in _get_coordinates()
- Enhanced chrome_device.py click marker rendering
- Added AI metadata fields to Action class

🎯 Testing Results:

- Successfully tested 10-step AI navigation on Coupang.com
- 484.9 seconds execution time
- 5 unique states discovered
- 3 URLs visited

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
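
A plausible form of the CSS-to-physical transformation with bounds checking is sketched below. The formula (scale by DPR, then shift down by the browser chrome height) is an assumption inferred from the fixes listed above, not the actual logic of `_get_coordinates()`. The `css_to_physical` name and the 1080x2400 screen are illustrative.

```python
from typing import Optional, Tuple


def css_to_physical(css_x: float, css_y: float, dpr: float,
                    chrome_height_px: int,
                    screen_w: int, screen_h: int) -> Optional[Tuple[int, int]]:
    """Map viewport CSS pixels to physical screen pixels.

    Returns None when the point falls outside the visible web content,
    so callers can filter out off-screen elements.
    """
    x = round(css_x * dpr)
    # The page is drawn below the browser chrome (URL bar etc.).
    y = round(css_y * dpr) + chrome_height_px
    if not (0 <= x < screen_w and chrome_height_px <= y < screen_h):
        return None
    return x, y


print(css_to_physical(100, 50, dpr=3.0, chrome_height_px=202,
                      screen_w=1080, screen_h=2400))  # (300, 352)
print(css_to_physical(100, 900, dpr=3.0, chrome_height_px=202,
                      screen_w=1080, screen_h=2400))  # None
```

Rejecting points above `chrome_height_px` is what would keep a click on a page logo from landing on the address bar.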

## 🚀 Major Features

### AI-Driven Testing

- ✅ Vision-based screen analysis using Claude Code CLI
- ✅ Mission-oriented testing for apps and web
- ✅ Smart popup/ad handling with context awareness
- ✅ Hybrid coordinate precision (AI vision + UI hierarchy)
- ✅ Auto-correction for system permission dialogs

### Multi-Mode Testing

- ✅ Native mobile app testing (refactored CLI)
- ✅ Web app testing with Chrome DevTools
- ✅ Traditional weighted/random strategies

## 🔧 Technical Improvements

### AI Integration

- Claude Code CLI integration (`smartmonkey/ai/claude_code_client.py`)
- AI exploration strategy (`smartmonkey/exploration/strategies/ai_strategy.py`)
- AI prompt templates for app and web testing
- Permission dialog auto-correction using UI hierarchy

### Architecture

- Refactored CLI commands (mobile, web, ai)
- Enhanced device management with screenshot capabilities
- Improved coordinate calculation and validation

## 📝 Documentation

- Updated README with AI testing sections
- Added comprehensive CLI parameter documentation
- Updated roadmap and acknowledgments

## 🐛 Bug Fixes

- Fixed permission dialog coordinate accuracy issues
- Improved popup handling logic
- Enhanced screen bounds validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## 🎨 Landing Page Features

### Design

- Modern, responsive design with gradient accents
- Mobile-first approach (fully responsive)
- Smooth scrolling and animations
- Professional color scheme

### Content

- Hero section with AI-driven testing emphasis
- Three main features (AI, Mobile, Web)
- Quick start guide with code examples
- Documentation links
- Stats showcase
- Clean footer with navigation

### Technical

- Pure HTML5 + CSS3 (no dependencies)
- Optimized for GitHub Pages
- SEO-friendly meta tags
- Accessible design

## 🚀 Deployment

- Deploy to: https://devload.github.io/smartmonkey
- Source: main branch / docs folder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## Features

- 4 MCP tools: list_devices, run_ai_test, run_mobile_test, run_web_test
- Background test execution with test_id tracking
- Claude Desktop integration via stdio server
- Comprehensive setup guide in docs/MCP_SETUP.md

## Implementation Details

- Uses official mcp SDK (>=0.9.0)
- Threading-based background test execution
- Immediate test_id return for async operations
- Results saved to ./reports/<test_id>/ directory

## TODO (deferred for future versions)

- get_results tool: Retrieve test results and screenshots
- stop_test tool: Gracefully stop running tests
- get_logs tool: Real-time log streaming
- Progress reporting: WebSocket-based progress updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## Bug Fix

- Fixed `handle_list_devices` to use correct method `get_devices()` instead of `list_devices()`
- Tested with real devices (VIVO V2041, Samsung SM-A356N)

## New Documentation

- Added `docs/MCP_TESTING.md` with comprehensive testing guide
- Includes Python 3.10+ upgrade instructions
- 4 testing methods: basic execution, MCP Inspector, Claude Desktop, JSON-RPC
- Troubleshooting section with common issues

## Testing Results

✓ 4 MCP tools registered and working
✓ list_devices successfully detects 2 connected devices
✓ Server runs without errors in stdio mode

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

## 🔌 MCP (Model Context Protocol) Integration

### New Features

- **Claude Desktop Integration**: Control SmartMonkey with natural language
- **4 MCP Tools**: list_devices, run_ai_test, run_mobile_test, run_web_test
- **Background Execution**: Async test runs with test_id tracking
- **Comprehensive Documentation**:
  - docs/MCP_SETUP.md - Setup and configuration
  - docs/MCP_TESTING.md - Testing and troubleshooting

### Usage Examples

```
User: "List my Android devices"
Claude: [Shows VIVO V2041, Samsung SM-A356N]

User: "Test Coupang app, mission: browse products, 10 steps"
Claude: [Runs AI test, returns test_id]
```

## 📝 Documentation Updates

### README.md

- Updated version badge: 0.2.1
- Added MCP badge
- Enhanced MCP section with:
  - Python 3.10+ requirement notice
  - Detailed setup instructions
  - MCP tools table
  - Test results documentation
  - Links to MCP guides
- Updated roadmap with v0.2.1 release
- Updated Python requirement: 3.10+ (3.12+ recommended)

### Landing Page (docs/index.html)

- Updated version badge: v0.2.1
- Added MCP feature card (highlight)
- Updated hero pills: Added "🔌 MCP Support"
- Changed stats: "4 MCP Tools"
- Added documentation cards:
  - MCP Integration guide
  - MCP Testing guide
- Updated subtitle: "Four powerful testing capabilities"

### pyproject.toml

- Version: 0.1.0 → 0.2.1
- Python requirement: >=3.9 → >=3.10
- Updated description with MCP mention
- Development status: Alpha → Beta
- Python classifiers: 3.10, 3.11, 3.12
- Updated tool targets: py310, py311, py312

## 📖 CHANGELOG.md

Created comprehensive changelog:

- [0.2.1] - 2025-11-06 (MCP Integration)
- [0.2.0] - 2025-11-03 (AI-Driven Testing)
- [0.1.0] - 2025-10-23 (Initial Release)
- Upgrade guides (0.2.0→0.2.1, 0.1.0→0.2.0)
- Future roadmap (v0.3.0, v0.4.0+)

## ⚠️ Breaking Changes

**Python Version**: Now requires Python 3.10+ (previously 3.9+)

- MCP SDK requires Python 3.10 or higher
- Recommended: Python 3.12 for best compatibility

## 🧪 Tested On

- ✅ Python 3.12.12
- ✅ MCP SDK 1.20.0
- ✅ Real devices: VIVO V2041, Samsung SM-A356N

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

devload and others added 7 commits October 24, 2025 14:48

Add .nojekyll to disable Jekyll processing for landing page

5e6b260


Improve GitHub button visibility with border and shadow

efcc54d


Add Open Graph image with monkey emoji for SNS sharing

00ac2c1


Replace SVG with PNG for SNS thumbnail compatibility

d2496ef


devload and others added 2 commits November 6, 2025 19:01

