@devload devload commented Oct 24, 2025

🤖 AI Testing Features

This PR introduces AI-driven testing capabilities and fixes critical bugs in report generation.

New Features

AI Testing Strategy

  • AI Strategy for Goal-Oriented Testing: New AIStrategy class that uses Claude API for intelligent test exploration
  • Workspace-based Communication: File-based communication system via ai_workspace/ directory
  • Auto-responder Integration: Real-time AI decision making with watchdog file monitoring
  • Test Credentials Support: Pass login credentials and test scenarios via CLI

CLI Enhancements

  • --strategy ai: Enable AI-driven testing
  • --ai-goal: Natural language test goal
  • --ai-credentials: JSON-formatted test credentials
  • --ai-scenario: Predefined scenario types (login, checkout, settings)
  • --runs: Multiple test run support with app restart
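A minimal sketch of how these options could be parsed, assuming an argparse-based CLI. The PR does not say which CLI framework SmartMonkey uses, so the parser name, defaults, and `choices` lists here are illustrative assumptions; only the flag names come from the list above.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in this PR."""
    parser = argparse.ArgumentParser(prog="smartmonkey-sketch")
    parser.add_argument("--strategy", choices=["random", "weighted", "ai"],
                        default="weighted")  # default is an assumption
    parser.add_argument("--ai-goal", default=None,
                        help="Natural language test goal")
    # json.loads raises ValueError on bad input, which argparse turns into a usage error.
    parser.add_argument("--ai-credentials", type=json.loads, default=None,
                        help="JSON-formatted test credentials")
    parser.add_argument("--ai-scenario", choices=["login", "checkout", "settings"],
                        default=None)
    parser.add_argument("--runs", type=int, default=1,
                        help="Number of test runs (app restarted between runs)")
    return parser


args = build_parser().parse_args([
    "--strategy", "ai",
    "--ai-goal", "Log in with the test account",
    "--ai-credentials", '{"email":"test@example.com","password":"test123"}',
])
print(args.strategy, args.ai_credentials["email"], args.runs)
```

Parsing the credentials at the argument boundary means malformed JSON fails before any device interaction starts.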

🐛 Bug Fixes

Critical: Rect Subscript Error (action.py)

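Problem: A "'Rect' object is not subscriptable" error (a Python TypeError) was raised during report generation, which prevented report.json from being saved

Root Cause: The code indexed a Rect object with bounds[0], but Rect exposes its coordinates as properties and does not support subscripting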

Solution:

  • Changed to property access: bounds.left, bounds.top, bounds.right, bounds.bottom
  • Added both formats for compatibility:
    • bounds: {left, top, right, bottom} (Rect properties)
    • rect: {x1, y1, x2, y2} (legacy format)

Files: smartmonkey/exploration/action.py:103-115
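The fix can be sketched as follows. The `Rect` dataclass and the `element_geometry` helper are stand-ins written for illustration, not the project's actual classes; only the property names and the two output formats come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Stand-in for the project's Rect: properties only, no __getitem__."""
    left: int
    top: int
    right: int
    bottom: int


def element_geometry(bounds: Rect) -> dict:
    """Serialize a Rect in both the new and the legacy format."""
    return {
        # New format: named Rect properties.
        "bounds": {"left": bounds.left, "top": bounds.top,
                   "right": bounds.right, "bottom": bounds.bottom},
        # Legacy format kept for existing report consumers.
        "rect": {"x1": bounds.left, "y1": bounds.top,
                 "x2": bounds.right, "y2": bounds.bottom},
    }


geometry = element_geometry(Rect(left=10, top=20, right=110, bottom=220))
print(geometry["rect"])
```

Indexing this stand-in with `Rect(...)[0]` raises the same TypeError the PR describes, because the class defines no `__getitem__`.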

Enhanced Error Handling (report_generator.py)

  • File existence and size verification after save
  • Separate try/except block for index.json updates, so an index failure cannot mask a successfully saved report
  • Traceback printing for all exceptions
  • Detailed logging with file sizes

Files: smartmonkey/reporting/report_generator.py:182-208
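A rough sketch of the save-then-verify pattern described above. The function names, the index.json layout, and the log messages are assumptions for illustration; the real report_generator.py may differ.

```python
import json
import os
import traceback


def update_index(output_dir: str, report_path: str) -> None:
    """Hypothetical index updater: appends the report path to index.json."""
    index_path = os.path.join(output_dir, "index.json")
    entries = []
    if os.path.exists(index_path):
        with open(index_path, encoding="utf-8") as fh:
            entries = json.load(fh)
    entries.append(os.path.basename(report_path))
    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)


def save_report(report: dict, output_dir: str) -> bool:
    report_path = os.path.join(output_dir, "report.json")
    try:
        os.makedirs(output_dir, exist_ok=True)
        with open(report_path, "w", encoding="utf-8") as fh:
            json.dump(report, fh, indent=2)
        # Verify the file really landed on disk and is not empty.
        size = os.path.getsize(report_path)  # raises OSError if missing
        if size == 0:
            raise OSError(f"{report_path} is empty after save")
        print(f"Saved {report_path} ({size} bytes)")
    except Exception:
        traceback.print_exc()
        return False
    # Index update is isolated: its failure must not hide a saved report.
    try:
        update_index(output_dir, report_path)
    except Exception:
        traceback.print_exc()
    return True
```

Returning True even when the index update fails reflects the stated intent that index.json problems should not count as a failed report save.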

✨ Enhancements

App Restart Between Runs

  • Dynamic launcher activity detection using dumpsys package
  • Automatic app restart for fresh test state
  • Configurable with --runs parameter

Files: smartmonkey/device/app_manager.py:88-117
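As an illustration of launcher detection, the sketch below extracts the first activity listed under the MAIN action from text shaped like `adb shell dumpsys package <package>` output. The sample text and the regular expression are assumptions: dumpsys formatting varies across Android versions, and the first MAIN activity is not guaranteed to carry the LAUNCHER category, so treat this as a starting point rather than the project's implementation.

```python
import re
from typing import Optional

# Matches lines such as "1a2b3c4 com.example.app/.MainActivity filter 5d6e7f8".
_COMPONENT = re.compile(r"^\s*[0-9a-f]+\s+(\S+/\S+)\s+filter\b")


def find_launcher_activity(dumpsys_output: str) -> Optional[str]:
    """Return the first component listed after the MAIN action header."""
    lines = dumpsys_output.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "android.intent.action.MAIN:":
            for candidate in lines[i + 1:]:
                match = _COMPONENT.match(candidate)
                if match:
                    return match.group(1)
    return None


# Hand-written sample resembling dumpsys output (not captured from a device).
SAMPLE = """\
Activity Resolver Table:
  Non-Data Actions:
      android.intent.action.MAIN:
        1a2b3c4 com.example.app/.MainActivity filter 5d6e7f8
          Action: "android.intent.action.MAIN"
          Category: "android.intent.category.LAUNCHER"
"""

print(find_launcher_activity(SAMPLE))
```

The returned component string is in the form accepted by `adb shell am start -n`, which is one way to relaunch the app between runs.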

Optimized AI Performance

  • Reduced timeout from 300s to 30s
  • Auto-responder responds in 1-2 seconds
  • Fast file-based communication
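One way the strategy side of the file-based exchange could wait for a reply with the 30-second timeout is sketched below. The response file name, the JSON payload, and polling (rather than the watchdog-based monitoring the auto-responder uses) are assumptions chosen to keep the example dependency-free.

```python
import json
import os
import time
from typing import Optional


def wait_for_response(path: str, timeout: float = 30.0,
                      poll_interval: float = 0.1) -> Optional[dict]:
    """Poll for a JSON response file; return None if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            try:
                with open(path, encoding="utf-8") as fh:
                    return json.load(fh)
            except json.JSONDecodeError:
                pass  # The writer may still be mid-write; try again.
        time.sleep(poll_interval)
    return None
```

A caller would write its request into ai_workspace/, then call `wait_for_response` on the expected reply path and fall back to a non-AI action when it returns None.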

Enhanced Report Data

  • Detailed action information in JSON reports
  • Both bounds and rect formats for element data
  • Auto-update index.json for Grafana integration

📚 Documentation

Added comprehensive docs/AI_TESTING_GUIDE.md:

  • AI testing workflow explanation
  • Usage examples and scenarios
  • Strategy comparison (ai vs weighted vs random)
  • Troubleshooting guide

📊 Test Results

All changes tested successfully:

  • ✅ report.json generation works (20.9 KB)
  • ✅ No more Rect subscript errors
  • ✅ Grafana integration verified
  • ✅ AI testing with auto-responder validated
  • ✅ Multi-run tests with app restart confirmed

🔄 Modified Files

Core Changes:

  • smartmonkey/cli/main.py: AI strategy integration, multi-run support
  • smartmonkey/device/app_manager.py: Dynamic launcher activity detection
  • smartmonkey/exploration/action.py: Fix Rect bug, add to_dict()
  • smartmonkey/reporting/report_generator.py: Enhanced error handling

New Files:

  • smartmonkey/ai/workspace_provider.py: AI workspace management
  • smartmonkey/exploration/strategies/ai_strategy.py: AI exploration strategy
  • auto_responder.py: File watcher and Claude API integration
  • docs/AI_TESTING_GUIDE.md: Complete AI testing documentation

🎯 Usage Examples

AI Testing:

python3 -m smartmonkey.cli.main run \
  --device emulator-5556 \
  --package your.app.package \
  --strategy ai \
  --ai-goal "Log in with test@example.com on the login screen" \
  --ai-credentials '{"email":"test@example.com","password":"test123"}' \
  --steps 20 \
  --output ./reports/login_test

Non-AI Testing (recommended for most cases):

python3 -m smartmonkey.cli.main run \
  --device emulator-5556 \
  --package your.app.package \
  --strategy weighted \
  --steps 100 \
  --runs 3 \
  --output ./reports/exploration_test

⚠️ Breaking Changes

None - all changes are backward compatible.

📝 Notes

  • AI strategy requires auto_responder.py to be running in a separate terminal
  • Weighted strategy is recommended for general exploration (no AI needed)
  • AI strategy is best suited to specific goal-oriented scenarios

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

devload and others added 7 commits October 24, 2025 14:48
## 🤖 AI Testing Features
- Add AI strategy for goal-oriented testing
- Implement workspace-based communication with Claude API
- Add auto_responder for real-time AI decision making
- Support for test credentials and scenario types

## 🔧 Bug Fixes
- Fix 'Rect object is not subscriptable' error in action.py
  - Change from bounds[0] to bounds.left/top/right/bottom
  - Add both 'bounds' and 'rect' formats for compatibility
- Improve report_generator.py error handling
  - Add file existence and size verification after save
  - Separate try/except for index.json updates
  - Add traceback printing for all exceptions

## ✨ Enhancements
- Add app restart functionality between test runs
- Improve CLI with AI-specific options (--ai-goal, --ai-credentials)
- Optimize AI timeout from 300s to 30s
- Add detailed action data to JSON reports
- Enhanced index.json auto-update for Grafana integration

## 📚 Documentation
- Add comprehensive AI_TESTING_GUIDE.md

## 🔄 Modified Files
- smartmonkey/cli/main.py: AI strategy integration, multi-run support
- smartmonkey/device/app_manager.py: Dynamic launcher activity detection
- smartmonkey/exploration/action.py: Fix Rect subscript bug, add to_dict()
- smartmonkey/reporting/report_generator.py: Enhanced error handling

## ➕ New Files
- smartmonkey/ai/workspace_provider.py: AI workspace management
- smartmonkey/exploration/strategies/ai_strategy.py: AI exploration strategy
- auto_responder.py: File watcher and Claude API integration
- docs/AI_TESTING_GUIDE.md: Complete AI testing documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
## Key Improvements

### 1. Natural Scroll Distance
- Changed scroll distance from 75% (1800px) to 30% (720px) of screen height
- Start position: 65% from top (1560px) - proper bottom margin
- End position: 35% from top (840px) - sufficient top margin
- Mimics human-like single swipe gesture

**Before**: Aggressive scroll from y=2100 to y=300 (1800px)
**After**: Natural scroll from y=1560 to y=840 (720px)
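The pixel values above follow from a 2400px screen height. A small sketch of the arithmetic, using a hypothetical helper name and integer percentages to avoid floating-point rounding:

```python
def natural_scroll(screen_height: int, start_pct: int = 65, end_pct: int = 35):
    """Return (start_y, end_y, distance) for an upward swipe."""
    start_y = screen_height * start_pct // 100
    end_y = screen_height * end_pct // 100
    return start_y, end_y, start_y - end_y


print(natural_scroll(2400))  # (1560, 840, 720)
```

Expressing the swipe as percentages keeps the gesture proportional on devices whose screen height is not 2400px.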

### 2. SwipeAction Compatibility Enhancement
- Added parameter aliases (start_x/start_y/end_x/end_y) alongside existing (x1/y1/x2/y2)
- Prevents TypeError when using different naming conventions
- Maintains backward compatibility
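A sketch of how such aliases can be accepted, assuming a constructor-level approach. This class is a simplified stand-in and not the actual `SwipeAction` from action.py.

```python
class SwipeAction:
    """Simplified stand-in accepting both coordinate naming conventions."""

    def __init__(self, x1=None, y1=None, x2=None, y2=None, *,
                 start_x=None, start_y=None, end_x=None, end_y=None):
        self.x1 = self._pick("x1", x1, "start_x", start_x)
        self.y1 = self._pick("y1", y1, "start_y", start_y)
        self.x2 = self._pick("x2", x2, "end_x", end_x)
        self.y2 = self._pick("y2", y2, "end_y", end_y)

    @staticmethod
    def _pick(name, value, alias, alias_value):
        # Reject ambiguous calls that pass both spellings of one coordinate.
        if value is not None and alias_value is not None:
            raise TypeError(f"pass either {name} or {alias}, not both")
        chosen = value if value is not None else alias_value
        if chosen is None:
            raise TypeError(f"missing required argument {name} (or {alias})")
        return chosen


legacy = SwipeAction(x1=540, y1=1560, x2=540, y2=840)
aliased = SwipeAction(start_x=540, start_y=1560, end_x=540, end_y=840)
print(vars(legacy) == vars(aliased))
```

Normalizing to one internal set of attribute names means the rest of the code only ever reads `x1/y1/x2/y2`.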

## Files Changed
- `run_web_navigation_safe.py`: Implemented natural scroll parameters
- `smartmonkey/exploration/action.py`: Added SwipeAction aliases

## Test Results
- ✅ Natural scroll behavior verified (720px distance)
- ✅ Proper margins maintained (35% top and bottom)
- ✅ Human-like scrolling achieved
- ✅ Two successful test runs completed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit introduces a complete web testing framework for SmartMonkey, enabling
intelligent testing of mobile web applications using Chrome DevTools Protocol.

🌐 New Features:
- Chrome DevTools Protocol integration for direct DOM inspection
- Web navigation testing via 'smartmonkey web' command
- Visual markers on screenshots (red crosshair for clicks, green→blue arrow for swipes)
- Smart scrolling with automatic detection of off-screen elements
- Overlay/modal detection and auto-close before scrolling
- Initial page screenshot capture before any actions
- Independent step counting (swipes count as separate steps)

📦 New Modules:
- smartmonkey/device/chrome/ - ChromeDevice and ChromeManager for CDP communication
- smartmonkey/exploration/html/ - HTML element parsing and state management
- smartmonkey/cli/commands/web.py - Web navigation command implementation
- bin/smartmonkey - Convenience CLI wrapper script

🔧 Key Improvements:
- Conservative overlay detection using specific CSS selectors (prevents false positives)
- Dual step counter system (current_step vs action_count) for proper counting
- Screenshot annotation with PIL/Pillow for visual gesture tracking
- Automatic URL bar height detection and element filtering
- Retry logic for CDP connections with exponential backoff
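The retry behaviour mentioned in the last item might look roughly like this. The attempt count, base delay, and the use of `ConnectionError` are assumptions, and `connect` stands in for whatever callable opens the CDP session.

```python
import time


def connect_with_retry(connect, attempts: int = 4, base_delay: float = 0.5,
                       sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...


# Demonstration with a fake connection that fails twice, then succeeds.
calls = {"count": 0}
delays = []


def flaky_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("CDP endpoint not ready")
    return "connected"


result = connect_with_retry(flaky_connect, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the demonstration instant and makes the delay schedule easy to check in tests.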

📚 Documentation:
- Updated README.md with web testing examples and parameters
- Added 8 comprehensive docs in docs/ directory
- Chrome integration guides and quick reference

🧪 Test Files:
- test_web_integration.py, test_web_naver.py for validation
- run_web_navigation_safe.py for safe testing workflow

✅ Testing:
- Successfully tested on emulator-5554 with https://m.naver.com
- 5+ test runs completed without CDP disconnection
- Overlay detection working with conservative selectors

🎉 Generated with Claude Code
Co-Authored-By: Claude <noreply@anthropic.com>
…een bounds validation

This release brings significant improvements to AI-driven web navigation testing,
focusing on coordinate accuracy and intelligent element filtering.

✨ New Features:
- AI metadata in JSON reports (reason, expected_effect, confidence)
- Screen bounds validation to prevent off-screen clicks
- Enhanced visual markers (50px radius, 8px line width)
- Claude Code CLI integration for intelligent element selection

🐛 Bug Fixes:
- Fixed browser chrome height calculation (now correctly ~202px)
- Fixed coordinate transformation (CSS pixels → Physical pixels)
- Fixed logo click hitting address bar issue
- Prevented clicks on carousel and off-screen elements

📊 Performance Improvements:
- Filter 30-60 off-screen elements per step
- Accurate DPR (Device Pixel Ratio) detection
- Proper viewport coordinate transformation

🔧 Technical Changes:
- Refactored html_parser.py coordinate calculation logic
- Added screen bounds checking in _get_coordinates()
- Enhanced chrome_device.py click marker rendering
- Added AI metadata fields to Action class
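To illustrate the coordinate fix, here is a sketch of a CSS-to-physical pixel conversion with a screen bounds check. The formula (scale by DPR, then offset by the browser chrome height) and the function names are assumptions about how `_get_coordinates()` might work, and the device dimensions are example values.

```python
def css_to_physical(css_x: float, css_y: float, dpr: float,
                    chrome_height_px: int) -> tuple:
    """Scale CSS pixels by the device pixel ratio and shift below the browser chrome."""
    return round(css_x * dpr), round(css_y * dpr) + chrome_height_px


def is_on_screen(x: int, y: int, screen_w: int, screen_h: int,
                 chrome_height_px: int) -> bool:
    """Accept only points inside the page area, below the address bar."""
    return 0 <= x < screen_w and chrome_height_px <= y < screen_h


# Example values: 1080x2400 screen, DPR 3.0, ~202px of browser chrome.
visible = css_to_physical(100, 50, dpr=3.0, chrome_height_px=202)
below_fold = css_to_physical(100, 900, dpr=3.0, chrome_height_px=202)
print(visible, is_on_screen(*visible, 1080, 2400, 202))
print(below_fold, is_on_screen(*below_fold, 1080, 2400, 202))
```

Rejecting points above the chrome height is what would prevent a logo near the top of the page from being tapped on the address bar.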

🎯 Testing Results:
- Successfully tested 10-step AI navigation on Coupang.com
- 484.9 seconds execution time
- 5 unique states discovered
- 3 URLs visited

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
## 🚀 Major Features

### AI-Driven Testing
- ✅ Vision-based screen analysis using Claude Code CLI
- ✅ Mission-oriented testing for apps and web
- ✅ Smart popup/ad handling with context awareness
- ✅ Hybrid coordinate precision (AI vision + UI hierarchy)
- ✅ Auto-correction for system permission dialogs

### Multi-Mode Testing
- ✅ Native mobile app testing (refactored CLI)
- ✅ Web app testing with Chrome DevTools
- ✅ Traditional weighted/random strategies

## 🔧 Technical Improvements

### AI Integration
- Claude Code CLI integration (`smartmonkey/ai/claude_code_client.py`)
- AI exploration strategy (`smartmonkey/exploration/strategies/ai_strategy.py`)
- AI prompt templates for app and web testing
- Permission dialog auto-correction using UI hierarchy

### Architecture
- Refactored CLI commands (mobile, web, ai)
- Enhanced device management with screenshot capabilities
- Improved coordinate calculation and validation

## 📝 Documentation
- Updated README with AI testing sections
- Added comprehensive CLI parameter documentation
- Updated roadmap and acknowledgments

## 🐛 Bug Fixes
- Fixed permission dialog coordinate accuracy issues
- Improved popup handling logic
- Enhanced screen bounds validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
## 🎨 Landing Page Features

### Design
- Modern, responsive design with gradient accents
- Mobile-first approach (fully responsive)
- Smooth scrolling and animations
- Professional color scheme

### Content
- Hero section with AI-driven testing emphasis
- Three main features (AI, Mobile, Web)
- Quick start guide with code examples
- Documentation links
- Stats showcase
- Clean footer with navigation

### Technical
- Pure HTML5 + CSS3 (no dependencies)
- Optimized for GitHub Pages
- SEO-friendly meta tags
- Accessible design

## 🚀 Deployment
- Deploy to: https://devload.github.io/smartmonkey
- Source: main branch / docs folder

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
devload and others added 2 commits November 6, 2025 19:01
## Features
- 4 MCP tools: list_devices, run_ai_test, run_mobile_test, run_web_test
- Background test execution with test_id tracking
- Claude Desktop integration via stdio server
- Comprehensive setup guide in docs/MCP_SETUP.md

## Implementation Details
- Uses official mcp SDK (>=0.9.0)
- Threading-based background test execution
- Immediate test_id return for async operations
- Results saved to ./reports/<test_id>/ directory

## TODO (deferred for future versions)
- get_results tool: Retrieve test results and screenshots
- stop_test tool: Gracefully stop running tests
- get_logs tool: Real-time log streaming
- Progress reporting: WebSocket-based progress updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
## Bug Fix
- Fixed `handle_list_devices` to use correct method `get_devices()` instead of `list_devices()`
- Tested with real devices (VIVO V2041, Samsung SM-A356N)

## New Documentation
- Added `docs/MCP_TESTING.md` with comprehensive testing guide
- Includes Python 3.10+ upgrade instructions
- 4 testing methods: basic execution, MCP Inspector, Claude Desktop, JSON-RPC
- Troubleshooting section with common issues

## Testing Results
✓ 4 MCP tools registered and working
✓ list_devices successfully detects 2 connected devices
✓ Server runs without errors in stdio mode

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
## 🔌 MCP (Model Context Protocol) Integration

### New Features
- **Claude Desktop Integration**: Control SmartMonkey with natural language
- **4 MCP Tools**: list_devices, run_ai_test, run_mobile_test, run_web_test
- **Background Execution**: Async test runs with test_id tracking
- **Comprehensive Documentation**:
  - docs/MCP_SETUP.md - Setup and configuration
  - docs/MCP_TESTING.md - Testing and troubleshooting

### Usage Examples
```
User: "List my Android devices"
Claude: [Shows VIVO V2041, Samsung SM-A356N]

User: "Test Coupang app, mission: browse products, 10 steps"
Claude: [Runs AI test, returns test_id]
```

## 📝 Documentation Updates

### README.md
- Updated version badge: 0.2.1
- Added MCP badge
- Enhanced MCP section with:
  - Python 3.10+ requirement notice
  - Detailed setup instructions
  - MCP tools table
  - Test results documentation
  - Links to MCP guides
- Updated roadmap with v0.2.1 release
- Updated Python requirement: 3.10+ (3.12+ recommended)

### Landing Page (docs/index.html)
- Updated version badge: v0.2.1
- Added MCP feature card (highlight)
- Updated hero pills: Added "🔌 MCP Support"
- Changed stats: "4 MCP Tools"
- Added documentation cards:
  - MCP Integration guide
  - MCP Testing guide
- Updated subtitle: "Four powerful testing capabilities"

### pyproject.toml
- Version: 0.1.0 → 0.2.1
- Python requirement: >=3.9 → >=3.10
- Updated description with MCP mention
- Development status: Alpha → Beta
- Python classifiers: 3.10, 3.11, 3.12
- Updated tool targets: py310, py311, py312

## 📖 CHANGELOG.md

Created comprehensive changelog:
- [0.2.1] - 2025-11-06 (MCP Integration)
- [0.2.0] - 2025-11-03 (AI-Driven Testing)
- [0.1.0] - 2025-10-23 (Initial Release)
- Upgrade guides (0.2.0→0.2.1, 0.1.0→0.2.0)
- Future roadmap (v0.3.0, v0.4.0+)

## ⚠️ Breaking Changes

**Python Version**: Now requires Python 3.10+ (previously 3.9+)
- MCP SDK requires Python 3.10 or higher
- Recommended: Python 3.12 for best compatibility

## 🧪 Tested On

- ✅ Python 3.12.12
- ✅ MCP SDK 1.20.0
- ✅ Real devices: VIVO V2041, Samsung SM-A356N

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>