An experiment to replicate Vercel's agents-md approach for Trigger.dev documentation.
TL;DR: We couldn't prove it works better than inline docs.
We spent days building and testing a compressed documentation approach for AI coding assistants, inspired by Vercel's @next/codemod agents-md tool.
- Compressed index (~1.2KB) injected into CLAUDE.md
- Local MDX files in `.trigger-docs/`, read on-demand
- Instruction prompt to force retrieval-led reasoning
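A minimal sketch of what that injection step could look like (the file layout, frontmatter field, and script below are assumptions for illustration, not the repo's actual generator):

```ts
// sketch-inject-index.ts — illustrative only; paths and frontmatter fields are assumptions.
import { readdirSync, readFileSync, appendFileSync } from "node:fs";
import { join } from "node:path";

const DOCS_DIR = ".trigger-docs";

// Build a one-line-per-file index: "<file>: <title>", pulling title from MDX frontmatter.
const entries = readdirSync(DOCS_DIR)
  .filter((f) => f.endsWith(".mdx"))
  .map((file) => {
    const source = readFileSync(join(DOCS_DIR, file), "utf8");
    const title = source.match(/^title:\s*(.+)$/m)?.[1] ?? file;
    return `- ${file}: ${title}`;
  });

// Inject the compressed index plus a retrieval instruction into CLAUDE.md.
const block = [
  "<!-- trigger-docs index (auto-generated) -->",
  "Before writing Trigger.dev code, read the relevant file from .trigger-docs/:",
  ...entries,
  "",
].join("\n");

appendFileSync("CLAUDE.md", block);
```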
We evaluated 6 configurations against 8 Trigger.dev v4 API tasks:
| Rank | Configuration | Pass Rate | Turns | Efficiency |
|---|---|---|---|---|
| 1 | inline-docs | 93% | 2.5 | 37.2 |
| 2 | trigger-docs | 79% | 6.4 | 12.3 |
| 3 | mcp-only | 71% | 2.4 | 29.6 |
| 4 | skills-only | 70% | 8.1 | 8.6 |
| 5 | skills-mcp | 56% | 5.0 | 11.2 |
| 6 | baseline | 25% | 3.0 | 8.3 |
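For reference, the Efficiency column is pass rate divided by average turns (this holds for every row above):

```ts
// Efficiency = pass rate (%) / average turns
const efficiency = (passRatePct: number, avgTurns: number) => passRatePct / avgTurns;

console.log(efficiency(93, 2.5).toFixed(1)); // "37.2" — inline-docs
console.log(efficiency(25, 3.0).toFixed(1)); // "8.3"  — baseline
```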
Inline docs won decisively. Simply pasting Trigger.dev's llms-full.txt into CLAUDE.md outperformed:
- Our compressed index approach (-14%)
- The official Trigger.dev MCP server (-22%)
- Claude Code skills (-23%)
- Combined skills+MCP (-37%)
Why the alternatives lost:
- Overhead: The compressed approach adds ~6 turns just to read the MDX files
- Selection problem: The model must choose which file to read, and it often chooses wrong
- MCP latency: Tool calls add time, and the model doesn't query effectively
- Skill invocation: Skills add 5-6 turns of overhead before any docs are retrieved
Repository layout:

```
trigger-docs/
├── docs/
│   └── STRATEGY-GUIDE.md   # Framework for building agents-md tools
├── test-suite/
│   ├── fixtures/           # Test configurations (6 approaches)
│   ├── results/            # Raw evaluation data
│   ├── src/                # Test runner code
│   └── README.md           # Test methodology
└── README.md               # This file
```
We targeted Trigger.dev v4+ APIs likely absent from LLM training data:
| API | Why It's a Good Test |
|---|---|
| `batchTriggerAndWait()` | New batch API with Result object pattern |
| `wait.forToken()` | Token-based wait patterns |
| `metadata.parent` | Hierarchical metadata access |
| `debounce` option | Recently added triggering option |
| `queue()` shared queues | Cross-task queue definitions |
| `idempotencyKeys.create()` | Scoped idempotency patterns |
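For context, this is roughly the shape of code the `batchTriggerAndWait()` tasks required — a sketch of the documented Result-object pattern, with made-up task IDs and payloads; verify the import path and exact signatures against the current v4 SDK:

```ts
import { task } from "@trigger.dev/sdk";

// Hypothetical child task; the id and payload shape are illustrative.
export const resizeImage = task({
  id: "resize-image",
  run: async (payload: { url: string }) => {
    return { resizedUrl: `${payload.url}?w=512` };
  },
});

export const resizeAll = task({
  id: "resize-all",
  run: async (payload: { urls: string[] }) => {
    // Trigger a batch of child runs and wait for all of them to finish.
    const batch = await resizeImage.batchTriggerAndWait(
      payload.urls.map((url) => ({ payload: { url } }))
    );

    // Each run is a Result object: check `ok` before touching `output`.
    return batch.runs.map((run) =>
      run.ok ? run.output.resizedUrl : `failed: ${String(run.error)}`
    );
  },
});
```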
To reproduce the results:

```bash
cd test-suite
pnpm install
pnpm test          # Run all configurations
pnpm test:report   # Generate comparison report
```

If you have <1200 lines of documentation, just paste them inline.
The engineering effort of building compressed indexes, MCP servers, or skills may not pay off, given the quality loss we measured.
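In practice, "paste them inline" can be a one-off script like the sketch below (the llms-full.txt URL is an assumption for illustration — use whatever location the Trigger.dev docs actually publish):

```ts
// append-docs.ts — fetch the published llms-full.txt and append it to CLAUDE.md.
// NOTE: the URL below is an assumption; verify it against the Trigger.dev docs.
import { appendFileSync } from "node:fs";

const res = await fetch("https://trigger.dev/docs/llms-full.txt");
if (!res.ok) throw new Error(`Failed to fetch llms-full.txt: ${res.status}`);

appendFileSync("CLAUDE.md", `\n\n${await res.text()}`);
```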
Open questions:
- Does this work better on Sonnet/Opus vs Haiku?
- Are Next.js docs structured differently from Trigger.dev's?
- Did Vercel test against different baseline conditions?
- Is ~800 lines of context really that expensive?
References:
- Vercel Blog: agents-md Outperforms Skills
- Next.js agents-md Source
- Trigger.dev llms-full.txt
- Trigger.dev MCP Server
License: MIT