@dk67604 dk67604 commented Jan 17, 2026

Add documentation for strands-vllm, a vLLM model provider for Strands Agents SDK with Token-In/Token-Out (TITO) support for agentic RL training.

  • Add vllm.md with installation, usage, and configuration
  • Update mkdocs.yml navigation

Credits:
https://github.com/horizon-rl/strands-sglang/

Description

This PR adds community documentation for strands-vllm, a vLLM model provider designed for agentic reinforcement learning training workflows.

Key features documented:

  • OpenAI-Compatible API: Uses vLLM's /v1/chat/completions endpoint with streaming
  • TITO (Token-In/Token-Out) Support: Captures prompt_token_ids and token_ids directly from vLLM, which eliminates retokenization drift in RL training
  • Tool Call Validation: Hook-based validation to reject unknown tools and invalid JSON inputs (RL-friendly error feedback)
  • Agent Lightning Integration: Automatically adds token IDs to OpenTelemetry spans for RL training data extraction
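
For reviewers unfamiliar with the validation feature, here is a minimal sketch of the two checks it describes (unknown tool name, invalid JSON input). It is not the VLLMToolValidationHooks implementation: `validate_tool_call`, the `registry` set, and the error message wording are hypothetical and used only for illustration.

```python
import json


def validate_tool_call(name, raw_input, registry):
    """Return an error message for an invalid tool call, or None if it passes.

    Hypothetical helper: `registry` is assumed to be a set of known tool names,
    and `raw_input` the raw JSON string emitted by the model.
    """
    # Check 1: the tool name must exist in the registry.
    if name not in registry:
        known = ", ".join(sorted(registry))
        return f"Unknown tool '{name}'. Available tools: {known}"

    # Check 2: the tool input must parse as JSON.
    try:
        json.loads(raw_input)
    except json.JSONDecodeError as exc:
        return f"Invalid JSON input for tool '{name}': {exc.msg}"

    return None


registry = {"calculator", "search"}
ok = validate_tool_call("calculator", '{"expression": "2 + 2"}', registry)
bad_name = validate_tool_call("calcuator", "{}", registry)
bad_json = validate_tool_call("search", '{"query": ', registry)
```

Returning a message instead of raising lets the error be fed back to the model as a tool result, which is one way to read "RL-friendly error feedback".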

Documentation includes:

  • Installation instructions
  • Basic agent usage example
  • Tool call validation with VLLMToolValidationHooks
  • Agent Lightning integration with VLLMTokenRecorder
  • RL training with TokenManager for building trajectories with loss masks
  • Complete configuration reference
  • Troubleshooting section
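
To illustrate what "trajectories with loss masks" means, here is a minimal sketch of the general idea. It does not use the TokenManager API: the `Trajectory` class, its method names, the token IDs, and the 1 = trained / 0 = masked convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """Hypothetical container: token IDs plus a parallel loss mask."""

    token_ids: list = field(default_factory=list)
    loss_mask: list = field(default_factory=list)

    def add_prompt(self, ids):
        # Tokens the model did not generate (user turns, tool results)
        # are masked out of the loss with 0.
        self.token_ids.extend(ids)
        self.loss_mask.extend([0] * len(ids))

    def add_response(self, ids):
        # Tokens sampled by the model contribute to the loss (mask 1).
        self.token_ids.extend(ids)
        self.loss_mask.extend([1] * len(ids))


traj = Trajectory()
traj.add_prompt([11, 12, 13])   # made-up prompt token IDs
traj.add_response([21, 22])     # made-up generated token IDs
traj.add_prompt([31, 32])       # made-up tool result tokens
traj.add_response([41, 42])     # made-up follow-up generation
```

Because the IDs are appended exactly as they were served and sampled, nothing is re-encoded from text, which is the drift TITO is meant to avoid.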

References:

Related Issues

Resolves #432
Related to #418 (strands-sglang - similar TITO provider pattern)

Type of Change

  • New content
  • Content update/revision
  • Structure/organization improvement
  • Typo/formatting fix
  • Bug fix
  • Other (please describe):

Checklist

  • I have read the CONTRIBUTING document
  • My changes follow the project's documentation style
  • I have tested the documentation locally using mkdocs serve
  • Links in the documentation are valid and working

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Add documentation for strands-vllm, a vLLM model provider for Strands Agents SDK
with Token-In/Token-Out (TITO) support for agentic RL training.

Features documented:
- OpenAI-compatible API integration with vLLM
- TITO support for capturing prompt_token_ids and token_ids
- Tool call validation hooks for RL-friendly error feedback
- Agent Lightning integration via OpenTelemetry spans
- TokenManager for building RL trajectories with loss masks

References:
- strands-vllm: https://github.com/agents-community/strands-vllm
- Agent Lightning: https://github.com/microsoft/agent-lightning

Closes strands-agents#432

The hook validates:

- **Tool name**: Must exist in agent's tool registry
Member


Does strands not do both of these automatically?

Specifically, I think we do check for tool names; do we not handle JSONDecodeError correctly, or are you doing more here?

agent = Agent(model=model, tools=[...], hooks=[VLLMToolValidationHooks()])
```

## Why TITO Matters
Member


Does this make sense here? (I'm not sure TBH).

It seems like if someone made it this far in the docs, they'd already know this - which makes me wonder if this should be moved up to the intro (but more concise) or removed from this doc.

Thoughts?

Development

Successfully merging this pull request may close these issues.

[Community Package] strands-vllm: vLLM model provider with Token-In/Token-Out (TITO) support for agentic RL training
