Add native support for Vision-Language-Action (VLA) models on edge devices #17079

@bhu619

Description

🚀 The feature, motivation and pitch

I am currently working on deploying vision-language-action (VLA) models, such as OpenVLA and Pi-0, to edge devices for real-time robot control, and I plan to use ExecuTorch as the on-device deployment framework. However, it is currently unclear whether ExecuTorch can successfully export VLA models and execute them on device.
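
For concreteness, here is a minimal sketch of the export path I would expect to follow, built on ExecuTorch's documented `torch.export` → `to_edge` → `to_executorch` workflow. The `ToyVLAPolicy` below is a hypothetical stand-in written for illustration, not the real OpenVLA or Pi-0 architecture; whether these same steps succeed on the actual models is exactly the open question.

```python
import torch
from torch.export import export
from executorch.exir import to_edge


class ToyVLAPolicy(torch.nn.Module):
    """Hypothetical stand-in for a VLA model: image + token ids -> actions."""

    def __init__(self, hidden: int = 64, vocab: int = 32000, action_dim: int = 7):
        super().__init__()
        # Patchify the camera frame (proxy for a ViT vision encoder).
        self.vision = torch.nn.Conv2d(3, hidden, kernel_size=16, stride=16)
        # Embed the instruction tokens (proxy for the language backbone).
        self.text = torch.nn.Embedding(vocab, hidden)
        # Regress a continuous action vector from pooled multimodal features.
        self.action_head = torch.nn.Linear(hidden, action_dim)

    def forward(self, image: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
        img_feat = self.vision(image).flatten(2).mean(-1)  # (B, hidden)
        txt_feat = self.text(input_ids).mean(1)            # (B, hidden)
        return self.action_head(img_feat + txt_feat)       # (B, action_dim)


policy = ToyVLAPolicy().eval()
example_inputs = (
    torch.randn(1, 3, 224, 224),           # camera frame
    torch.zeros(1, 32, dtype=torch.long),  # tokenized instruction
)

# torch.export -> Edge dialect -> ExecuTorch program, serialized as .pte.
exported = export(policy, example_inputs)
et_program = to_edge(exported).to_executorch()

with open("vla_policy.pte", "wb") as f:
    f.write(et_program.buffer)
```

This toy module exports cleanly; a real OpenVLA or Pi-0 export would presumably also need a backend partitioner (e.g. XNNPACK) and dynamic shapes for the instruction length, which is where I would expect missing-operator or lowering issues to surface.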

I therefore hope the ExecuTorch team will consider adding native support for VLA models, enabling privacy-preserving, low-latency robotic applications on resource-constrained devices such as mobile robots and drones. This would address a critical gap: while ExecuTorch already supports a number of VLMs and LLMs, it currently lacks support for the action-generation module that is essential for embodied intelligence.

Alternatives

No response

Additional context

No response

RFC (Optional)

No response
