The Adaptive Model Routing System (AMRS) is a framework that routes each request to the best-fit model, balancing exploration and exploitation. It has a Rust core with Python bindings and is still under active development 🚧.
AMRS builds on top of async-openai to provide API services for quick setup. Thanks to open source 💙.

Endpoint Support (basic endpoints only, due to limited resources):
- Chat Completions
- Responses
- More on the way

Flexible Routing Strategies:
- Random (default): Randomly selects a model from the available models.
- WRR: Weighted Round Robin selects models based on predefined weights.
- UCB1: Upper Confidence Bound for balancing exploration and exploitation (coming soon).
- Adaptive: Dynamically selects models based on performance metrics (coming soon).

Provider Support:
- OpenAI-compatible providers (OpenAI, DeepInfra, etc.)
- More on the way
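
The WRR and UCB1 strategies listed above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not AMRS's implementation: the names `WeightedModel`, `SmoothWrr`, and `ucb1_select` are hypothetical, the WRR variant shown is the "smooth" weighted round robin used by nginx (AMRS may use a different variant), and since UCB1 is still marked as coming soon, the reward bookkeeping here is assumed.

```rust
/// A model with a static routing weight (hypothetical type, not part of AMRS).
#[derive(Debug, Clone)]
pub struct WeightedModel {
    pub name: String,
    pub weight: i64,
    current: i64, // running score used by smooth WRR
}

impl WeightedModel {
    pub fn new(name: &str, weight: i64) -> Self {
        Self { name: name.to_string(), weight, current: 0 }
    }
}

/// Smooth weighted round robin: spreads picks evenly instead of in bursts.
pub struct SmoothWrr {
    models: Vec<WeightedModel>,
}

impl SmoothWrr {
    pub fn new(models: Vec<WeightedModel>) -> Self {
        Self { models }
    }

    /// Returns the next model name, or None if there is no positive total weight.
    pub fn pick(&mut self) -> Option<&str> {
        let total: i64 = self.models.iter().map(|m| m.weight).sum();
        if total <= 0 {
            return None;
        }
        // Every model gains its weight, then the highest score wins the turn.
        for m in self.models.iter_mut() {
            m.current += m.weight;
        }
        let mut best = 0;
        for i in 1..self.models.len() {
            if self.models[i].current > self.models[best].current {
                best = i;
            }
        }
        // The winner pays back the total, so others catch up on later turns.
        self.models[best].current -= total;
        Some(self.models[best].name.as_str())
    }
}

/// UCB1 over per-model stats given as (times_selected, total_reward).
/// Rewards are assumed to lie in [0, 1]; untried models are picked first.
pub fn ucb1_select(stats: &[(u64, f64)]) -> Option<usize> {
    if stats.is_empty() {
        return None;
    }
    if let Some(i) = stats.iter().position(|&(n, _)| n == 0) {
        return Some(i);
    }
    let total: u64 = stats.iter().map(|&(n, _)| n).sum();
    let ln_total = (total as f64).ln();
    let mut best = 0;
    let mut best_score = f64::NEG_INFINITY;
    for (i, &(n, reward)) in stats.iter().enumerate() {
        let n = n as f64;
        // Mean reward plus an exploration bonus that shrinks as n grows.
        let score = reward / n + (2.0 * ln_total / n).sqrt();
        if score > best_score {
            best = i;
            best_score = score;
        }
    }
    Some(best)
}
```

With weights 2 and 1, `pick` yields the first model twice and the second once in every three calls, interleaved rather than back to back. `ucb1_select` favors rarely tried models until their measured reward justifies ignoring them, which is the exploration and exploitation trade-off the strategy list refers to.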
Run the following Cargo command in your project directory:
cargo add arms
Or add the following line to your Cargo.toml:
arms = "0.0.1"
Here's a simple example using the Weighted Round Robin (WRR) routing mode. Before running the code, set your provider's API key in the environment by running export <PROVIDER>_API_KEY="your_api_key".
Here we use DeepInfra as the example provider.
// Make sure your provider's API key is set in the environment before running
// this code (DEEPINFRA_API_KEY here, following the <PROVIDER>_API_KEY pattern).
use arms::client;
use arms::types::chat;
use tokio::runtime::Runtime;

fn main() {
    // Route between two models on the same provider with a 2:1 weight ratio.
    let config = client::Config::builder()
        .provider("deepinfra")
        .routing_mode(client::RoutingMode::WRR)
        .model(
            client::ModelConfig::builder()
                .name("deepseek-ai/DeepSeek-V3.2")
                .weight(2)
                .build()
                .unwrap(),
        )
        .model(
            client::ModelConfig::builder()
                .name("nvidia/Nemotron-3-Nano-30B-A3B")
                .weight(1)
                .build()
                .unwrap(),
        )
        .build()
        .unwrap();

    let mut client = client::Client::new(config);

    // Build a chat completion request; the router chooses the model.
    let request = chat::CreateChatCompletionRequestArgs::default()
        .messages([
            chat::ChatCompletionRequestSystemMessage::from("You are a helpful assistant.").into(),
            chat::ChatCompletionRequestUserMessage::from("How long does it take to learn Rust?").into(),
        ])
        .build()
        .unwrap();

    // Drive the async call to completion on a Tokio runtime.
    let result = Runtime::new()
        .unwrap()
        .block_on(client.create_completion(request));

    match result {
        Ok(response) => {
            for choice in response.choices {
                println!("Response: {:?}", choice.message.content);
            }
        }
        Err(e) => {
            eprintln!("Error: {}", e);
        }
    }
}

See the examples folder for more examples.
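
Before running the example above, it can help to fail fast when the API key is missing. The sketch below uses only the standard library and follows the `<PROVIDER>_API_KEY` pattern described earlier. The helper names `api_key_var` and `require_api_key` are hypothetical, and upper-casing the provider name is an assumption about how AMRS resolves the variable.

```rust
use std::env;

/// Builds the environment variable name from the `<PROVIDER>_API_KEY` pattern.
/// Upper-casing the provider is an assumption, not confirmed AMRS behavior.
pub fn api_key_var(provider: &str) -> String {
    format!("{}_API_KEY", provider.to_uppercase())
}

/// Returns the key if it is set and non-empty, or a message explaining the fix.
pub fn require_api_key(provider: &str) -> Result<String, String> {
    let var = api_key_var(provider);
    match env::var(&var) {
        Ok(key) if !key.trim().is_empty() => Ok(key),
        _ => Err(format!(
            "{0} is not set; run: export {0}=\"<your key>\"",
            var
        )),
    }
}
```

Calling `require_api_key("deepinfra")` at the top of `main` and exiting on `Err` gives a clearer message than a failed request later on.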
🚀 All kinds of contributions are welcome! Please follow the Contributing guide.