Array of request parameters
Batch options (maxConcurrent, stopOnError)
Results of each request (success or error)
const results = await client.batchChatCompletion([
{ model: 'openai/gpt-3.5-turbo', messages: [{ role: 'user', content: 'Q1' }] },
{ model: 'openai/gpt-3.5-turbo', messages: [{ role: 'user', content: 'Q2' }] }
], { maxConcurrent: 5, stopOnError: false });
results.forEach((result, idx) => {
if (result.success && result.response) {
console.log(`Request ${idx}: ${result.response.choices[0].message.content}`);
} else {
console.error(`Request ${idx} failed: ${result.error?.message}`);
}
});
Quickly checks if a model supports a specific feature
The model ID to check
The feature to check ('streaming', 'tools', 'vision', 'json')
True if the model supports the feature
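Example (sketch): the method name supportsFeature and the shared client instance are assumptions, since this listing omits the actual signature.
const canStream = await client.supportsFeature('openai/gpt-3.5-turbo', 'streaming');
const hasVision = await client.supportsFeature('openai/gpt-4o', 'vision');
if (!hasVision) {
  console.log('Choose a different model for image inputs');
}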
Validates request parameters against model capabilities
The model ID
The parameters to validate
Validation result with errors and warnings
const validation = await client.validateParams('openai/gpt-3.5-turbo', {
messages: [{ role: 'user', content: 'Hello' }],
stream: true,
max_tokens: 5000
});
if (!validation.valid) {
console.error('Errors:', validation.errors);
}
if (validation.warnings?.length) {
console.warn('Warnings:', validation.warnings);
}
Intelligently truncates a list of messages to respect a token limit. Always preserves the system message if it exists, and truncates from the beginning (FIFO).
The messages to truncate
Maximum token limit (approximate, based on 4 chars ≈ 1 token)
The truncated messages
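Example (sketch): the method names truncateMessages and createChatCompletion are assumptions; the limit is approximate since tokens are estimated at roughly 4 characters each.
// Keep roughly the most recent 4000 tokens of conversation; the system message survives truncation
const truncated = client.truncateMessages(messages, 4000);
const response = await client.createChatCompletion({
  model: 'openai/gpt-3.5-turbo',
  messages: truncated
});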
Creates an instance of the OpenRouter client.
Client configuration.
Optional baseURL?: string
Optional timeout?: number
Optional maxRetries?: number
Optional retryDelay?: number
Optional headers?: Record<string, string>
Optional logger?: Logger
Optional logLevel?: LogLevel
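Construction example (sketch): the exported class name OpenRouterClient and the apiKey field are assumptions not shown in this listing.
const client = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY, // assumed required field
  baseURL: 'https://openrouter.ai/api/v1',
  timeout: 30000,   // ms
  maxRetries: 3,
  retryDelay: 1000, // ms between retries
  headers: { 'HTTP-Referer': 'https://example.com' },
  logLevel: 'info'  // assumed LogLevel value
});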
Sends a chat completion request to the OpenRouter API.
Complete request parameters (model, messages, and all optional parameters).
The API response.
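Example (sketch): the method name createChatCompletion is inferred from the createChatCompletionStream variant mentioned below.
const response = await client.createChatCompletion({
  model: 'openai/gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Hello' }],
  max_tokens: 256
});
console.log(response.choices[0].message.content);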
Sends a chat completion request with streaming to the OpenRouter API.
Complete request parameters (model, messages, and all optional parameters).
Response stream.
Sends a streaming chat completion request and returns an AsyncIterable. This method provides a more ergonomic API than createChatCompletionStream by converting the ReadableStream to AsyncIterable, allowing the use of for await...of.
Complete request parameters (model, messages, and all optional parameters).
AsyncIterable of response chunks.
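Example (sketch): the method name streamChatCompletion and the OpenAI-style delta chunk shape are assumptions.
const stream = await client.streamChatCompletion({
  model: 'openai/gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Tell me a short story' }]
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}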
Counts the approximate number of tokens in a text
Text to analyze
Approximate number of tokens
Counts tokens in an array of messages
Messages to analyze
Total approximate number of tokens
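Example (sketch): the method names countTokens and countMessagesTokens are assumptions; both return estimates (4 chars ≈ 1 token).
const textTokens = client.countTokens('How many tokens is this sentence?');
const totalTokens = client.countMessagesTokens([
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'Hello' }
]);
console.log(`~${textTokens} tokens in text, ~${totalTokens} tokens in messages`);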
Retrieves information for a specific model
Model ID (format: "author/name")
Model information
Retrieves available endpoints for a specific model
Model ID (format: "author/name")
List of model endpoints
Retrieves complete model capabilities (supported parameters, modalities, limits). Combines data from getModel() and getModelEndpoints() for an overview
Model ID (e.g. 'anthropic/claude-3.5-sonnet')
Detailed model capabilities
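Example (sketch): getModel() and getModelEndpoints() are named above; getModelCapabilities is an assumed name for the combined method.
const model = await client.getModel('anthropic/claude-3.5-sonnet');
const endpoints = await client.getModelEndpoints('anthropic/claude-3.5-sonnet');
const capabilities = await client.getModelCapabilities('anthropic/claude-3.5-sonnet');
console.log(capabilities); // supported parameters, modalities, limits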
Checks if an API key is valid
True if the key is valid
Retrieves statistics for a specific generation
The generation ID
The generation statistics
Retrieves detailed information for the API key (usage, limits, rate limits). Endpoint: GET /auth/key
Information about the API key
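Example (sketch): the method names validateApiKey, getGeneration, and getKeyInfo, as well as the generation ID, are assumptions.
const isValid = await client.validateApiKey();
if (isValid) {
  const keyInfo = await client.getKeyInfo(); // GET /auth/key: usage, limits, rate limits
  console.log(keyInfo);
}
const stats = await client.getGeneration('gen-abc123'); // ID taken from a previous response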
Static
parse
Parses tool_calls from a response and returns parsed arguments
Tool calls to parse
Tool calls with parsed arguments
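Example (sketch): the helper class name ToolHelper and the full method name parseToolCalls are placeholders; only 'parse' appears in this listing.
const assistantMessage = response.choices[0].message;
if (assistantMessage.tool_calls) {
  const parsed = ToolHelper.parseToolCalls(assistantMessage.tool_calls);
  // arguments are now plain objects instead of JSON strings
  console.log(parsed[0].function.arguments);
}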
Static
create
Creates a message of type 'tool' to respond to a tool call
Tool call ID
Response content (can be stringified JSON)
Optional
name: string
Function name (optional)
Formatted message for the API
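Example (sketch): the class name ToolHelper and the full method name createToolMessage are placeholders; only 'create' appears in this listing.
const toolMessage = ToolHelper.createToolMessage(
  toolCall.id,                                          // tool call ID from the assistant message
  JSON.stringify({ temperature: 22, unit: 'celsius' }), // stringified JSON content
  'get_weather'                                         // optional function name
);
messages.push(toolMessage);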
Static
create
Helper to create a tool response message from any object result. Automatically serializes objects to JSON string.
The ID of the tool call this is responding to
The result object or string to include as content
Optional
name: string
Optional name of the tool function
A properly formatted ChatMessage for tool responses
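Example (sketch): the full method name createToolResponse is a placeholder; the object is serialized to a JSON string automatically.
const toolMessage = ToolHelper.createToolResponse(
  toolCall.id,
  { temperature: 22, unit: 'celsius' }, // plain object, no manual JSON.stringify needed
  'get_weather'
);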
Static
mark
Marks a message as cacheable for Anthropic Prompt Caching. Adds a "cache breakpoint" that allows all prompt content up to this point to be cached.
Savings: 90% reduction on cached tokens (hit), 10% surcharge (miss). Ideal for: long system prompt, repeated context, multi-turn conversations.
⚠️ Note: OpenRouter does not yet return cache metrics (cache_read_input_tokens, cache_creation_input_tokens) even though caching works on Anthropic's side. You will save money but won't see detailed cache metrics in API responses.
Message to mark as cacheable
New message with cache_control added
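Example (sketch): the helper class and full method name (markCacheable) are placeholders; only 'mark' appears in this listing, and createChatCompletion is also an assumed name.
const cachedSystem = MessageHelper.markCacheable({
  role: 'system',
  content: longSystemPrompt // a large, stable instruction block worth caching
});
// Everything up to this message can now be served from Anthropic's prompt cache
const response = await client.createChatCompletion({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [cachedSystem, { role: 'user', content: 'First question' }]
});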
Static
execute
Helper to execute tool calls and create response messages
Tool calls to execute
Map of available functions
Formatted response messages
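Example (sketch): the class name ToolHelper and the full method name executeToolCalls are placeholders; only 'execute' appears in this listing.
const availableFunctions = {
  get_weather: async (args: { city: string }) => ({ temperature: 22, city: args.city })
};
const toolMessages = await ToolHelper.executeToolCalls(
  assistantMessage.tool_calls,
  availableFunctions
);
messages.push(assistantMessage, ...toolMessages);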
Executes multiple chat completion requests in parallel (batch). Automatically handles rate limiting and individual errors