With Model Router, you can access the most popular models through a single endpoint and a single bill. Experiment with new models and scale your app without worrying about the underlying infrastructure.

Setup

Getting started with Model Router is simple. Generate an API key and drop it into your favorite framework.

Generate API key

API keys for Model Router are generated within your workspace. Generate a key by logging into the console and navigating to the Model Router page.

Connect via framework

Model Router integrates easily into the most popular frameworks.

Model Router is a drop-in replacement for OpenAI’s API.

import openai

# Configure with your Hypermode API key and the Hypermode Model Router base URL
client = openai.OpenAI(
    api_key="<HYPERMODE_API_KEY>",
    base_url="https://models.hypermode.host/v1",
)

# Set up the request
response = client.chat.completions.create(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Dgraph?"},
    ],
    max_tokens=150,
    temperature=0.7,
)

# Print the response
print(response.choices[0].message.content)
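Other OpenAI-compatible frameworks can point at the same base URL. As a minimal sketch (assuming the langchain-openai package is installed; the slug and parameters mirror the example above), the router can be wired into LangChain like this:

from langchain_openai import ChatOpenAI

# Point LangChain's OpenAI-compatible chat client at Model Router
llm = ChatOpenAI(
    model="meta-llama/llama-4-scout-17b-16e-instruct",
    api_key="<HYPERMODE_API_KEY>",
    base_url="https://models.hypermode.host/v1",
    max_tokens=150,
    temperature=0.7,
)

# Invoke with a system and user message, then print the reply
response = llm.invoke(
    [
        ("system", "You are a helpful assistant."),
        ("human", "What is Dgraph?"),
    ]
)
print(response.content)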

Connect directly via API

You can also access the API directly.

import requests
import json

# Your Hypermode API key
api_key = "<HYPERMODE_API_KEY>"

# Use the Hypermode Model Router base URL
base_url = "https://models.hypermode.host"

# API endpoint
endpoint = f"{base_url}/v1/chat/completions"

# Headers
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"}

# Request payload
payload = {
    "model": "meta-llama/llama-4-scout-17b-16e-instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Dgraph?"},
    ],
    "max_tokens": 150,
    "temperature": 0.7,
}

# Make the API request
response = requests.post(endpoint, headers=headers, data=json.dumps(payload))

# Check if the request was successful
if response.status_code == 200:
    # Parse and print the response
    response_data = response.json()
    print(response_data["choices"][0]["message"]["content"])
else:
    # Print error information
    print(f"Error: {response.status_code}")
    print(response.text)

Available models

Hypermode provides a variety of the most popular open source and commercial models.

We’re constantly evaluating model usage to determine which new models to add to our catalog. Interested in using a model not listed here? Let us know at help@hypermode.com.
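Because every model in the catalog is served through the same endpoint, switching providers only requires changing the model slug. A minimal sketch, reusing the client configured in the setup section with an Anthropic slug from the table below:

# Same client as above; only the model slug changes
response = client.chat.completions.create(
    model="claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is Dgraph?"}],
    max_tokens=150,
)
print(response.choices[0].message.content)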

Generation

Large language models provide text generation and reasoning capabilities.

Provider | Model | Slug | Model Unit Multiplier
Meta | Llama 4 Scout | meta-llama/llama-4-scout-17b-16e-instruct | 1
Meta | Llama 3.2 | meta-llama/llama-3.2-3b-instruct | 1
DeepSeek | DeepSeek-R1-Distill-Llama | deepseek-ai/deepseek-r1-distill-llama-8b | 1
Anthropic | Claude 3.7 Sonnet | claude-3-7-sonnet-20250219 | 1
Anthropic | Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 | 1
Anthropic | Claude 3.5 Haiku | claude-3-5-haiku-20241022 | 1
Anthropic | Claude 3.5 Sonnet | claude-3-5-sonnet-20240620 | 1
Anthropic | Claude 3 Haiku | claude-3-haiku-20240307 | 1
Anthropic | Claude 3 Opus | claude-3-opus-20240229 | 1
Anthropic | Claude 3 Sonnet | claude-3-sonnet-20240229 | 1
Anthropic | Claude 2.1 | claude-2.1 | 1
Anthropic | Claude 2.0 | claude-2.0 | 1
OpenAI | o1 | o1 | 1
OpenAI | o1 | o1-2024-12-17 | 1
OpenAI | o1-mini | o1-mini | 1
OpenAI | o1-mini-2024-09-12 | o1-mini-2024-09-12 | 1
OpenAI | o1-preview | o1-preview | 1
OpenAI | o1-preview-2024-09-12 | o1-preview-2024-09-12 | 1
OpenAI | o3-mini | o3-mini | 1
OpenAI | o3-mini-2025-01-31 | o3-mini-2025-01-31 | 1
OpenAI | gpt-4.1 | gpt-4.1 | 1
OpenAI | gpt-4.1-2025-04-14 | gpt-4.1-2025-04-14 | 1
OpenAI | gpt-4.1-mini | gpt-4.1-mini | 1
OpenAI | gpt-4.1-mini-2025-04-14 | gpt-4.1-mini-2025-04-14 | 1
OpenAI | gpt-4.1-nano | gpt-4.1-nano | 1
OpenAI | gpt-4.1-nano-2025-04-14 | gpt-4.1-nano-2025-04-14 | 1
OpenAI | gpt-4o | gpt-4o | 1
OpenAI | gpt-4o-2024-05-13 | gpt-4o-2024-05-13 | 1
OpenAI | gpt-4o-2024-08-06 | gpt-4o-2024-08-06 | 1
OpenAI | gpt-4o-2024-11-20 | gpt-4o-2024-11-20 | 1
OpenAI | gpt-4o-audio-preview | gpt-4o-audio-preview | 1
OpenAI | gpt-4o-audio-preview-2024-10-01 | gpt-4o-audio-preview-2024-10-01 | 1
OpenAI | gpt-4o-audio-preview-2024-12-17 | gpt-4o-audio-preview-2024-12-17 | 1
OpenAI | gpt-4o-search-preview | gpt-4o-search-preview | 1
OpenAI | gpt-4o-search-preview-2025-03-11 | gpt-4o-search-preview-2025-03-11 | 1
OpenAI | gpt-4o-mini | gpt-4o-mini | 1
OpenAI | gpt-4o-mini-2024-07-18 | gpt-4o-mini-2024-07-18 | 1
OpenAI | gpt-4-turbo | gpt-4-turbo | 1
OpenAI | gpt-4-turbo-2024-04-09 | gpt-4-turbo-2024-04-09 | 1
OpenAI | gpt-4-turbo-preview | gpt-4-turbo-preview | 1
OpenAI | gpt-4-0125-preview | gpt-4-0125-preview | 1
OpenAI | gpt-4-1106-preview | gpt-4-1106-preview | 1
OpenAI | gpt-4 | gpt-4 | 1
OpenAI | gpt-4-0613 | gpt-4-0613 | 1
OpenAI | gpt-4.5-preview | gpt-4.5-preview | 1
OpenAI | gpt-4.5-preview-2025-02-27 | gpt-4.5-preview-2025-02-27 | 1
OpenAI | gpt-3.5-turbo-0125 | gpt-3.5-turbo-0125 | 1
OpenAI | gpt-3.5-turbo | gpt-3.5-turbo | 1
OpenAI | gpt-3.5-turbo-1106 | gpt-3.5-turbo-1106 | 1
OpenAI | chatgpt-4o-latest | chatgpt-4o-latest | 1
Google | gemini-pro-vision | gemini-pro-vision | 1
Google | gemini-1.5-pro-latest | gemini-1.5-pro-latest | 1
Google | gemini-1.5-pro-001 | gemini-1.5-pro-001 | 1
Google | gemini-1.5-pro-002 | gemini-1.5-pro-002 | 1
Google | gemini-1.5-pro | gemini-1.5-pro | 1
Google | gemini-1.5-flash-latest | gemini-1.5-flash-latest | 1
Google | gemini-1.5-flash-001 | gemini-1.5-flash-001 | 1
Google | gemini-1.5-flash-001-tuning | gemini-1.5-flash-001-tuning | 1
Google | gemini-1.5-flash | gemini-1.5-flash | 1
Google | gemini-1.5-flash-002 | gemini-1.5-flash-002 | 1
Google | gemini-1.5-flash-8b | gemini-1.5-flash-8b | 1
Google | gemini-1.5-flash-8b-001 | gemini-1.5-flash-8b-001 | 1
Google | gemini-1.5-flash-8b-latest | gemini-1.5-flash-8b-latest | 1
Google | gemini-1.5-flash-8b-exp-0827 | gemini-1.5-flash-8b-exp-0827 | 1
Google | gemini-1.5-flash-8b-exp-0924 | gemini-1.5-flash-8b-exp-0924 | 1
Google | gemini-2.5-pro-exp-03-25 | gemini-2.5-pro-exp-03-25 | 1
Google | gemini-2.5-pro-preview-03-25 | gemini-2.5-pro-preview-03-25 | 1
Google | gemini-2.5-flash-preview-04-17 | gemini-2.5-flash-preview-04-17 | 1
Google | gemini-2.0-flash-exp | gemini-2.0-flash-exp | 1
Google | gemini-2.0-flash | gemini-2.0-flash | 1
Google | gemini-2.0-flash-001 | gemini-2.0-flash-001 | 1
Google | gemini-2.0-flash-exp-image-generation | gemini-2.0-flash-exp-image-generation | 1
Google | gemini-2.0-flash-lite-001 | gemini-2.0-flash-lite-001 | 1
Google | gemini-2.0-flash-lite | gemini-2.0-flash-lite | 1
Google | gemini-2.0-flash-lite-preview-02-05 | gemini-2.0-flash-lite-preview-02-05 | 1
Google | gemini-2.0-flash-lite-preview | gemini-2.0-flash-lite-preview | 1
Google | gemini-2.0-pro-exp | gemini-2.0-pro-exp | 1
Google | gemini-2.0-pro-exp-02-05 | gemini-2.0-pro-exp-02-05 | 1
Google | gemini-2.0-flash-thinking-exp-01-21 | gemini-2.0-flash-thinking-exp-01-21 | 1
Google | gemini-2.0-flash-thinking-exp | gemini-2.0-flash-thinking-exp | 1
Google | gemini-2.0-flash-thinking-exp-1219 | gemini-2.0-flash-thinking-exp-1219 | 1
Google | gemini-2.0-flash-live-001 | gemini-2.0-flash-live-001 | 1

Embedding

Embedding models provide vector representations of text for similarity matching and other applications.

Provider | Model | Slug | Model Unit Multiplier
Hugging Face | MiniLM-L6-v2 | sentence-transformers/all-MiniLM-L6-v2 | 0.1
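A minimal sketch of requesting embeddings, assuming Model Router exposes the OpenAI-compatible /v1/embeddings endpoint at the same base URL (reusing the client from the setup section):

# Assumption: embeddings are served at the OpenAI-compatible /v1/embeddings endpoint
embedding_response = client.embeddings.create(
    model="sentence-transformers/all-MiniLM-L6-v2",
    input=["What is Dgraph?", "Dgraph is a distributed graph database."],
)

# One vector per input string
for item in embedding_response.data:
    print(len(item.embedding))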

Logging

By default, all model invocations are logged for future display in the console. If you’d like to opt out of model logging, please contact us.

Billing

Hypermode Model Router simplifies how you pay for model consumption by billing on model compute time and type. Model Units are included with all paid plans; consumption is calculated as the model's Model Unit Multiplier times the seconds of model runtime, summed across all requests.
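As an illustrative calculation (the multipliers come from the tables above; the runtimes are hypothetical), 4 seconds of runtime on a multiplier-1 generation model consumes 4 Model Units, while 4 seconds on the 0.1-multiplier embedding model consumes 0.4 Model Units:

# Illustrative helper: Model Units = Model Unit Multiplier x seconds of model runtime
def model_units(multiplier: float, runtime_seconds: float) -> float:
    return multiplier * runtime_seconds

print(model_units(1, 4))    # 4.0 Model Units
print(model_units(0.1, 4))  # 0.4 Model Units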