Developer and Enterprise AI Infrastructure

Select a model, get an endpoint.
Push Completions

Deploy AI models behind an OpenAI-compatible API, with GPU and CPU support, developer simplicity, and enterprise reliability.

API Example

# OpenAI-compatible API - just change your base URL
$ curl https://api.xerotier.ai/your-project/your-endpoint/v1/chat/completions \
    -H "Authorization: Bearer $XEROTIER_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "your-agent",
      "messages": [{"role": "user", "content": "Hello!"}]
    }'

# Response
{
  "choices": [{
    "message": {"role": "assistant", "content": "Hello! How can I help?"}
  }]
}
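
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_chat_request`, `extract_reply`, and `send_chat_request` are hypothetical, the URL placeholders are copied from the curl example, and the response is assumed to have the shape shown above.

```python
import json
import urllib.request

# Placeholder URL copied from the curl example; substitute your own project and endpoint.
BASE_URL = "https://api.xerotier.ai/your-project/your-endpoint/v1"


def build_chat_request(base_url, api_key, model, user_message):
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from a response shaped like the example above."""
    return response_body["choices"][0]["message"]["content"]


def send_chat_request(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        return extract_reply(json.load(response))
```

To send the request, pass the result of `build_chat_request(BASE_URL, api_key, "your-agent", "Hello!")` to `send_chat_request`, with `api_key` read from the `XEROTIER_API_KEY` environment variable as in the curl example.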

Everything you need for production AI

Enterprise-grade infrastructure with developer-friendly simplicity

Bring Your Own Model

Upload safetensors models with hardware-isolated sandboxing.

Instant Deployment

Push a model, get an endpoint. Automatic scaling from zero.

Security

Runtime, data, and network isolation.

Multi-Vendor GPU

NVIDIA and AMD support, with full-featured acceleration at scale.

OpenAI Compatible

Drop-in replacement. Change one line of code.

Usage Analytics

Real-time request tracking and performance metrics.

Simple, transparent pricing

Pay per token for managed inference, or bring your own hardware with no metering.

Free

Perfect for testing and evaluation

$0.00 Forever

Shared Engine
Preemptible
10K tokens/min

Max Model Size: 4 GB
Hardware: CPU
Requests/min: 20
Tokens/min: 10,000
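
A client can stay under the Free tier limits by tracking its own usage before sending each request. The sketch below is one way to do that with a sliding 60-second window. It is an illustration only: the class name `MinuteBudget` is hypothetical, the token counts are the caller's own estimates, and how the service itself measures or enforces these limits is not stated here.

```python
import time
from collections import deque


class MinuteBudget:
    """Sliding 60-second window of requests and tokens spent by this client."""

    def __init__(self, max_requests=20, max_tokens=10_000, clock=time.monotonic):
        # Defaults mirror the Free tier figures listed above.
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.clock = clock
        self.events = deque()  # (timestamp, tokens) pairs, oldest first

    def _drop_expired(self, now):
        # Forget requests that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True if it fits, otherwise return False."""
        now = self.clock()
        self._drop_expired(now)
        if len(self.events) >= self.max_requests:
            return False
        used_tokens = sum(spent for _, spent in self.events)
        if used_tokens + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True
```

Call `try_acquire(estimated_tokens)` before each request and wait or retry later when it returns `False`.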

Self-Hosted

Bring your own hardware and infrastructure

$10/month

Your infrastructure
No token metering
Full data control

Max Model Size: 304 GB
Hardware: Your Hardware
Requests/min: Unlimited
Tokens/min: Unlimited

Ready to deploy your first model?