Oxlo.ai

Vision Models

Send images alongside text prompts for multimodal understanding using the OpenAI-compatible Chat Completions API.

OpenAI Compatible: Vision works with the standard openai Python library; just set base_url to https://api.oxlo.ai/v1.

Supported Models

Model                     API ID            Tier     Vision Support
Ministral 14B             ministral-14b     Pro      Images + Text
Llama 4 Maverick          llama-4-maverick  Pro      Images + Text
Kimi K2.5                 kimi-k2.5         Premium  Images + Text
Kimi K2.5 Thinking        kimi-k2-thinking  Premium  Images + Text
Gemma-3-4B (Coming Soon)  gemma-3-4b        Free     Images + Text
Gemma-27B (Coming Soon)   gemma-27b         Premium  Images + Text

How It Works

Vision models accept a messages array in which each message's content is either a plain string (for text-only messages) or an array of content blocks mixing text and images. Image blocks accept either a public URL or a base64-encoded data URI.
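To make the two content shapes concrete, here is a minimal sketch that builds one message of each kind without calling the API. The helper as_blocks is hypothetical and not part of any library; it assumes a plain string can be treated as a single text block. The prompt and URL are placeholders.

```python
# Shape 1: plain string content (text-only message).
text_only = {"role": "user", "content": "Summarize the chart."}

# Shape 2: an array of content blocks mixing text and an image.
with_image = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize the chart."},
        {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
    ],
}


def as_blocks(content):
    """Normalize message content to a list of content blocks.

    Hypothetical helper: a plain string is wrapped as one text block
    (an assumption of this sketch, not documented behavior).
    """
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return list(content)


# Print the block types found in each message.
for message in (text_only, with_image):
    print([block["type"] for block in as_blocks(message["content"])])
```

Either dictionary can be placed in the messages list passed to client.chat.completions.create.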

Quick Example

Send an image with a text prompt:

import openai

client = openai.OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="<YOUR_API_KEY>"
)

response = client.chat.completions.create(
    model="ministral-14b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do you see in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/photo.jpg"
                    }
                }
            ]
        }
    ],
    max_tokens=512
)

print(response.choices[0].message.content)

Base64 Image (Local File)

Send a local image file encoded as base64:

import openai
import base64

client = openai.OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="<YOUR_API_KEY>"
)

# Encode a local image
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="ministral-14b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/jpeg;base64,{image_b64}"
                    }
                }
            ]
        }
    ],
    max_tokens=1024
)

print(response.choices[0].message.content)

Content Block Format

Block Type  Fields                                              Description
text        {"type": "text", "text": "..."}                     A text prompt or question
image_url   {"type": "image_url", "image_url": {"url": "..."}}  A public URL or base64 data URI (data:image/jpeg;base64,...)
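The two block types above can be produced by small helpers. The sketch below is illustrative only: text_block and image_block are hypothetical names, and the MIME type is guessed from the file extension with Python's standard mimetypes module. Only JPEG and PNG data URIs appear in this page, so support for other image types is an assumption you should verify.

```python
import base64
import mimetypes
from pathlib import Path


def text_block(text):
    """Return a text content block."""
    return {"type": "text", "text": text}


def image_block(source):
    """Return an image_url content block for a public URL or a local file path.

    Hypothetical helper: URLs and ready-made data URIs pass through unchanged;
    anything else is treated as a local path and encoded as a base64 data URI.
    """
    if source.startswith(("http://", "https://", "data:")):
        url = source
    else:
        path = Path(source)
        # Guess the MIME type from the extension, e.g. ".png" -> "image/png".
        mime_type, _ = mimetypes.guess_type(path.name)
        if mime_type is None or not mime_type.startswith("image/"):
            raise ValueError(f"Cannot determine an image MIME type for {source!r}")
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        url = f"data:{mime_type};base64,{encoded}"
    return {"type": "image_url", "image_url": {"url": url}}
```

For example, [text_block("Describe this image."), image_block("photo.jpg")] yields a content array in the same format as the Base64 example above.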

Tips

  • Use data:image/jpeg;base64,... or data:image/png;base64,... for local images
  • Use public URLs directly for remote images
  • Multiple images can be sent in a single message by adding more image_url blocks
  • Non-vision models will ignore image blocks and respond to the text content only
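
As a rough illustration of the multi-image tip, the sketch below builds a single user message containing one text block followed by several image_url blocks. The function names multi_image_message and count_images are hypothetical, and the URLs are placeholders. No request is sent here.

```python
def multi_image_message(prompt, image_urls):
    """Build one user message: a text block plus one image_url block per URL."""
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


def count_images(message):
    """Count image_url blocks in a message; plain-string content has none."""
    content = message["content"]
    if isinstance(content, str):
        return 0
    return sum(1 for block in content if block.get("type") == "image_url")


message = multi_image_message(
    "What changed between these two screenshots?",
    ["https://example.com/before.png", "https://example.com/after.png"],
)
print(count_images(message))  # 2
```

The resulting message can be passed as messages=[message] to client.chat.completions.create, as in the Quick Example. Because non-vision models ignore image blocks, a check like count_images(message) > 0 is one way to confirm that a vision-capable model ID is selected before sending.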