Parameters Reference
Every model on Oxlo supports customizable inference parameters. This page documents every available parameter across all model categories, with valid ranges, defaults, and usage examples.
Tip: All parameters are optional. If omitted, sensible defaults are used. You can also try parameters interactively in the Playground.
Chat Completions
Parameters for POST /v1/chat/completions. Compatible with OpenAI, Anthropic, and OpenRouter SDKs.
Core Parameters
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| model | string | — | — | Model ID to use (e.g. deepseek-v3.2, mistral-7b). |
| messages | array | — | — | Array of message objects with role and content fields. |
| temperature | float | 0.7 | 0 – 2 | Controls randomness. Lower values make output more deterministic; higher values more creative. |
| max_tokens | integer | 256 | 1 – 131072 | Maximum number of tokens to generate in the response. |
| top_p | float | 1.0 | 0 – 1 | Nucleus sampling: restricts sampling to the smallest set of tokens whose cumulative probability reaches this threshold. Adjust either temperature or top_p, not both. |
| stop | string \| string[] | null | max 4 | Up to 4 sequences at which the model stops generating further tokens. |
| stream | boolean | false | — | Whether to stream back partial progress as server-sent events. |
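When stream is true, the response arrives as server-sent events whose data: lines carry JSON chunks, ending with a [DONE] sentinel. As an illustration (the exact chunk shape shown here is an assumption based on the OpenAI streaming schema), a minimal parser that accumulates the content deltas:

```python
import json

def accumulate_sse(lines):
    """Collect content deltas from OpenAI-style SSE 'data:' lines."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            text.append(delta)
    return "".join(text)

# Two chunks as they might appear on the wire:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(accumulate_sse(sample))  # -> Hello
```

If you use the OpenAI SDK, you do not need to parse SSE yourself: iterating over the response with stream=True yields the same chunks as Python objects.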
Penalty Parameters
These parameters control repetition and diversity in the generated text.
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| frequency_penalty | float | 0 | -2 – 2 | Penalize tokens proportionally to how often they appear. Positive values reduce repetition. |
| presence_penalty | float | 0 | -2 – 2 | Penalize tokens that have appeared at all. Encourages the model to talk about new topics. |
| repeat_penalty | float | 1.1 | 0 – 3 | Multiplicative penalty applied to repeated token sequences. 1.0 means no penalty. |
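Conceptually, both penalties subtract from a token's logit before sampling: frequency_penalty scales with how many times the token has already appeared, while presence_penalty is a flat deduction once it has appeared at all. A sketch of that arithmetic (following the OpenAI penalty formulation; the exact server-side internals are an assumption):

```python
def penalized_logit(logit, count, frequency_penalty=0.0, presence_penalty=0.0):
    """Apply OpenAI-style repetition penalties to a single token logit.

    count is how many times the token already appears in the generated text.
    """
    return (
        logit
        - frequency_penalty * count                    # grows with every repetition
        - presence_penalty * (1 if count > 0 else 0)   # flat deduction, once seen
    )

# A token seen 3 times is pushed down harder than one seen once:
print(penalized_logit(2.0, 3, frequency_penalty=0.5, presence_penalty=0.25))  # -> 0.25
print(penalized_logit(2.0, 1, frequency_penalty=0.5, presence_penalty=0.25))  # -> 1.25
```

Negative values invert the effect and encourage repetition, which is occasionally useful for highly structured output.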
Advanced Sampling
Fine-grained control over the token prediction process. These are particularly useful for research and advanced use cases.
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| top_k | integer | 0 | 0 – 500 | Only sample from the top K most likely tokens. 0 disables top-k filtering. |
| min_p | float | 0.05 | 0 – 1 | Minimum probability threshold relative to the most likely token. Tokens below this are excluded. |
| typical_p | float | 1.0 | 0 – 1 | Locally typical sampling. 1.0 disables it. Lower values make output more predictable. |
| tfs_z | float | 1.0 | 0 – 1 | Tail-free sampling parameter. 1.0 disables it. Lower values cut off the tail of the distribution. |
| mirostat_mode | integer | 0 | 0, 1, 2 | Mirostat sampling mode. 0 = disabled, 1 = Mirostat v1, 2 = Mirostat v2. |
| mirostat_tau | float | 5.0 | 0 – 10 | Target entropy (perplexity) for Mirostat. Lower = more focused output. |
| mirostat_eta | float | 0.1 | 0 – 1 | Mirostat learning rate. Controls how quickly the algorithm adapts. |
| seed | integer | random | 0 – 2^31 | Random seed for reproducible outputs. Same seed + same params = same output. |
| n | integer | 1 | 1 – 5 | Number of chat completions to generate for each input. |
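To build intuition for how top_k and min_p narrow the candidate set, here is the filtering step over a toy probability distribution (a simplified sketch; real implementations operate on logits and renormalize afterwards):

```python
def filter_candidates(probs, top_k=0, min_p=0.0):
    """Return the tokens still eligible for sampling after top-k and min-p.

    probs maps token -> probability. top_k=0 disables top-k filtering;
    min_p is a threshold relative to the most likely token's probability.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    cutoff = min_p * ranked[0][1]  # relative to the best candidate
    return [tok for tok, p in ranked if p >= cutoff]

dist = {"the": 0.5, "a": 0.3, "cat": 0.15, "xylophone": 0.05}
print(filter_candidates(dist, top_k=3))    # drops "xylophone"
print(filter_candidates(dist, min_p=0.2))  # cutoff 0.2 * 0.5 = 0.1, drops "xylophone"
```

Because min_p is relative, it prunes aggressively when the model is confident but leaves more options open when the distribution is flat.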
Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    temperature=0.9,
    max_tokens=1024,
    top_p=0.95,
    frequency_penalty=0.5,
    presence_penalty=0.3,
    stop=["\n\n"],
    seed=42
)
print(response.choices[0].message.content)
```

Compatibility: Our API is fully compatible with the OpenAI SDK. If you are using OpenAI, Anthropic, or OpenRouter, you can switch to Oxlo by changing only the base_url and api_key.
Image Generation
Parameters for POST /v1/images/generations.
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| prompt | string | — | — | Text description of the image to generate. |
| model | string | "" | — | Model to use (e.g. oxlo-image-pro, sdxl, stable-diffusion-1.5). |
| num_inference_steps | integer | 4 | 1 – 150 | Number of denoising steps. More steps generally produce better quality but take longer. |
| guidance_scale | float | 7.5 | 1 – 30 | Classifier-free guidance scale. Higher values make the image follow the prompt more closely. |
| negative_prompt | string | "" | free text | Text describing what to avoid in the generated image. |
| width | integer | 1024 | 256 – 2048 | Image width in pixels. Must be a multiple of 64. |
| height | integer | 1024 | 256 – 2048 | Image height in pixels. Must be a multiple of 64. |
| seed | integer | random | 0 – 2^31 | Random seed for reproducible generations. |
| n | integer | 1 | 1 – 4 | Number of images to generate. |
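Because width and height must be multiples of 64 and within 256 – 2048, a small client-side check before submitting can save a failed round trip. A convenience sketch (not part of the API itself):

```python
def validate_dims(width, height, lo=256, hi=2048, step=64):
    """Raise ValueError if dimensions violate the documented constraints."""
    for name, value in (("width", width), ("height", height)):
        if not (lo <= value <= hi):
            raise ValueError(f"{name} must be between {lo} and {hi}, got {value}")
        if value % step != 0:
            raise ValueError(f"{name} must be a multiple of {step}, got {value}")
    return f"{width}x{height}"  # the size string the SDK expects

print(validate_dims(1024, 1024))  # -> 1024x1024
```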
Example

```python
response = client.images.generate(
    model="oxlo-image-pro",
    prompt="A futuristic city skyline at sunset, cyberpunk style",
    negative_prompt="blurry, low quality",
    n=1,
    size="1024x1024"
)
image_b64 = response.data[0].b64_json
```

Audio: Speech-to-Text
Parameters for POST /v1/audio/transcriptions (Whisper models).
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| file | file | — | — | Audio file to transcribe. Supports mp3, mp4, wav, webm, m4a, mpeg, mpga. |
| model | string | whisper-medium | — | Whisper model to use (whisper-medium, whisper-large, whisper-large-v3). |
| language | string | auto | ISO 639-1 | Language of the audio. Auto-detected if not specified. |
| response_format | string | json | json, text, verbose_json, srt, vtt | Output format for the transcription. |
| temperature | float | 0 | 0 – 1 | Sampling temperature for the decoding process. 0 is most deterministic. |
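With response_format set to verbose_json, the transcription includes per-segment timestamps, which makes it straightforward to build subtitle files yourself. A sketch that renders segments as SRT cues (assuming each segment exposes start, end, and text fields, as in the Whisper verbose_json schema):

```python
def to_srt_time(seconds):
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments):
    """Join timestamped segments into an SRT document."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        cues.append(
            f"{i}\n{to_srt_time(seg['start'])} --> {to_srt_time(seg['end'])}\n"
            f"{seg['text'].strip()}"
        )
    return "\n\n".join(cues)

sample = [{"start": 0.0, "end": 2.5, "text": " Hello everyone."}]
print(segments_to_srt(sample))
```

If you only need subtitles, response_format="srt" or "vtt" returns them directly; the sketch above is for cases where you want to post-process the segments first.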
Example

```python
transcript = client.audio.transcriptions.create(
    model="whisper-large-v3",
    file=open("meeting.mp3", "rb"),
    language="en",
    response_format="verbose_json"
)
print(transcript.text)
```

Audio: Text-to-Speech
Parameters for POST /v1/audio/speech (Kokoro TTS).
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| input | string | — | — | Text to convert to speech. |
| model | string | kokoro-82m | — | TTS model to use. |
| voice | string | af_heart | model voices | Voice to use for synthesis. Available voices can be fetched via /v1/audio/voices. |
| speed | float | 1.0 | 0.25 – 4.0 | Playback speed multiplier. |
Example

```python
response = client.audio.speech.create(
    model="kokoro-82m",
    input="Hello, welcome to Oxlo AI.",
    voice="af_heart",
    speed=1.0
)
with open("output.wav", "wb") as f:
    f.write(response.content)
```

Object Detection
Parameters for POST /v1/detect (YOLO models).
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| image | string | — | — | Base64-encoded image or image URL. |
| model | string | yolo11x.pt | — | YOLO model variant to use. |
| confidence | float | 0.25 | 0.01 – 1.0 | Minimum confidence threshold. Only detections above this score are returned. |
| iou_threshold | float | 0.45 | 0.1 – 1.0 | Intersection-over-Union threshold for non-maximum suppression. |
| classes | string | all | comma-separated | Comma-separated list of class names or IDs to filter. |
| max_detections | integer | 100 | 1 – 300 | Maximum number of detections to return. |
| image_size | integer | 640 | 320 – 1280 | Input image size for inference. Larger = slower but more accurate. |
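The iou_threshold drives non-maximum suppression: when two boxes for the same class overlap with an IoU above the threshold, only the higher-confidence one survives. IoU itself is just overlap area divided by union area; for intuition, with boxes as (x1, y1, x2, y2):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero width/height if the boxes don't overlap)
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

# Two boxes sharing half their area:
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))  # -> 0.3333...
```

Lowering iou_threshold merges boxes more aggressively (fewer duplicates, but adjacent objects may be collapsed); raising it keeps more overlapping detections.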
Example

```python
import requests, base64

with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.oxlo.ai/v1/detect",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "image": image_b64,
        "model": "yolo11x.pt",
        "confidence": 0.3,
        "iou_threshold": 0.5
    }
)
for det in response.json()["detections"]:
    print(f"{det['class_name']}: {det['confidence']:.2f}")
```

Embeddings
Parameters for POST /v1/embeddings.
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
| input | string \| string[] | — | — | Text or array of texts to embed. |
| model | string | bge-large | — | Embedding model to use (bge-large, e5-large). |
| encoding_format | string | float | float, base64 | Format of the output embeddings. |
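Embedding vectors are typically compared with cosine similarity. Since the API returns plain float lists (with the default encoding_format of float), a dependency-free sketch is enough:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Parallel vectors score 1.0; orthogonal vectors score 0.0:
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # -> 0.0
```

In practice you would call this on response.data[i].embedding pairs, e.g. to rank documents against a query embedding.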
Example

```python
response = client.embeddings.create(
    model="bge-large",
    input=["Search query here", "Document text here"]
)
for item in response.data:
    print(f"Embedding dim: {len(item.embedding)}")
```

Parameter Discovery API
You can dynamically discover which parameters a model supports via our API. This is useful for building UIs that adapt to different model types automatically.
Response Format

```json
{
  "model_id": "deepseek-v3.2",
  "model_name": "DeepSeek V3.2",
  "category": "chat",
  "context_length": 128000,
  "parameters": {
    "temperature": {
      "type": "float",
      "min": 0,
      "max": 2,
      "default": 0.7,
      "step": 0.1,
      "section": "basic",
      "description": "Controls randomness in output generation"
    },
    "top_k": {
      "type": "int",
      "min": 0,
      "max": 500,
      "default": 0,
      "section": "advanced",
      "description": "Only sample from top K tokens (0 = disabled)"
    }
  }
}
```

Note: Parameters are grouped into basic and advanced sections. The basic section contains the most commonly used parameters, while advanced contains fine-tuning options for power users.
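This metadata lends itself to client-side validation. A sketch that clamps a user-supplied value into a parameter's declared range and snaps it to the advertised step (the clamping policy is a client-side choice, not API behavior; the spec dict mirrors the response format above):

```python
def clamp_param(value, spec):
    """Clamp value into [min, max] from a parameter spec and snap to its step."""
    value = max(spec["min"], min(spec["max"], value))
    step = spec.get("step")
    if step:
        # Snap to the nearest step; round again to shed float noise
        value = round(round(value / step) * step, 10)
    return int(value) if spec["type"] == "int" else value

temperature_spec = {"type": "float", "min": 0, "max": 2, "default": 0.7, "step": 0.1}
print(clamp_param(3.5, temperature_spec))   # clamped to the max -> 2.0
print(clamp_param(0.73, temperature_spec))  # snapped to the 0.1 step -> 0.7
```

The section field ("basic" or "advanced") can similarly drive which controls a UI shows by default.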
Need a parameter we don't support yet? Reach out to us at hello@oxlo.ai with details on the parameter and your use case. We actively review requests and add new parameters regularly.