# Text Generation
Generate human-like text using state-of-the-art LLMs via the OpenAI-compatible Chat Completions API.
> **OpenAI Compatible:** Use the standard `openai` Python library; just set `base_url` to `https://api.oxlo.ai/v1`. Any existing OpenAI code works without changes.

## Available Models
### Chat Models
| Model | API ID | Tier | Best For |
|---|---|---|---|
| Llama-3.2-3B | llama-3.2-3b | Free | Fast responses, edge deployment |
| Mistral-7B | mistral-7b | Free | Chat, summaries, basic reasoning |
| DeepSeek V3.2 | deepseek-v3.2 | Free | General-purpose chat, analysis |
| Llama-3.1-8B | llama-3.1-8b | Pro | Instruction following, reasoning |
| Qwen 2.5 7B | qwen-2.5-7b | Pro | Multilingual, general reasoning |
| Ministral-14B | ministral-14b | Pro | Multilingual, vision-capable |
| Llama-4-Maverick-17B | llama-4-maverick-17b | Pro | Versatile MoE, diverse tasks |
| DeepSeek V3 0324 | deepseek-v3-0324 | Pro | Enhanced general-purpose chat |
| Kimi K2.5 | kimi-k2.5 | Premium | Balanced reasoning, vision |
| Qwen 3 32B | qwen-3-32b | Premium | Advanced reasoning, enterprise |
| Llama-3.3-70B | llama-3.3-70b | Premium | Long context, instruction following |
| GPT-OSS 120B | gpt-oss-120b | Premium | Complex reasoning, long context |
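Because the API is OpenAI-compatible, you can check which of these models your key can reach by querying the standard `/v1/models` endpoint through the same client. This is a sketch that assumes the endpoint is exposed here as it is in the OpenAI API; `OXLO_API_KEY` is a hypothetical environment-variable name.

```python
import os

def list_model_ids(client) -> list:
    """Return the IDs of all models the API key can access, sorted."""
    return sorted(model.id for model in client.models.list())

# Only hit the network when credentials are actually configured.
if os.environ.get("OXLO_API_KEY"):
    import openai  # pip install openai

    client = openai.OpenAI(
        base_url="https://api.oxlo.ai/v1",
        api_key=os.environ["OXLO_API_KEY"],
    )
    print(list_model_ids(client))
```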
### Reasoning Models
| Model | API ID | Tier | Best For |
|---|---|---|---|
| DeepSeek R1 8B | deepseek-r1-8b | Free | Math, science, chain-of-thought |
| DeepSeek R1 70B | deepseek-r1-70b | Pro | Advanced reasoning, analysis |
| GPT-OSS 20B | gpt-oss-20b | Pro | Agentic workflows, complex tasks |
| Kimi-K2-Thinking | kimi-k2-thinking | Premium | Deep reasoning, long-form analysis |
| DeepSeek-R1-0528 | deepseek-r1-0528 | Premium | Frontier-class reasoning |
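R1-style reasoning models typically interleave their chain of thought with the final answer, commonly wrapped in `<think>…</think>` tags inside the reply (an assumption to verify against actual responses; some deployments return it in a separate response field instead). A small helper, not part of the API, to separate the two:

```python
import re

def split_reasoning(reply: str) -> tuple:
    """Split a reasoning model's reply into (chain_of_thought, final_answer).

    Assumes R1-style output where the chain of thought is wrapped in
    <think>...</think>; returns an empty chain of thought otherwise.
    """
    match = re.search(r"<think>(.*?)</think>", reply, re.DOTALL)
    if match is None:
        return "", reply.strip()
    # Everything outside the <think> block is the user-facing answer.
    answer = (reply[:match.start()] + reply[match.end():]).strip()
    return match.group(1).strip(), answer
```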
### Coding Models
| Model | API ID | Tier | Best For |
|---|---|---|---|
| DeepSeek Coder 33B | deepseek-coder-33b | Pro | Code generation, refactoring |
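Coding models usually wrap generated code in markdown fences inside the reply. A small client-side helper (a sketch, not part of the API) to pull the code out before saving or executing it:

```python
import re

def extract_code(reply: str) -> str:
    """Return the contents of the first fenced code block in a model reply,
    or the whole reply stripped if no fence is present."""
    match = re.search(r"```(?:[\w+-]*)\n(.*?)```", reply, re.DOTALL)
    return match.group(1).rstrip() if match else reply.strip()
```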
## Quick Example
Chat with any model using the OpenAI-compatible API:
```python
import openai

client = openai.OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="<YOUR_API_KEY>",
)

response = client.chat.completions.create(
    model="deepseek-r1-8b",
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=512,
)

print(response.choices[0].message.content)
```

## Multi-Turn Conversation
Pass the conversation history in the `messages` array:
```python
response = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function to reverse a string"},
        {"role": "assistant", "content": "def reverse(s): return s[::-1]"},
        {"role": "user", "content": "Now make it handle Unicode correctly"},
    ],
    max_tokens=1024,
    temperature=0.3,
)
```

## Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | — | Model ID (required) |
| messages | array | — | Conversation messages (required) |
| max_tokens | integer | 256 | Max tokens to generate (1–131,072) |
| temperature | float | 0.7 | Randomness (0.0 = deterministic, 2.0 = creative) |
| top_p | float | 1.0 | Nucleus sampling threshold |
| frequency_penalty | float | 0 | Penalize repeated tokens (-2.0 to 2.0) |
| presence_penalty | float | 0 | Penalize tokens already present (-2.0 to 2.0) |
| stop | string[] | null | Up to 4 stop sequences |
| seed | integer | null | For reproducible results |
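The ranges in the table can be enforced client-side before a request goes out, so bad values fail fast instead of round-tripping to the server. A minimal sketch (the helper name `build_chat_params` is hypothetical; defaults mirror the table above):

```python
def build_chat_params(model, messages, *, max_tokens=256, temperature=0.7,
                      top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0,
                      stop=None, seed=None):
    """Assemble kwargs for chat.completions.create, enforcing the
    documented parameter ranges."""
    if not 1 <= max_tokens <= 131_072:
        raise ValueError("max_tokens must be between 1 and 131,072")
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not -2.0 <= frequency_penalty <= 2.0:
        raise ValueError("frequency_penalty must be between -2.0 and 2.0")
    if not -2.0 <= presence_penalty <= 2.0:
        raise ValueError("presence_penalty must be between -2.0 and 2.0")
    if stop is not None and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    params = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }
    # Optional fields are omitted rather than sent as null.
    if stop is not None:
        params["stop"] = stop
    if seed is not None:
        params["seed"] = seed
    return params
```

Use it by unpacking into the client call: `client.chat.completions.create(**build_chat_params("mistral-7b", messages, seed=42))`.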