# Embeddings

Convert text into vector representations for semantic search, RAG, and similarity matching.
**OpenAI Compatible:** Use the standard `openai` library; just set `base_url` to `https://api.oxlo.ai/v1`.

## Available Models
| Model | API ID | Dimensions | Best For |
|---|---|---|---|
| BGE-Large | bge-large | 1024 | Semantic search, RAG retrieval |
| E5-Large | e5-large | 1024 | Multilingual retrieval, global apps |
## Quick Example
Generate embeddings using the OpenAI SDK:
```python
import openai

client = openai.OpenAI(
    base_url="https://api.oxlo.ai/v1",
    api_key="<YOUR_API_KEY>"
)

response = client.embeddings.create(
    model="bge-large",
    input="Oxlo.ai is a distributed GPU network for AI inference."
)

embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")
print(f"First 5 values: {embedding[:5]}")
```
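To compare two embeddings directly (the similarity-matching use case mentioned above), cosine similarity is the usual metric. A minimal pure-Python sketch, shown on toy 3-dimensional vectors standing in for real 1024-dimensional API output:

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|); embeddings arrive as plain lists of floats
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]), 3))  # identical -> 1.0
print(round(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]), 3))  # orthogonal -> 0.0
```

Values near 1.0 indicate semantically similar texts; near 0.0, unrelated ones.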
## Batch Embeddings

Embed multiple texts in a single request:
```python
response = client.embeddings.create(
    model="bge-large",
    input=[
        "How to deploy a machine learning model",
        "Best practices for GPU optimization",
        "Introduction to vector databases"
    ]
)

for i, item in enumerate(response.data):
    print(f"Text {i}: {len(item.embedding)} dimensions")
```
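Embedding APIs generally cap the number of inputs accepted per request (the exact limit is not documented here; check before batching aggressively). A small helper can split a large corpus into request-sized chunks, where the chunk size of 4 below is purely illustrative:

```python
def batched(items, size):
    # Yield successive chunks of at most `size` items each
    for i in range(0, len(items), size):
        yield items[i:i + size]

texts = [f"document {n}" for n in range(10)]
for chunk in batched(texts, 4):
    # each chunk would be passed as `input=chunk` in its own request
    print(len(chunk))  # prints 4, 4, 2
```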
## RAG Example

Use embeddings for retrieval-augmented generation:
```python
import numpy as np

# 1. Embed your knowledge base
docs = [
    "Oxlo.ai supports text generation, image generation, and embeddings.",
    "Free tier includes Mistral-7B, Llama-3.2-3B, and Stable Diffusion 1.5.",
    "Premium users get access to Llama-3.3-70B and DeepSeek R1 70B.",
]
doc_response = client.embeddings.create(model="bge-large", input=docs)
doc_vectors = [np.array(d.embedding) for d in doc_response.data]

# 2. Embed the user's query
query = "What models are available for free?"
query_response = client.embeddings.create(model="bge-large", input=query)
query_vector = np.array(query_response.data[0].embedding)

# 3. Find the most similar document (cosine similarity)
similarities = [
    np.dot(query_vector, doc_vec) / (np.linalg.norm(query_vector) * np.linalg.norm(doc_vec))
    for doc_vec in doc_vectors
]
best_match = docs[np.argmax(similarities)]
print(f"Best match: {best_match}")
```
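For larger knowledge bases, the per-document similarity loop above can be collapsed into a single matrix operation. A sketch with toy 2-dimensional vectors in place of real 1024-dimensional embeddings:

```python
import numpy as np

def top_k(query_vec, doc_matrix, k=1):
    # Normalize everything, so cosine similarity reduces to a matrix-vector product
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(sims)[::-1][:k]  # indices of the k most similar rows

docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([1.0, 0.1])
print(top_k(query, docs, k=2))  # most similar document indices first
```

Stacking `doc_vectors` with `np.vstack` would feed the RAG example's output straight into this helper.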
## Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | No | Embedding model (default: `bge-large`) |
| input | string \| string[] | Yes | Text(s) to embed: a single string or an array of strings |
| encoding_format | string | No | Output encoding (default: `float`) |
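In the OpenAI convention that this API follows, `encoding_format="base64"` returns each embedding as base64-encoded little-endian float32 values rather than a JSON float array (whether this endpoint supports it is an assumption; the `float` default is always safe). A decoding sketch, round-tripping a toy vector since it presumes that wire format:

```python
import base64
import struct

def decode_embedding(b64_payload):
    # Assumes OpenAI-style base64 payload: packed little-endian float32 values
    raw = base64.b64decode(b64_payload)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip a toy 3-value vector to illustrate the layout
payload = base64.b64encode(struct.pack("<3f", 0.25, -0.5, 2.0)).decode()
print(decode_embedding(payload))  # prints [0.25, -0.5, 2.0]
```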