API Documentation
Complete reference for integrating with alAPI's LLM and OCR services
Base URL: https://dev.alapi.deep.sa/v1
OpenAI Compatible: Use the official OpenAI SDK with our base URL as a drop-in replacement for existing applications.
LLM API
OpenAI-compatible API for chat completions and embeddings. Use your favorite models through a unified interface.
Authentication
All API requests require authentication using a Bearer token in the Authorization header.
Request Header:
Authorization: Bearer YOUR_API_KEY
API Key: Generate API keys from your Dashboard.
SDK Setup Example:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dev.alapi.deep.sa/v1"
)
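For clients that call the API without the SDK, the Authorization header can be built by hand. The following is a minimal sketch, assuming the key is kept in an environment variable; the variable name `ALAPI_API_KEY` and the helper `build_auth_headers` are hypothetical and not part of alAPI.

```python
import os


def build_auth_headers(api_key=None):
    """Return the Bearer-token header required by every request.

    Falls back to the ALAPI_API_KEY environment variable (a hypothetical
    name chosen for this sketch) when no key is passed explicitly.
    """
    key = api_key or os.environ.get("ALAPI_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {"Authorization": f"Bearer {key}"}
```

The returned dictionary can be passed as `headers=` to an HTTP client such as `requests`.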
Chat Completions
Create Chat Completion
Creates a model response for the given chat conversation
Endpoint:
https://dev.alapi.deep.sa/v1/chat/completions
Request Body:
{
  "model": "llama-3.3-70b-versatile",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 1024,
  "stream": false
}
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | If true, returns a stream of events |
| top_p | number | No | Nucleus sampling parameter. Default: 1 |
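The parameters above can be assembled into a request body programmatically. This is a sketch only: `build_chat_payload` is a hypothetical helper, and the client-side range check simply mirrors the 0-2 temperature range stated in the table rather than any documented server behavior.

```python
def build_chat_payload(model, messages, temperature=None, max_tokens=None,
                       top_p=None, stream=False):
    """Assemble a chat completion request body from the documented parameters."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    payload = {"model": model, "messages": messages, "stream": stream}
    # Optional parameters are sent only when set, so server defaults apply otherwise.
    optional = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

The resulting dictionary can be sent as the JSON body of a request to the chat completions endpoint.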
Response (200 OK):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1706745000,
  "model": "llama-3.3-70b-versatile",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}
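When the endpoint is called over plain HTTP rather than through the SDK, the reply text and token count have to be read from the JSON above. A minimal sketch, where `extract_reply` is a hypothetical helper and the response is assumed to have already been parsed into a dictionary:

```python
def extract_reply(completion):
    """Return (assistant_text, total_tokens) from a parsed chat completion."""
    choice = completion["choices"][0]
    text = choice["message"]["content"]
    # usage may be absent in some responses, so read it defensively
    total_tokens = completion.get("usage", {}).get("total_tokens")
    return text, total_tokens
```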
Code Examples:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dev.alapi.deep.sa/v1"
)

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of Saudi Arabia?"}
    ],
    temperature=0.7,
    max_tokens=1024
)

print(response.choices[0].message.content)
Streaming Responses
Server-Sent Events (SSE)
Stream responses token by token for real-time output
How it works: Set stream: true in your request. The response will be sent as Server-Sent Events, with each chunk containing a delta of the response content.
Code Examples:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dev.alapi.deep.sa/v1"
)

stream = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "user", "content": "Write a short poem about coding"}
    ],
    stream=True
)

# Print each content delta as it arrives
for chunk in stream:
    # Skip chunks that carry no choices or no text (e.g. role-only deltas)
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
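For clients that read the event stream directly instead of using the SDK, the delta-per-chunk format described above can be parsed as sketched here. It assumes the stream follows the OpenAI convention of `data: {json}` lines terminated by a `data: [DONE]` sentinel; that sentinel is an assumption, since this reference does not show the raw wire format. `iter_sse_deltas` is a hypothetical helper.

```python
import json


def iter_sse_deltas(lines):
    """Yield content deltas from an iterable of decoded SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel (OpenAI convention)
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = choices[0].get("delta", {}).get("content")
        if content:
            yield content
```

With `requests`, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`.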
Embeddings
Create Embeddings
Creates an embedding vector representing the input text
Endpoint:
https://dev.alapi.deep.sa/v1/embeddings
Request Body:
{
  "model": "text-embedding-3-small",
  "input": "The quick brown fox jumps over the lazy dog"
}
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the embedding model to use |
| input | string or array | Yes | Text to embed. Can be a string or array of strings |
| encoding_format | string | No | Format for the embeddings: 'float' or 'base64'. Default: float |
| dimensions | integer | No | Number of dimensions for the output embeddings (model-dependent) |
Response (200 OK):
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023064255, -0.009327292, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
Code Examples:
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://dev.alapi.deep.sa/v1"
)

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
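Embedding vectors are typically compared with cosine similarity. The sketch below is one common way to do that in plain Python; it is not something this reference prescribes, and `cosine_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Two embeddings returned by the endpoint, for example `response.data[0].embedding` from two separate requests, can be passed in directly as lists of floats.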
Available Models
The following models are currently available through alAPI. Use the model name in your API requests.
| Model Name | Provider | Avg Latency |
|---|---|---|
| deep-sa/alEmbedding | deepcloud | - |
| deep-sa/alLLM | deepcloud | ~6710.25ms |
| google/gemini-2.5-flash | google_gemini | - |
| google/gemini-2.5-flash-lite | google_gemini | - |
| google/gemini-3-flash | google_gemini | ~10707ms |
| google/gemini-3-pro | google_gemini | - |
| gpt-oss-120b | groq | - |
| gpt-oss-20b | groq | - |
| llama-3.3-70b | groq | - |
| llama-4-maverick-17b | groq | - |
| opanai/gpt-5-mini | openai | - |
| qwen3-32b | groq | - |
OCR API
Extract text from documents with advanced Arabic and English OCR. Supports PDFs and images with automatic deskewing and layout detection.
Supported file formats: PDFs and images. Arabic and English text is extracted with automatic language detection.
Upload Document
Upload File for OCR
Upload a document and start OCR processing
Endpoint:
https://dev.alapi.deep.sa/v1/ocr/upload
Request Format:
Content-Type: multipart/form-data
| Name | Type | Description |
|---|---|---|
| file | file | Document file (PDF) |
| consent_research | boolean | Consent to use data for research (default: false) |
Response (200 OK):
{
  "token": "abc123xyz...",
  "status": "queued",
  "progress": 0,
  "upload_progress": 100,
  "queue_position": 1
}
Code Examples:
import requests

url = "https://dev.alapi.deep.sa/v1/ocr/upload"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# Keep the file open for the duration of the upload
with open("document.pdf", "rb") as f:
    files = {"file": f}
    data = {"consent_research": "false"}
    response = requests.post(url, headers=headers, files=files, data=data)

response.raise_for_status()  # Fail early on HTTP errors
result = response.json()
token = result["token"]
print(f"Job started, token: {token}")
Job Status
Get Job Status
Retrieve the current status and results of an OCR job
Endpoint:
https://dev.alapi.deep.sa/v1/ocr/jobs/{token}
Status Values:
| Status | Description |
|---|---|
| queued | Job is waiting in the processing queue |
| processing | Document is being processed |
| done | Processing complete. Results available in the pages array |
| error | Processing failed. Check the error field for details |
Response (when status is done):
{
  "filename": "document.pdf",
  "status": "done",
  "progress": 100,
  "total_pages": 5,
  "pages": [
    {
      "page_num": 1,
      "text": "Extracted text from page 1...",
      "num_segments": 12,
      "elapsed": 2.45
    }
  ],
  "expires_at": "2026-02-05T18:30:00Z"
}
Complete OCR Flow:
import requests
import time

API_BASE = "https://dev.alapi.deep.sa/v1"
API_KEY = "YOUR_API_KEY"
headers = {"Authorization": f"Bearer {API_KEY}"}

# 1. Upload document
with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{API_BASE}/ocr/upload",
        headers=headers,
        files={"file": f},
        data={"consent_research": "false"}
    )
response.raise_for_status()  # Fail early on HTTP errors
result = response.json()
token = result["token"]

# 2. Poll for completion
while True:
    status_response = requests.get(
        f"{API_BASE}/ocr/jobs/{token}",
        headers=headers
    )
    status_response.raise_for_status()
    job = status_response.json()
    print(f"Status: {job['status']}, Progress: {job.get('progress', 0)}%")

    if job["status"] == "done":
        # 3. Extract text from all pages
        full_text = "\n\n".join(page["text"] for page in job["pages"])
        print("Extracted text:", full_text)
        break
    elif job["status"] == "error":
        print("Error:", job.get("error"))
        break

    time.sleep(2)  # Poll every 2 seconds
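The polling loop above runs until the job reaches `done` or `error`. A bounded variant is sketched below, assuming an overall timeout is wanted; the 300-second default is illustrative and not an alAPI limit. `wait_for_job` is a hypothetical helper that takes any callable returning the job dictionary, so the HTTP call stays outside it.

```python
import time


def wait_for_job(fetch_status, timeout=300.0, interval=2.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the job is 'done'; raise on error or timeout.

    fetch_status is any zero-argument callable returning the job dict, for
    example a wrapper around the /ocr/jobs/{token} request shown above.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "done":
            return job
        if status == "error":
            raise RuntimeError(f"OCR job failed: {job.get('error')}")
        if clock() >= deadline:
            raise TimeoutError(f"OCR job still '{status}' after {timeout} seconds")
        sleep(interval)
```

A call might look like `wait_for_job(lambda: requests.get(f"{API_BASE}/ocr/jobs/{token}", headers=headers).json())`. Passing `sleep` and `clock` as parameters keeps the helper easy to test without real delays.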
Get Single Page
Get Page by Number
Retrieve OCR text and thumbnail for a single page
Endpoint:
https://dev.alapi.deep.sa/v1/ocr/jobs/{token}/{page_num}
Request Format:
| Name | Type | Description |
|---|---|---|
| token | string | Access token received from upload |
| page_num | integer | Page number (1-indexed) |
Response (200 OK):
{
  "page_num": 1,
  "text": "Extracted text from this page...",
  "thumbnail": {
    "url": "https://objectstorage.me-jeddah-1.oraclecloud.com/..."
  },
  "status": "done",
  "num_segments": 12,
  "elapsed": 2.34
}
Code Examples:
import requests

API_BASE = "https://dev.alapi.deep.sa/v1"
API_KEY = "YOUR_API_KEY"
headers = {"Authorization": f"Bearer {API_KEY}"}

token = "YOUR_TOKEN_HERE"
page_num = 3

response = requests.get(
    f"{API_BASE}/ocr/jobs/{token}/{page_num}",
    headers=headers
)
page = response.json()

print(f"Page {page['page_num']}: {page['status']}")
print(page["text"])
Error Handling:
| Status | Description |
|---|---|
| 400 | page_num must be >= 1 |
| 404 | Job or page not found |
| 410 | Job has expired |
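The status codes above can be turned into exceptions before the response body is read. This is a sketch under the assumption that only these three codes need specific messages; `OCRPageError` and `raise_for_page_status` are hypothetical names.

```python
# Messages taken from the status table above
PAGE_ERRORS = {
    400: "page_num must be >= 1",
    404: "Job or page not found",
    410: "Job has expired",
}


class OCRPageError(Exception):
    """Raised for a non-200 response from the single-page endpoint."""

    def __init__(self, status_code, message):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code


def raise_for_page_status(status_code):
    """Return silently on 200; raise OCRPageError for anything else."""
    if status_code == 200:
        return
    message = PAGE_ERRORS.get(status_code, "Unexpected status")
    raise OCRPageError(status_code, message)
```

In the example above, `raise_for_page_status(response.status_code)` would be called before `response.json()`.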
Thumbnails
Get Page Thumbnails
Retrieve page thumbnails with pagination
Endpoint:
https://dev.alapi.deep.sa/v1/ocr/thumbnails/{token}?start=0&limit=10
Response:
{
  "thumbnails": [
    {
      "page_num": 1,
      "data": "data:image/png;base64,iVBORw0KGgo..."
    }
  ],
  "start": 0,
  "end": 10,
  "total_pages": 25,
  "has_more": true
}
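The response above suggests a simple pagination loop driven by `has_more`. The sketch below assumes that the next request should use the previous `end` value as its `start`, which this reference does not state explicitly. `decode_data_uri` and `iter_thumbnails` are hypothetical helpers, and `fetch_page` is any callable that performs the request and returns the parsed response.

```python
import base64


def decode_data_uri(data_uri):
    """Split a 'data:<mime>;base64,<payload>' URI into (mime, raw_bytes)."""
    header, _, payload = data_uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64 data URI")
    mime = header[len("data:"):-len(";base64")]
    return mime, base64.b64decode(payload)


def iter_thumbnails(fetch_page, limit=10):
    """Yield (page_num, mime, image_bytes) for every thumbnail of a job.

    fetch_page(start, limit) must return the parsed thumbnails response.
    """
    start = 0
    while True:
        page = fetch_page(start, limit)
        for thumb in page["thumbnails"]:
            mime, raw = decode_data_uri(thumb["data"])
            yield thumb["page_num"], mime, raw
        next_start = page.get("end", start)
        # Stop when the server reports no more pages or the window did not advance
        if not page.get("has_more") or next_start <= start:
            break
        start = next_start  # assumption: 'end' is the next 'start'
```

Each yielded `image_bytes` value can be written to a file opened in binary mode.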
Error Handling
The API uses standard HTTP status codes to indicate success or failure of requests.
Error Response Format:
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
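An error body in this format can be wrapped in an exception so callers can branch on `code` or `type`. A minimal sketch: `AlapiError` and `error_from_body` are hypothetical names, and the fallback for bodies that do not match the format is an assumption.

```python
class AlapiError(Exception):
    """Carries the fields of the error object shown above."""

    def __init__(self, status_code, message, error_type=None, code=None):
        label = code or error_type or "error"
        super().__init__(f"{status_code} {label}: {message}")
        self.status_code = status_code
        self.message = message
        self.error_type = error_type
        self.code = code


def error_from_body(status_code, body):
    """Build an AlapiError from a parsed error response body."""
    err = body.get("error") if isinstance(body, dict) else None
    if not isinstance(err, dict):
        # Body did not follow the documented format
        return AlapiError(status_code, "Unrecognized error response")
    return AlapiError(
        status_code,
        err.get("message", ""),
        error_type=err.get("type"),
        code=err.get("code"),
    )
```

A caller would typically raise the returned exception whenever the HTTP status indicates failure.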