MN Bots Chat AI API
AI chat API with streaming support, multi-model routing, and rate limiting.
Available endpoints: /chat · /models · /health
POST /chat
Sends a conversation to the AI backend and returns a reply. Supports single-turn questions, multi-turn conversation history, custom system prompts, and real-time token streaming.
Request body — Content-Type: application/json
| Field | Type | Default | Description |
|---|---|---|---|
| messages (required) | array | — | Ordered list of conversation turns. Each item must have "role" and "content". Roles: system (optional, first only), user, assistant. Minimum: one user message. |
| model (optional) | string | env DEFAULT_MODEL | Model ID or alias. See the /models endpoint for the full list. Accepts: openai/gpt-oss-120b, openai/gpt-oss-20b, llama-3.3-70b, gpt-4o-mini (alias → default). |
| stream (optional) | boolean | false | Set to true to receive the reply as a Server-Sent Events stream (text/event-stream). The stream follows the OpenAI SSE format and ends with data: [DONE]. Set to false (default) for a standard JSON response. |
Response — non-stream (stream: false)
{
"response": "The assistant's reply text here."
}
On error the response contains an error.message field and an appropriate HTTP status code (400, 429, 500, 502).
Response — stream (stream: true)
Content-Type is text/event-stream. Each line is a Server-Sent Event in OpenAI delta format:
data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},...}]}
data: {"id":"...","choices":[{"delta":{"content":" world"},...}]}
data: [DONE]
Model aliases
| Alias you send | Resolves to |
|---|---|
| gpt-4o-mini | Value of env DEFAULT_MODEL |
| openai/gpt-oss-120b | openai/gpt-oss-120b (unchanged) |
| openai/gpt-oss-20b | openai/gpt-oss-20b (unchanged) |
| llama-3.3-70b | llama-3.3-70b-versatile |
| (any unrecognised string) | Value of env DEFAULT_MODEL |
Rate limiting
Each unique IP address is limited to 20 requests per 60-second window. Exceeding this returns HTTP 429 with {"error":{"message":"Rate limit exceeded"}}.
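A client can recover from a 429 by waiting and retrying. A minimal sketch in Python (the retry count and backoff schedule here are suggestions, not part of the API):

```python
import time

import requests

URL = "https://mn-chat-bot-api.vercel.app/chat"

def chat_with_retry(messages, max_retries=3):
    """POST to /chat, retrying with exponential backoff on HTTP 429."""
    for attempt in range(max_retries + 1):
        r = requests.post(URL, json={"messages": messages})
        if r.status_code != 429:
            return r.json()
        # The limit window is 60 s per IP; back off 5 s, 10 s, 20 s, ...
        time.sleep(5 * 2 ** attempt)
    raise RuntimeError("still rate-limited after retries")
```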
Examples
cURL — simple single-turn question
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "user", "content": "What is the capital of France?" }
]
}'
cURL — custom model
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-oss-120b",
"messages": [
{ "role": "user", "content": "Summarise the French Revolution in 3 bullet points." }
]
}'
cURL — system prompt + multi-turn conversation
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3.3-70b",
"messages": [
{ "role": "system", "content": "You are a pirate who only speaks in nautical metaphors." },
{ "role": "user", "content": "How do I sort a list in Python?" },
{ "role": "assistant", "content": "Arr, to sort yer list ye must call list.sort(), as sure as the tide!" },
{ "role": "user", "content": "What about descending order?" }
]
}'
cURL — streaming response (SSE)
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
"stream": true,
"messages": [
{ "role": "user", "content": "Write a short poem about the sea." }
]
}'
JavaScript (fetch) — non-stream
const res = await fetch('/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
model: 'openai/gpt-oss-120b',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Explain async/await in JavaScript.' }
]
})
});
const data = await res.json();
console.log(data.response);
JavaScript (fetch) — real-time streaming
const res = await fetch('/chat', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
stream: true,
messages: [{ role: 'user', content: 'Tell me a long story about a robot.' }]
})
});
const reader = res.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
  // { stream: true } keeps multi-byte characters split across chunks intact
  const lines = decoder.decode(value, { stream: true }).split('\n');
for (const line of lines) {
if (!line.startsWith('data: ')) continue;
const json = line.slice(6).trim();
if (json === '[DONE]') { console.log('\n[stream ended]'); break; }
try {
const chunk = JSON.parse(json);
const token = chunk.choices?.[0]?.delta?.content ?? '';
process.stdout.write(token); // or append to DOM
} catch (_) {}
}
}
Python (requests) — non-stream
import requests
r = requests.post('https://mn-chat-bot-api.vercel.app/chat', json={
'model': 'openai/gpt-oss-120b',
'messages': [
{'role': 'system', 'content': 'You are a concise technical writer.'},
{'role': 'user', 'content': 'What is a REST API?'}
]
})
print(r.json()['response'])
Python (httpx) — streaming
import httpx, json
with httpx.stream('POST', 'https://mn-chat-bot-api.vercel.app/chat', json={
'stream': True,
'messages': [{'role': 'user', 'content': 'Count from 1 to 20 slowly.'}]
}) as r:
for line in r.iter_lines():
if not line.startswith('data: '):
continue
payload = line[6:].strip()
if payload == '[DONE]':
break
delta = json.loads(payload)
token = delta['choices'][0]['delta'].get('content', '')
print(token, end='', flush=True)
PHP (curl) — non-stream
$ch = curl_init('https://mn-chat-bot-api.vercel.app/chat');
curl_setopt_array($ch, [
CURLOPT_POST => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HTTPHEADER => ['Content-Type: application/json'],
CURLOPT_POSTFIELDS => json_encode([
'model' => 'openai/gpt-oss-120b',
'messages' => [
['role' => 'user', 'content' => 'What is PHP used for?']
],
]),
]);
$raw = curl_exec($ch);
$data = json_decode($raw, true);
echo $data['response'];
Node.js (https) — non-stream
const https = require('https');
const body = JSON.stringify({
model: 'llama-3.3-70b',
messages: [{ role: 'user', content: 'What is Node.js?' }]
});
const req = https.request({
hostname: 'mn-chat-bot-api.vercel.app',
path: '/chat',
method: 'POST',
headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) }
}, res => {
let data = '';
res.on('data', c => data += c);
res.on('end', () => console.log(JSON.parse(data).response));
});
req.write(body);
req.end();
Error responses
| HTTP Status | Cause | Body |
|---|---|---|
| 400 | Missing or invalid messages, malformed JSON | {"error":{"message":"..."}} |
| 429 | Rate limit exceeded (20 req / 60 s per IP) | {"error":{"message":"Rate limit exceeded"}} |
| 500 | API key not configured on server | {"error":{"message":"..."}} |
| 502 | AI backend unreachable or returned invalid data | {"error":{"message":"..."}} |
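All four failure modes share the same body shape, so a client can handle them uniformly. A small helper sketch (the function name extract_reply is ours, not part of the API):

```python
def extract_reply(resp):
    """Return the reply text from a /chat response object (e.g. from
    requests.post), or raise with the server's error message on a
    400/429/500/502 status."""
    body = resp.json()
    if resp.ok:
        return body["response"]
    raise RuntimeError(f"HTTP {resp.status_code}: {body['error']['message']}")
```

Usage: `print(extract_reply(requests.post(url, json=payload)))`.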
Other Endpoints
GET /models
Returns the configured default model and all available model aliases.
GET /models
→ {
"default_model": "openai/gpt-oss-120b",
"aliases": {
"gpt-4o-mini": "openai/gpt-oss-120b",
"openai/gpt-oss-120b": "openai/gpt-oss-120b",
"openai/gpt-oss-20b": "openai/gpt-oss-20b",
"llama-3.3-70b": "llama-3.3-70b-versatile"
}
}
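Rather than hard-coding a model ID, a client can resolve aliases at runtime from this endpoint. A sketch (resolve_model is a hypothetical helper, not part of the API):

```python
import requests

BASE = "https://mn-chat-bot-api.vercel.app"

def resolve_model(alias):
    """Resolve an alias to a concrete model ID via GET /models.
    Unrecognised aliases fall back to the default model, matching
    the alias table for /chat."""
    info = requests.get(f"{BASE}/models").json()
    return info["aliases"].get(alias, info["default_model"])
```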
GET /health
Liveness probe. Returns 200 OK when the server is up.
GET /health
→ {
"status": "ok",
"runtime": "php",
"timestamp": "2025-01-01T00:00:00+00:00"
}
Credits
This API was created by MN TG (aka Musammil N.).
- GitHub: github.com/mntgxo
- GitHub Organization: github.com/mnbots
- Telegram: t.me/mntgxo
- Contact on Telegram: t.me/mrmntg
- Support group: t.me/mnbots_support
- Update channel: t.me/mnbots