🤖 MN Bots Chat AI API

AI chat API with streaming support, multi-model routing, and rate limiting.

Available endpoints: /chat · /models · /health

🧪 Chat Playground

Interact with the /chat endpoint directly from your browser.

POST /chat · Rate limit: 20 req / 60 s · JSON & SSE streaming

Sends a conversation to the AI backend and returns a reply. Supports single-turn questions, multi-turn conversation history, custom system prompts, and real-time token streaming.


Request body — Content-Type: application/json

messages   required · array
    Ordered list of conversation turns. Each item must have "role" and
    "content". Roles: system (optional, first only), user, assistant.
    Minimum: one user message.

model      optional · string · default: env DEFAULT_MODEL
    Model ID or alias. See the /models endpoint for the full list.
    Accepts: openai/gpt-oss-120b, openai/gpt-oss-20b, llama-3.3-70b,
    gpt-4o-mini (alias → default).

stream     optional · boolean · default: false
    Set to true to receive the reply as a Server-Sent Events stream
    (text/event-stream). The stream follows the OpenAI SSE format and
    ends with data: [DONE]. Set to false (default) for a standard JSON
    response.
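The field rules above can be checked on the client before sending. The sketch below is illustrative only (the `validate_chat_body` helper is not part of this API; the server performs its own validation and returns 400 on invalid input):

```python
# Minimal client-side check of a /chat request body, following the
# field rules in the table above. Sketch only — the server's own
# validation is authoritative.
def validate_chat_body(body):
    msgs = body.get("messages")
    if not isinstance(msgs, list) or not msgs:
        return False
    for i, m in enumerate(msgs):
        if "role" not in m or "content" not in m:
            return False
        if m["role"] not in ("system", "user", "assistant"):
            return False
        if m["role"] == "system" and i != 0:
            return False  # a system message is only allowed first
    # At least one user message is required.
    return any(m["role"] == "user" for m in msgs)
```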

Response — non-stream (stream: false)

{
  "response": "The assistant's reply text here."
}

On error the response contains an error.message field and an appropriate HTTP status code (400, 429, 500, 502).

Response — stream (stream: true)

Content-Type is text/event-stream. Each line is a Server-Sent Event in OpenAI delta format:

data: {"id":"...","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},...}]}
data: {"id":"...","choices":[{"delta":{"content":" world"},...}]}
data: [DONE]
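Putting the chunk format above together: each data line carries a partial token in choices[0].delta.content, and concatenating the deltas until [DONE] yields the full reply. A minimal parser sketch (the `collect_sse` name is ours, not part of the API):

```python
import json

# Reassemble the streamed reply from SSE lines in the OpenAI delta
# format shown above. Sketch; field names follow the sample chunks.
def collect_sse(lines):
    out = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines / comments
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        out.append(delta.get("content", ""))
    return "".join(out)
```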

Model aliases

Alias you send               Resolves to
gpt-4o-mini                  value of env DEFAULT_MODEL
openai/gpt-oss-120b          openai/gpt-oss-120b (unchanged)
openai/gpt-oss-20b           openai/gpt-oss-20b (unchanged)
llama-3.3-70b                llama-3.3-70b-versatile
(any unrecognised string)    value of env DEFAULT_MODEL
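The lookup described by this table amounts to a dictionary get with a fallback. A sketch (the server's actual implementation may differ; `resolve_model` is an illustrative name):

```python
import os

# Alias resolution as described by the table above. DEFAULT_MODEL
# mirrors the server's env variable; the fallback here is illustrative.
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "openai/gpt-oss-120b")

ALIASES = {
    "gpt-4o-mini": DEFAULT_MODEL,
    "openai/gpt-oss-120b": "openai/gpt-oss-120b",
    "openai/gpt-oss-20b": "openai/gpt-oss-20b",
    "llama-3.3-70b": "llama-3.3-70b-versatile",
}

def resolve_model(name):
    # Any unrecognised string falls back to the default model.
    return ALIASES.get(name, DEFAULT_MODEL)
```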

Rate limiting

Each unique IP address is limited to 20 requests per 60-second window. Exceeding this returns HTTP 429 with {"error":{"message":"Rate limit exceeded"}}.
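A per-IP sliding-window policy like this is commonly implemented by keeping recent request timestamps and dropping those older than the window. The sketch below matches the documented numbers (20 / 60 s) but is not the server's actual implementation:

```python
import time
from collections import defaultdict, deque

# Sliding-window limiter matching the documented policy: 20 requests
# per 60-second window per IP. Sketch only — not the server's code.
LIMIT, WINDOW = 20, 60.0

_hits = defaultdict(deque)

def allow(ip, now=None):
    now = time.monotonic() if now is None else now
    q = _hits[ip]
    while q and now - q[0] >= WINDOW:
        q.popleft()  # drop hits that fell out of the window
    if len(q) >= LIMIT:
        return False  # would be answered with HTTP 429
    q.append(now)
    return True
```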


Examples

cURL โ€” simple single-turn question
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
cURL โ€” custom model
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-oss-120b",
    "messages": [
      { "role": "user", "content": "Summarise the French Revolution in 3 bullet points." }
    ]
  }'
cURL โ€” system prompt + multi-turn conversation
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [
      { "role": "system",    "content": "You are a pirate who only speaks in nautical metaphors." },
      { "role": "user",      "content": "How do I sort a list in Python?" },
      { "role": "assistant", "content": "Arr, to sort yer list ye must call list.sort(), as sure as the tide!" },
      { "role": "user",      "content": "What about descending order?" }
    ]
  }'
cURL โ€” streaming response (SSE)
curl -X POST https://mn-chat-bot-api.vercel.app/chat \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "stream": true,
    "messages": [
      { "role": "user", "content": "Write a short poem about the sea." }
    ]
  }'
JavaScript (fetch) โ€” non-stream
const res = await fetch('/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'openai/gpt-oss-120b',
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user',   content: 'Explain async/await in JavaScript.' }
    ]
  })
});
const data = await res.json();
console.log(data.response);
JavaScript (fetch) โ€” real-time streaming
const res = await fetch('/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    stream: true,
    messages: [{ role: 'user', content: 'Tell me a long story about a robot.' }]
  })
});

const reader = res.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

outer: while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  // { stream: true } keeps multi-byte characters split across chunks intact
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep any partial line for the next chunk

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const json = line.slice(6).trim();
    if (json === '[DONE]') { console.log('\n[stream ended]'); break outer; }
    try {
      const chunk = JSON.parse(json);
      const token = chunk.choices?.[0]?.delta?.content ?? '';
      process.stdout.write(token); // or append to DOM
    } catch (_) {}
  }
}
Python (requests) โ€” non-stream
import requests

r = requests.post('https://mn-chat-bot-api.vercel.app/chat', json={
    'model': 'openai/gpt-oss-120b',
    'messages': [
        {'role': 'system',  'content': 'You are a concise technical writer.'},
        {'role': 'user',    'content': 'What is a REST API?'}
    ]
})
print(r.json()['response'])
Python (httpx) โ€” streaming
import httpx, json

with httpx.stream('POST', 'https://mn-chat-bot-api.vercel.app/chat', json={
    'stream': True,
    'messages': [{'role': 'user', 'content': 'Count from 1 to 20 slowly.'}]
}) as r:
    for line in r.iter_lines():
        if not line.startswith('data: '):
            continue
        payload = line[6:].strip()
        if payload == '[DONE]':
            break
        delta = json.loads(payload)
        token = delta['choices'][0]['delta'].get('content', '')
        print(token, end='', flush=True)
PHP (curl) โ€” non-stream
$ch = curl_init('https://mn-chat-bot-api.vercel.app/chat');
curl_setopt_array($ch, [
    CURLOPT_POST           => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
    CURLOPT_POSTFIELDS     => json_encode([
        'model'    => 'openai/gpt-oss-120b',
        'messages' => [
            ['role' => 'user', 'content' => 'What is PHP used for?']
        ],
    ]),
]);
$raw  = curl_exec($ch);
$data = json_decode($raw, true);
echo $data['response'];
Node.js (https) โ€” non-stream
const https = require('https');

const body = JSON.stringify({
  model: 'llama-3.3-70b',
  messages: [{ role: 'user', content: 'What is Node.js?' }]
});

const req = https.request({
  hostname: 'mn-chat-bot-api.vercel.app',
  path: '/chat',
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) }
}, res => {
  let data = '';
  res.on('data', c => data += c);
  res.on('end', () => console.log(JSON.parse(data).response));
});
req.write(body);
req.end();

Error responses

HTTP status   Cause                                              Body
400           Missing or invalid messages, malformed JSON        {"error":{"message":"..."}}
429           Rate limit exceeded (20 req / 60 s per IP)         {"error":{"message":"Rate limit exceeded"}}
500           API key not configured on server                   {"error":{"message":"..."}}
502           AI backend unreachable or returned invalid data    {"error":{"message":"..."}}
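A client can map these codes to actions: 429 and 502 are worth retrying (after a pause), while 400 and 500 need a fixed request or server config. A sketch of that mapping (`classify_error` is an illustrative helper, not part of this API):

```python
import json

# Map a non-2xx /chat response to a suggested client action, following
# the error table above. Sketch only; retry policy is the caller's choice.
def classify_error(status, body_text):
    """Return ('retry' | 'fail', message)."""
    try:
        message = json.loads(body_text).get("error", {}).get("message", "")
    except ValueError:
        message = body_text  # body was not the documented JSON shape
    if status == 429:
        return "retry", message  # rate limited: wait out the 60 s window
    if status == 502:
        return "retry", message  # backend hiccup: transient, retry later
    return "fail", message       # 400 / 500: fix the request or server config
```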

📡 Other Endpoints

GET /models

Returns the configured default model and all available model aliases.

GET /models

→ {
  "default_model": "openai/gpt-oss-120b",
  "aliases": {
    "gpt-4o-mini": "openai/gpt-oss-120b",
    "openai/gpt-oss-120b": "openai/gpt-oss-120b",
    "openai/gpt-oss-20b": "openai/gpt-oss-20b",
    "llama-3.3-70b": "llama-3.3-70b-versatile"
  }
}

GET /health

Liveness probe. Returns 200 OK when the server is up.

GET /health

→ {
  "status": "ok",
  "runtime": "php",
  "timestamp": "2025-01-01T00:00:00+00:00"
}

Credits

This API was created by MN TG, aka Musammil N.