Chat API

Send messages through CID222's safety pipeline to any supported LLM provider. Input and output are filtered for PII, toxicity, and prompt injection in real time, and the response is streamed back as Server-Sent Events.

Create Chat Completion

POST /chat/completions

Streams a model completion through the input and output safety filters. The response is always Server-Sent Events (SSE) — there is no non-streaming JSON body.

Authentication

Authenticate with a tenant API key (Authorization: Bearer cid_key_...) or a user JWT.

Request Body

Parameter	Type	Required	Description
`model`	string	Yes	Model ID enabled for your tenant (e.g. "gpt-4o", "claude-sonnet-4-6")
`messages`	array	Yes	Conversation messages. `content` is a string or an array of content parts
`provider`	string	No	`openai`, `azure_openai`, `anthropic`, `google`. Inferred from the model when omitted
`stream`	boolean	No	Server-Sent Events streaming. Defaults to `true` for this endpoint
`temperature`	number	No	Sampling temperature, 0–2
`max_tokens`	number	No	Maximum tokens in the response
`top_p`	number	No	Nucleus sampling, 0–1
`session_id`	string	No	Attach this turn to a conversation session (UUID)
`contexts`	string[]	No	RAG passages used for grounding / hallucination checks
`documents`	array	No	Documents (pdf/docx/txt/csv, base64) auto-parsed, redacted, and added as context
`content_filter`	string	No	Filter nickname overriding the tenant default
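
As a rough sketch of how a client might assemble this body, the helper below checks the two documented ranges (`temperature` 0–2, `top_p` 0–1) and omits optional fields that are left unset. The name `build_chat_body` is hypothetical and not part of any CID222 SDK. Rejecting an empty `messages` list is an assumption, and the server remains the authority on validation.

```python
from typing import Any, Optional


def build_chat_body(
    model: str,
    messages: list[dict[str, Any]],
    *,
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    max_tokens: Optional[int] = None,
    session_id: Optional[str] = None,
) -> dict[str, Any]:
    """Build a /chat/completions body, leaving out optional fields that are unset."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        # Assumption: an empty conversation is treated as invalid client-side.
        raise ValueError("messages must contain at least one message")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    body: dict[str, Any] = {"model": model, "messages": messages}
    optional = {
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
        "session_id": session_id,
    }
    # Only send parameters the caller actually set.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body
```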

Message Format

// Plain text
{ "role": "user", "content": "Message content" }

// Multimodal: text + image
{
  "role": "user",
  "content": [
    { "type": "text", "text": "What's in this image?" },
    { "type": "image_url", "image_url": { "url": "data:image/png;base64,...", "detail": "auto" } }
  ]
}

// Text + document
{
  "role": "user",
  "content": [
    { "type": "text", "text": "Summarize this contract" },
    { "type": "document", "document": { "data": "JVBERi0xLjQK...", "type": "pdf", "name": "contract.pdf" } }
  ]
}

Images & documents are scanned too

Image and document content parts are scanned and redacted for PII before they reach the model, exactly like text.
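
The content-part shapes above could be produced from raw file bytes along these lines. This is a sketch only: the helper names (`text_part`, `image_part`, `document_part`, `user_message`) are hypothetical, and the field layout is copied from the examples in this section rather than from a published schema.

```python
import base64
from typing import Any


def text_part(text: str) -> dict[str, Any]:
    """Plain text content part."""
    return {"type": "text", "text": text}


def image_part(raw: bytes, mime: str = "image/png", detail: str = "auto") -> dict[str, Any]:
    """Wrap raw image bytes in a base64 data: URL, as in the image_url example."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}", "detail": detail},
    }


def document_part(raw: bytes, doc_type: str, name: str) -> dict[str, Any]:
    """Base64-encode raw document bytes into a document content part."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {"type": "document", "document": {"data": encoded, "type": doc_type, "name": name}}


def user_message(*parts: dict[str, Any]) -> dict[str, Any]:
    """Combine content parts into a single user message."""
    return {"role": "user", "content": list(parts)}
```

For example, `user_message(text_part("Summarize this contract"), document_part(pdf_bytes, "pdf", "contract.pdf"))` yields a message in the same shape as the text + document example, where `pdf_bytes` stands for the file contents read in binary mode.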

Example Request

cURL

curl -N -X POST https://api.cid222.ai/chat/completions \
  -H "Authorization: Bearer cid_key_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful customer service agent."
      },
      {
        "role": "user",
        "content": "My name is John Smith and my email is john@example.com"
      }
    ]
  }'
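
A possible Python equivalent of the cURL call, using only the standard library. The URL and headers are taken from the example above. The function names `make_request` and `stream_lines` are illustrative, and error handling is left out to keep the sketch short.

```python
import json
import urllib.request
from typing import Any, Iterator

API_URL = "https://api.cid222.ai/chat/completions"  # endpoint from the cURL example


def make_request(api_key: str, body: dict[str, Any]) -> urllib.request.Request:
    """Build the POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_lines(request: urllib.request.Request) -> Iterator[str]:
    """Send the request and yield response lines as they arrive (network I/O)."""
    with urllib.request.urlopen(request) as response:
        for raw in response:
            yield raw.decode("utf-8").rstrip("\r\n")


request = make_request(
    "cid_key_your_api_key",
    {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": "You are a helpful customer service agent."},
            {"role": "user", "content": "My name is John Smith and my email is john@example.com"},
        ],
    },
)
# Iterating over stream_lines(request) would perform the actual call.
```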

Streaming Response

The response has content type text/event-stream. Each event is a JSON object on a data: line, except the final data: [DONE] sentinel, which is not JSON. Model text arrives as content_block_delta events; the stream ends with message_stop followed by data: [DONE].

SSE Stream

data: {"type":"input_filter_warning","severity":"info","entity_type":"EMAIL","action":"masked","count":1}

data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}}

data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" John"}}

data: {"type":"context_usage","input_tokens":320,"output_tokens":12,"total_tokens":332}

data: {"type":"message_stop","total_duration_ms":840,"llm_duration_ms":610}

data: [DONE]
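
A stream like the one above might be parsed as follows. This is a minimal sketch that assumes every event fits on a single `data:` line, as in the sample, and that `[DONE]` is the only non-JSON payload. `iter_events` and `collect_text` are hypothetical names.

```python
import json
from typing import Any, Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield one parsed JSON event per data: line, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and any non-data fields are skipped
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream sentinel; it is not JSON
        yield json.loads(payload)


def collect_text(events: Iterable[dict[str, Any]]) -> str:
    """Concatenate the text of all content_block_delta events."""
    return "".join(
        event["delta"]["text"]
        for event in events
        if event.get("type") == "content_block_delta"
        and event.get("delta", {}).get("type") == "text_delta"
    )
```

Applied to the sample stream, `collect_text(iter_events(lines))` returns the two deltas joined together, `"Hello John"`.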

Event Types

The stream interleaves model output with safety-pipeline events:

Event	Description
`content_block_delta`	A chunk of model output text in `delta.text`
`input_filter_warning`	PII or safety match in the prompt that was masked or flagged before sending
`input_rejected`	Prompt blocked (e.g. jailbreak). Not forwarded to the provider
`output_filter_warning`	PII or safety match in the model response that was masked or flagged
`output_rejected`	Response blocked (e.g. PII leak or hallucination risk)
`risk_analysis`	Async Risk Guardian result (`risk_level`, `risk_score`)
`context_usage`	Token usage (`input_tokens`, `output_tokens`, `total_tokens`)
`message_stop`	Final event with a timing breakdown, followed by `data: [DONE]`
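
One way to consume the interleaved events is to fold them into a single summary object keyed on `type`. The sketch below assumes events have already been decoded into dictionaries; `StreamSummary` and `summarize` are illustrative names, and ignoring unknown event types is a design choice of this example rather than documented behavior.

```python
from dataclasses import dataclass, field
from typing import Any, Iterable, Optional


@dataclass
class StreamSummary:
    text: str = ""
    warnings: list[dict[str, Any]] = field(default_factory=list)
    rejection: Optional[dict[str, Any]] = None
    risk: Optional[dict[str, Any]] = None
    usage: Optional[dict[str, Any]] = None
    timing: Optional[dict[str, Any]] = None


def summarize(events: Iterable[dict[str, Any]]) -> StreamSummary:
    """Fold decoded stream events into one summary."""
    summary = StreamSummary()
    for event in events:
        kind = event.get("type")
        if kind == "content_block_delta":
            summary.text += event.get("delta", {}).get("text", "")
        elif kind in ("input_filter_warning", "output_filter_warning"):
            summary.warnings.append(event)
        elif kind in ("input_rejected", "output_rejected"):
            summary.rejection = event
        elif kind == "risk_analysis":
            summary.risk = event
        elif kind == "context_usage":
            summary.usage = event
        elif kind == "message_stop":
            summary.timing = event
        # Any other event type is ignored so new types do not break the client.
    return summary
```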

Blocked Content

If the input or output violates policy, the pipeline emits an input_rejected or output_rejected event instead of model text and ends the stream. Blocked input is never sent to the provider, so no tokens are billed.

Blocked Input

data: {"type":"input_rejected","reason":"jailbreak attempt detected","details":"Your prompt was flagged as a potential jailbreak."}

data: [DONE]
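
Because a block arrives as a stream event, a client may want to turn it into an exception. The sketch below does that for a decoded event. `ContentRejected` and `raise_if_rejected` are hypothetical names, and it assumes `output_rejected` carries the same `reason` and `details` fields shown for `input_rejected`.

```python
from typing import Any


class ContentRejected(Exception):
    """Raised when the safety pipeline blocks the prompt or the response."""

    def __init__(self, stage: str, reason: str, details: str = "") -> None:
        super().__init__(f"{stage} rejected: {reason}")
        self.stage = stage
        self.reason = reason
        self.details = details


def raise_if_rejected(event: dict[str, Any]) -> None:
    """Raise ContentRejected for input_rejected / output_rejected events."""
    kind = event.get("type", "")
    if kind in ("input_rejected", "output_rejected"):
        stage = kind.split("_", 1)[0]  # "input" or "output"
        # Assumption: both rejection events expose reason and details.
        raise ContentRejected(stage, event.get("reason", ""), event.get("details", ""))
```

Calling `raise_if_rejected(event)` on each event before appending model text keeps blocked turns from being mistaken for empty completions.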

Supported Models

The model parameter accepts any model enabled for your tenant. Call GET /models to list what is available.

Provider	Models
OpenAI	`gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo`
Anthropic	`claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5`
Google	`gemini-2.5-pro`, `gemini-2.5-flash`
Azure OpenAI	`gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo` (your deployment names)

Error Codes

Code	Description
400	Invalid request body or parameters
401	Invalid or missing authentication
403	Forbidden (insufficient role or scope)
429	Rate limit exceeded
500	Internal server error
503	Provider or ML service unavailable

Blocked content is reported through SSE events, not as an HTTP 403. Read the event stream to detect input_rejected and output_rejected.
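
As an illustration, a client could map the status codes in the table to messages and a retry hint. The descriptions come from the table; treating 429 and 503 as retryable is an assumption of this sketch, not documented CID222 behavior, and `describe_status` is a hypothetical helper.

```python
ERROR_DESCRIPTIONS = {
    400: "Invalid request body or parameters",
    401: "Invalid or missing authentication",
    403: "Forbidden (insufficient role or scope)",
    429: "Rate limit exceeded",
    500: "Internal server error",
    503: "Provider or ML service unavailable",
}

# Assumption: only rate limiting and temporary unavailability are worth retrying.
RETRYABLE_STATUSES = {429, 503}


def describe_status(status: int) -> tuple[str, bool]:
    """Return (description, retryable) for an HTTP status code."""
    description = ERROR_DESCRIPTIONS.get(status, f"Unexpected status {status}")
    return description, status in RETRYABLE_STATUSES
```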