OpenAI Provider

The shieldOpenAI wrapper protects OpenAI clients by hardening system prompts, detecting injections in user input, and sanitizing model output to prevent prompt leaks.

Installation

npm install @zeroleaks/shield openai

Quick Start

import OpenAI from "openai";
import { shieldOpenAI } from "@zeroleaks/shield/openai";

const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a financial advisor...",
  onDetection: "block", // throws on injection (default)
});

const response = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [
    { role: "system", content: "You are a financial advisor..." },
    { role: "user", content: userInput },
  ],
});

Configuration Options

Basic Options

systemPrompt
string
System prompt to protect during sanitization. When omitted, it is derived from the first system message in the request.
onDetection
'block' | 'warn'
default:"block"
  • "block": Throws InjectionDetectedError when injection is detected
  • "warn": Only invokes onInjectionDetected callback without blocking
throwOnLeak
boolean
default:false
When true, throws LeakDetectedError instead of redacting leaked content.
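Combining the basic options, a warn-only setup that still refuses to return leaked content might look like this (a sketch; `userInput` stands for untrusted input):

```typescript
import OpenAI from "openai";
import { shieldOpenAI } from "@zeroleaks/shield/openai";

// Warn-only on injections: requests go through, but a leak in the
// model output still throws because throwOnLeak is enabled.
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a financial advisor...",
  onDetection: "warn",
  throwOnLeak: true,
});
```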

Feature Flags

harden
HardenOptions | false
Options for system prompt hardening. Set to false to disable hardening entirely.
detect
DetectOptions | false
Options for injection detection. Set to false to disable detection entirely.
sanitize
SanitizeOptions | false
Options for output sanitization. Set to false to disable sanitization entirely.
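For example, to run injection detection only and switch off the other two features, the flags above can be combined as follows (a sketch; passing an empty options object to keep a feature's defaults is an assumption):

```typescript
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  harden: false,   // leave the system prompt untouched
  sanitize: false, // do not scan or redact model output
  detect: {},      // keep injection detection with default options
});
```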

Streaming Options

streamingSanitize
'buffer' | 'chunked' | 'passthrough'
default:"buffer"
Controls how streaming responses are sanitized:
  • "buffer": Accumulates full response then sanitizes (higher memory, more accurate)
  • "chunked": Sanitizes in 8KB chunks (lower memory for long streams)
  • "passthrough": Skip sanitization for streams (use when you accept the risk)
streamingChunkSize
number
default:8192
Chunk size in bytes for "chunked" mode.
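For long-running streams, a sketch that trades some accuracy for bounded memory by switching to chunked sanitization with a larger chunk size:

```typescript
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  streamingSanitize: "chunked",
  streamingChunkSize: 16384, // sanitize in 16 KB chunks instead of the 8 KB default
});
```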

Callbacks

onInjectionDetected
(result: DetectResult) => void
Invoked when injection is detected. Receives detection result with risk level and matched patterns.
onLeakDetected
(result: SanitizeResult) => void
Invoked when a prompt leak is detected in the output. Receives sanitization result with confidence score.
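Both callbacks can be wired up for observability. A minimal logging sketch (the exact shape of `DetectResult` and `SanitizeResult` beyond what is documented above is an assumption, so the whole result object is logged rather than individual fields):

```typescript
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  onDetection: "warn", // observe without blocking
  onInjectionDetected: (result) => {
    // result carries the risk level and matched patterns
    console.warn("Possible injection:", result);
  },
  onLeakDetected: (result) => {
    // result carries the confidence score for the redaction
    console.warn("Prompt leak detected:", result);
  },
});
```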

Streaming Support

Shield automatically handles both regular and streaming responses:
const stream = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: userInput },
  ],
  stream: true,
});

for await (const chunk of stream) {
  // Chunks are sanitized according to the streamingSanitize mode
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Multi-part Messages

Shield supports multi-part messages with text and images. Text content is extracted from all parts for injection detection and hardening:
const response = await client.chat.completions.create({
  model: "gpt-4-vision-preview",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        { type: "image_url", image_url: { url: "https://..." } },
      ],
    },
  ],
});
// Text parts are scanned for injection, images are passed through

Tool Calls

Shield automatically sanitizes tool call arguments to prevent prompt leakage through function parameters:
const response = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: userInput },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "send_email",
        description: "Send an email",
        parameters: {
          type: "object",
          properties: {
            to: { type: "string" },
            subject: { type: "string" },
            body: { type: "string" },
          },
        },
      },
    },
  ],
});

const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
  // toolCall.function.arguments is automatically sanitized
  const args = JSON.parse(toolCall.function.arguments);
}

Error Handling

import {
  shieldOpenAI,
  InjectionDetectedError,
  LeakDetectedError,
} from "@zeroleaks/shield/openai";

try {
  const client = shieldOpenAI(new OpenAI(), {
    systemPrompt: "You are a helpful assistant.",
    throwOnLeak: true,
  });

  const response = await client.chat.completions.create({
    model: "gpt-5.3-codex",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userInput },
    ],
  });
} catch (error) {
  if (error instanceof InjectionDetectedError) {
    console.error(`Injection detected: ${error.risk} risk`);
    console.error(`Categories: ${error.categories.join(", ")}`);
  }
  if (error instanceof LeakDetectedError) {
    console.error(`Leak detected: ${error.confidence} confidence`);
    console.error(`Fragments: ${error.fragmentCount}`);
  }
}

Advanced Usage

const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  detect: {
    threshold: "high",
    customPatterns: [
      {
        category: "custom_command",
        regex: /execute order \d+/i,
        risk: "high",
      },
    ],
  },
});
