OpenAI Provider

The shieldOpenAI wrapper protects OpenAI clients by hardening system prompts, detecting injections in user input, and sanitizing model output to prevent prompt leaks.

Installation

npm install @zeroleaks/shield openai

Quick Start

import OpenAI from "openai";
import { shieldOpenAI } from "@zeroleaks/shield/openai";

const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a financial advisor...",
  onDetection: "block", // throws on injection (default)
});

const response = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [
    { role: "system", content: "You are a financial advisor..." },
    { role: "user", content: userInput },
  ],
});

Configuration Options

Basic Options

systemPrompt
string
System prompt to protect during sanitization. When omitted, it is derived from the first system message in the request.
onDetection
'block' | 'warn'
default:"block"
  • "block": Throws InjectionDetectedError when injection is detected
  • "warn": Only invokes onInjectionDetected callback without blocking
throwOnLeak
boolean
default:false
When true, throws LeakDetectedError instead of redacting leaked content.
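Combining the basic options, a warn-only setup that still refuses to return leaked content might look like this (a sketch; `userInput` stands for untrusted input):

```typescript
import OpenAI from "openai";
import { shieldOpenAI } from "@zeroleaks/shield/openai";

// Warn-only on injections: requests go through, but a leak in the
// model output still throws because throwOnLeak is enabled.
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a financial advisor...",
  onDetection: "warn",
  throwOnLeak: true,
});
```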

Feature Flags

harden
HardenOptions | false
Options for system prompt hardening. Set to false to disable hardening entirely.
detect
DetectOptions | false
Options for injection detection. Set to false to disable detection entirely.
sanitize
SanitizeOptions | false
Options for output sanitization. Set to false to disable sanitization entirely.
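For example, to run injection detection only and switch off the other two features, the flags above can be combined as follows (a sketch; passing an empty options object to keep a feature's defaults is an assumption):

```typescript
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  harden: false,   // leave the system prompt untouched
  sanitize: false, // do not scan or redact model output
  detect: {},      // keep injection detection with default options
});
```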

Streaming Options

streamingSanitize
'buffer' | 'chunked' | 'passthrough'
default:"buffer"
Controls how streaming responses are sanitized:
  • "buffer": Accumulates full response then sanitizes (higher memory, more accurate)
  • "chunked": Sanitizes in 8KB chunks (lower memory for long streams)
  • "passthrough": Skip sanitization for streams (use when you accept the risk)
streamingChunkSize
number
default:8192
Chunk size in bytes for "chunked" mode.
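For long-running streams, a sketch that trades some accuracy for bounded memory by switching to chunked sanitization with a larger chunk size:

```typescript
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  streamingSanitize: "chunked",
  streamingChunkSize: 16384, // sanitize in 16 KB chunks instead of the 8 KB default
});
```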

Callbacks

onInjectionDetected
(result: DetectResult) => void
Invoked when injection is detected. Receives detection result with risk level and matched patterns.
onLeakDetected
(result: SanitizeResult) => void
Invoked when a prompt leak is detected in the output. Receives sanitization result with confidence score.
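Both callbacks can be wired up for observability. A minimal logging sketch (the exact shape of `DetectResult` and `SanitizeResult` beyond what is documented above is an assumption, so the whole result object is logged rather than individual fields):

```typescript
const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  onDetection: "warn", // observe without blocking
  onInjectionDetected: (result) => {
    // result carries the risk level and matched patterns
    console.warn("Possible injection:", result);
  },
  onLeakDetected: (result) => {
    // result carries the confidence score for the redaction
    console.warn("Prompt leak detected:", result);
  },
});
```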

Streaming Support

Shield automatically handles both regular and streaming responses:
const stream = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: userInput },
  ],
  stream: true,
});

for await (const chunk of stream) {
  // Chunks are sanitized according to the streamingSanitize mode
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}

Multi-part Messages

Shield supports multi-part messages with text and images. Text content is extracted from all parts for injection detection and hardening:
const response = await client.chat.completions.create({
  model: "gpt-4-vision-preview",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    {
      role: "user",
      content: [
        { type: "text", text: "What's in this image?" },
        { type: "image_url", image_url: { url: "https://..." } },
      ],
    },
  ],
});
// Text parts are scanned for injection, images are passed through

Tool Calls

Shield automatically sanitizes tool call arguments to prevent prompt leakage through function parameters:
const response = await client.chat.completions.create({
  model: "gpt-5.3-codex",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: userInput },
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "send_email",
        description: "Send an email",
        parameters: {
          type: "object",
          properties: {
            to: { type: "string" },
            subject: { type: "string" },
            body: { type: "string" },
          },
        },
      },
    },
  ],
});

const toolCall = response.choices[0].message.tool_calls?.[0];
if (toolCall) {
  // toolCall.function.arguments is automatically sanitized
  const args = JSON.parse(toolCall.function.arguments);
}

Error Handling

import {
  shieldOpenAI,
  InjectionDetectedError,
  LeakDetectedError,
} from "@zeroleaks/shield/openai";

try {
  const client = shieldOpenAI(new OpenAI(), {
    systemPrompt: "You are a helpful assistant.",
    throwOnLeak: true,
  });

  const response = await client.chat.completions.create({
    model: "gpt-5.3-codex",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: userInput },
    ],
  });
} catch (error) {
  if (error instanceof InjectionDetectedError) {
    console.error(`Injection detected: ${error.risk} risk`);
    console.error(`Categories: ${error.categories.join(", ")}`);
  }
  if (error instanceof LeakDetectedError) {
    console.error(`Leak detected: ${error.confidence} confidence`);
    console.error(`Fragments: ${error.fragmentCount}`);
  }
}

Advanced Usage

const client = shieldOpenAI(new OpenAI(), {
  systemPrompt: "You are a helpful assistant.",
  detect: {
    threshold: "high",
    customPatterns: [
      {
        category: "custom_command",
        regex: /execute order \d+/i,
        risk: "high",
      },
    ],
  },
});
