Secret Redaction

Overview

EchoVault uses a three-layer redaction system to keep API keys, passwords, and other sensitive data out of your memory vault.

Redaction happens before anything is written to disk or stored in the database. Once redacted, secrets cannot be recovered.

The Three Layers

From ~/workspace/source/src/memory/redaction.py:1-9, the system uses three progressive layers of protection:

Layer 1: Explicit Tags

User-marked sensitive content wrapped in <redacted> tags.

Layer 2: Pattern Detection

Automatic detection of known secret formats (API keys, tokens, passwords).

Layer 3: Custom Rules

Project-specific patterns defined in .memoryignore file.

Layer 1: Explicit Redaction Tags

Wrap sensitive content in <redacted> tags to explicitly mark it for removal:

memory save \
  --title "OpenAI integration setup" \
  --what "Configured OpenAI API with key <redacted>sk-proj-abc123</redacted>" \
  --category "context"

Stored as:

Configured OpenAI API with key [REDACTED]

Tag Handling

From ~/workspace/source/src/memory/redaction.py:49-59, the implementation handles nested and multiline tags:

# Layer 1: Explicit <redacted> tags
while True:
    prev_text = text
    text = REDACTED_TAG_PATTERN.sub("[REDACTED]", text)
    if prev_text == text:
        break

# Clean up any remaining orphaned tags
text = text.replace("<redacted>", "").replace("</redacted>", "")
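The loop above only helps with nesting if each pass strips one level of tags. A minimal sketch, assuming REDACTED_TAG_PATTERN matches innermost tag pairs and spans newlines; the real pattern is not shown on this page, so the regex and the function name strip_explicit_tags below are guesses for illustration:

```python
import re

# Assumption: the actual REDACTED_TAG_PATTERN is not shown in the docs.
# This version matches a pair only when no other tag sits inside it,
# so nested tags are peeled off from the inside out.
REDACTED_TAG_PATTERN = re.compile(
    r"<redacted>(?:(?!</?redacted>).)*</redacted>",
    re.DOTALL,  # let "." match newlines so multiline secrets are covered
)


def strip_explicit_tags(text: str) -> str:
    """Hypothetical helper wrapping the Layer 1 loop shown above."""
    while True:
        prev_text = text
        text = REDACTED_TAG_PATTERN.sub("[REDACTED]", text)
        if prev_text == text:
            break
    # Orphaned tags are removed, but the text around them is kept.
    return text.replace("<redacted>", "").replace("</redacted>", "")


print(strip_explicit_tags("a <redacted>x <redacted>y</redacted> z</redacted> b"))
print(strip_explicit_tags("key=<redacted>line1\nline2</redacted>"))
```

One consequence of the cleanup step is worth keeping in mind: an unclosed <redacted> tag is deleted while the text after it stays, so a typo in the closing tag leaves the secret in place.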

Use explicit tags when you need to reference a secret in context but don’t want it stored. Example:

"Used production API key <redacted>pk_live_xyz</redacted> to test webhook delivery"

Layer 2: Automatic Pattern Detection

EchoVault automatically detects and redacts common secret patterns:

Supported Patterns

From ~/workspace/source/src/memory/redaction.py:14-26, the built-in patterns include:

SENSITIVE_PATTERNS = [
    r"sk_live_[a-zA-Z0-9]+",                      # Stripe live keys
    r"sk_test_[a-zA-Z0-9]+",                      # Stripe test keys
    r"ghp_[a-zA-Z0-9]+",                          # GitHub personal access tokens
    r"AKIA[0-9A-Z]{16}",                          # AWS access key IDs
    r"xoxb-[a-zA-Z0-9-]+",                        # Slack bot tokens
    r"-----BEGIN (?:RSA )?PRIVATE KEY-----",      # Private keys (RSA and generic)
    r"eyJ[a-zA-Z0-9_-]+\.eyJ[a-zA-Z0-9_-]+",     # JWT tokens
    r"password\s*[:=]\s*[\"']?.+",                # Password fields
    r"secret\s*[:=]\s*[\"']?.+",                  # Secret fields
    r"api[_-]?key\s*[:=]\s*[\"']?.+",            # API key fields
]

All patterns are case-insensitive.

Pattern Examples

Before redaction:

API_KEY=sk_live_abc123def456
password: "my-secret-pass"
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0
SLACK_BOT_TOKEN=xoxb-123456789-abcdef
AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE

After redaction:

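[REDACTED]
[REDACTED]
Authorization: Bearer [REDACTED]
SLACK_BOT_TOKEN=[REDACTED]
AWS_ACCESS_KEY_ID=[REDACTED]

Pattern detection is automatic and always active; no configuration is needed. The password, secret, and api_key field patterns match through the end of the line, so the field name is replaced along with the value. That is why the first two lines collapse to a bare [REDACTED].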
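To see how the built-in patterns behave on your own text, a minimal sketch follows. It assumes the patterns are applied one after another with re.IGNORECASE; the list is a subset copied from SENSITIVE_PATTERNS, while the function name redact_known_patterns and the application order are assumptions, not the actual implementation:

```python
import re

# Subset of the built-in patterns listed above.
SENSITIVE_PATTERNS = [
    r"sk_live_[a-zA-Z0-9]+",
    r"ghp_[a-zA-Z0-9]+",
    r"AKIA[0-9A-Z]{16}",
    r"xoxb-[a-zA-Z0-9-]+",
    r"password\s*[:=]\s*[\"']?.+",
    r"api[_-]?key\s*[:=]\s*[\"']?.+",
]


def redact_known_patterns(text: str) -> str:
    """Hypothetical helper: apply each pattern in order, case-insensitively."""
    for pattern in SENSITIVE_PATTERNS:
        text = re.sub(pattern, "[REDACTED]", text, flags=re.IGNORECASE)
    return text


print(redact_known_patterns("SLACK_BOT_TOKEN=xoxb-123456789-abcdef"))
print(redact_known_patterns('password: "my-secret-pass"'))
print(redact_known_patterns("API_KEY=sk_live_abc123def456"))
```

Under these assumptions the token patterns replace only the token, while the field patterns replace everything from the field name to the end of the line. Because "." does not match newlines here, neighbouring lines are left alone.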

Layer 3: Custom Patterns (.memoryignore)

For project-specific sensitive data, create a .memoryignore file:

# Location: ~/.memory/.memoryignore

# SSN pattern
\d{3}-\d{2}-\d{4}

# Internal employee IDs
EMP-\d{6}

# Custom API key format
app_key_[a-f0-9]{32}

# Email addresses (if you want to redact them)
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}

File Format

From ~/workspace/source/src/memory/redaction.py:69-105, the parser supports:

One regex pattern per line
Comments starting with #
Empty lines (ignored)
Raw regex syntax (no quotes or delimiters needed)

def load_memoryignore(path: str) -> list[str]:
    try:
        with open(path) as f:
            lines = f.readlines()
    except FileNotFoundError:
        return []
    
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    
    return patterns

Pattern Testing

Test your patterns before deploying:

from memory.redaction import redact, load_memoryignore

patterns = load_memoryignore(".memoryignore")
test_text = "Employee EMP-123456 accessed the system"
result = redact(test_text, patterns)
print(result)  # "Employee [REDACTED] accessed the system"

Be careful with overly broad patterns. For example, \d+ would redact every number, including line numbers, timestamps, and counts.
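One way to catch broad or malformed patterns before they reach .memoryignore is to run them against text that should survive redaction. The sketch below is a standalone helper, not part of EchoVault; the name audit_patterns and the sample strings are made up for illustration:

```python
import re


def audit_patterns(patterns: list[str], benign_samples: list[str]) -> list[str]:
    """Return a description of each problem found in the given patterns."""
    problems = []
    for pattern in patterns:
        try:
            compiled = re.compile(pattern)
        except re.error as exc:
            # A line that is not a valid regex would break redaction.
            problems.append(f"{pattern!r}: invalid regex ({exc})")
            continue
        for sample in benign_samples:
            if compiled.search(sample):
                # The pattern would redact text that should be kept.
                problems.append(f"{pattern!r}: also matches {sample!r}")
    return problems


candidates = [r"EMP-\d{6}", r"\d+", r"EMP-(\d{6}"]
keep = ["Retried 3 times at 14:05", "See the deploy notes"]
for problem in audit_patterns(candidates, keep):
    print(problem)
```

Here \d+ is flagged because it matches the retry count and timestamp, and the pattern with the unbalanced parenthesis is flagged as invalid. EMP-\d{6} passes.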

Redaction Pipeline

When you save a memory, its free-text fields are redacted before storage. From ~/workspace/source/src/memory/core.py:211-218:

# Redact all text fields
raw.what = redact(raw.what, self.ignore_patterns)
if raw.why:
    raw.why = redact(raw.why, self.ignore_patterns)
if raw.impact:
    raw.impact = redact(raw.impact, self.ignore_patterns)
if raw.details:
    raw.details = redact(raw.details, self.ignore_patterns)

Fields that get redacted:

what (always)
why (if present)
impact (if present)
details (if present)

Fields that are NOT redacted:

title (for better search UX)
tags (assumed to be safe metadata)
category (enum value)
related_files (file paths)

Based on the snippet above, these fields never pass through redact(), so <redacted> tags placed in a title or tag are unlikely to be processed. Keep secrets out of these fields.
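The field handling can be sketched as a loop over the redacted field names. RawMemory and redact_fields are hypothetical names for illustration, and redact here is a simplified stand-in that applies only custom patterns, not the real memory.redaction.redact:

```python
import re
from dataclasses import dataclass
from typing import Optional


def redact(text: str, extra_patterns: list[str]) -> str:
    """Simplified stand-in: applies custom patterns only."""
    for pattern in extra_patterns:
        text = re.sub(pattern, "[REDACTED]", text)
    return text


@dataclass
class RawMemory:  # hypothetical shape, reduced to the fields discussed above
    title: str
    what: str
    why: Optional[str] = None
    impact: Optional[str] = None
    details: Optional[str] = None


# Only these fields are passed through redact(); title is left untouched.
REDACTED_FIELDS = ("what", "why", "impact", "details")


def redact_fields(raw: RawMemory, ignore_patterns: list[str]) -> RawMemory:
    for name in REDACTED_FIELDS:
        value = getattr(raw, name)
        if value:
            setattr(raw, name, redact(value, ignore_patterns))
    return raw


memory = RawMemory(title="Badge EMP-123456", what="Issued badge EMP-123456")
redact_fields(memory, [r"EMP-\d{6}"])
print(memory.title)
print(memory.what)
```

In this sketch the employee ID survives in the title while the same ID in what is replaced, which is the behaviour the two field lists describe.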

Best Practices

Use Explicit Tags

When you know something is sensitive, wrap it in <redacted> tags immediately. Don’t rely solely on pattern detection.

Test Your Patterns

Before adding patterns to .memoryignore, test them with sample data to avoid over-redaction.

Avoid Secrets Entirely

The best practice is to not include secrets in memory saves at all. Reference them indirectly when possible.

Review Session Files

Periodically review files in ~/.memory/vault/ to ensure no secrets slipped through.
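A periodic review can be partly automated. The sketch below walks a vault directory and reports lines that still match a few of the built-in token patterns. It assumes the vault holds markdown files with an .md extension; scan_vault is a hypothetical helper, not an EchoVault command:

```python
import re
from pathlib import Path

# A few of the built-in token patterns from Layer 2.
CHECK_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"sk_live_[a-zA-Z0-9]+",
        r"ghp_[a-zA-Z0-9]+",
        r"AKIA[0-9A-Z]{16}",
        r"xoxb-[a-zA-Z0-9-]+",
    )
]


def scan_vault(vault_dir: Path) -> list[tuple[str, int]]:
    """Return (file path, line number) for each line that still matches."""
    if not vault_dir.is_dir():
        return []
    hits = []
    for path in sorted(vault_dir.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if any(p.search(line) for p in CHECK_PATTERNS):
                hits.append((str(path), lineno))
    return hits


if __name__ == "__main__":
    # Report locations only, so the scan itself never prints a secret.
    for path, lineno in scan_vault(Path.home() / ".memory" / "vault"):
        print(f"{path}:{lineno}")
```

A scan like this only finds secrets in known formats, so it complements a manual review and does not replace it.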

Examples

Good: Redacted Reference

memory save \
  --title "Configured Stripe integration" \
  --what "Added Stripe publishable key to frontend config" \
  --why "Needed for checkout flow" \
  --impact "Checkout now processes payments" \
  --details "Used test key <redacted>pk_test_abc</redacted> for staging"

Bad: Secret in Title

memory save \
  --title "Added API key sk_live_12345" \
  --what "Configured production API"

The secret is in the title, which is not redacted by default.

Better: Generic Title

memory save \
  --title "Configured production API" \
  --what "Added API key <redacted>sk_live_12345</redacted> to .env"

Redaction Guarantees

What is guaranteed:

All known secret patterns are redacted before database insert
All explicit <redacted> tags are replaced with [REDACTED]
Custom .memoryignore patterns are applied to all text fields
Redaction happens before markdown file write

What is NOT guaranteed:

Secrets with unknown formats that don’t match any pattern
Secrets in other encodings (base64, hex) that match no built-in pattern
Secrets in file paths or code structure (these require explicit tags)

Recovery

Once content is redacted, it cannot be recovered. The original text is never stored. If you accidentally redacted something important:

Check your shell history for the original command
Look in agent logs (if the agent logged the original content)
Recreate the memory with corrected content

For non-secret but important data that might match patterns (like example API keys in documentation), use explicit tags to control redaction:

Example API key format: <redacted>sk_test_example123</redacted> (not a real key)
