Secret Scanning

Vectra Guard includes a comprehensive secret detection engine that finds exposed credentials, API keys, tokens, and private keys in your codebase. The scanner uses pattern-based detection combined with entropy analysis to identify secrets while minimizing false positives.

Quick Start

# Scan current directory
vg scan-secrets --path .

# Scan specific directory
vg scan-secrets --path ./src

# Use allowlist for known test keys
vg scan-secrets --path . --allowlist .secrets-allowlist.txt

The scan-secrets command exits with code 2 when secrets are detected, making it easy to use in CI/CD pipelines to prevent accidental credential commits.

What Secrets Are Detected

Vectra Guard detects secrets using two complementary approaches:

Known Pattern Detection

These patterns are always reported when found:

Pattern ID	Description	Severity	Example
`AWS_ACCESS_KEY_ID`	AWS access key identifier	Critical	`AKIA0123456789ABCDEF`
`AWS_SECRET_ACCESS_KEY`	AWS secret access key	Critical	`aws_secret_access_key = "abcd..."`
`GENERIC_API_KEY`	Generic API keys and tokens	Critical	`api_key = "sk_live_1234..."`
`PRIVATE_KEY_BLOCK`	SSH/RSA/DSA private keys	Critical	`-----BEGIN PRIVATE KEY-----`

Entropy-Based Detection

High-entropy strings (20+ characters, Shannon entropy ≥ 3.5) are flagged as ENTROPY_CANDIDATE only when:

Context is present: The line contains secret-related keywords like token, api_key, secret, password, credential, or auth
Assignment pattern: The line includes = or : suggesting configuration or assignment

Example (detected):

# This WILL be detected - has context + high entropy
api_token = "xY9mK2nP4qR6sT8vW0zA1bC3dE5fG7hJ"

Example (not detected):

# This will NOT be detected - no secret context
slug = "1-eliminating-waterfalls-from-the-user-journey"
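
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the scanner's actual code: the function names are hypothetical, the thresholds and keywords are the ones listed above, and it assumes that both the keyword context and the assignment pattern must be present.

```python
import math
from collections import Counter

# Thresholds and keywords are taken from the text above; names are illustrative.
MIN_LENGTH = 20
MIN_ENTROPY = 3.5
CONTEXT_KEYWORDS = ("token", "api_key", "secret", "password", "credential", "auth")


def shannon_entropy(value: str) -> float:
    """Return the Shannon entropy of value, in bits per character."""
    if not value:
        return 0.0
    counts = Counter(value)
    total = len(value)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_entropy_candidate(line: str, value: str) -> bool:
    """Assumed rule: long, high-entropy value on a line with context and an assignment."""
    lowered = line.lower()
    has_context = any(keyword in lowered for keyword in CONTEXT_KEYWORDS)
    has_assignment = "=" in line or ":" in line
    return (
        len(value) >= MIN_LENGTH
        and shannon_entropy(value) >= MIN_ENTROPY
        and has_context
        and has_assignment
    )
```

The `api_token` value above consists of 32 distinct characters, so its entropy is exactly 5.0 bits per character, well above the 3.5 threshold. The `slug` line is rejected because it contains none of the context keywords, regardless of its entropy.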

False Positive Filters

To reduce noise, the scanner automatically skips:

UUIDs: Standard UUID format (e.g., 550e8400-e29b-41d4-a716-446655440000)
Path-like strings: Contains / (e.g., github.com/org/repo)
Documentation slugs: Numbered slugs (e.g., 1-getting-started)
URL fragments: Contains com/, org/, http, github.
Code identifiers: CamelCase or snake_case with no digits
Lockfiles: package-lock.json, yarn.lock, poetry.lock, go.sum, etc.
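
As a rough sketch of how the value-level filters might be expressed, the Python below checks a candidate string against each rule in turn. The regular expressions and the function name are assumptions made for illustration; the scanner's real patterns may be stricter or looser. Lockfiles are skipped by file name rather than by value, so they are not covered here.

```python
import re

# Illustrative patterns only; the scanner's actual expressions are not documented here.
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")
NUMBERED_SLUG_RE = re.compile(r"^\d+(-[a-z]+)+$")
IDENTIFIER_RE = re.compile(r"^[A-Za-z_]+$")  # CamelCase or snake_case, no digits
URL_FRAGMENTS = ("com/", "org/", "http", "github.")


def is_likely_false_positive(value: str) -> bool:
    """Return True if value matches one of the documented false positive filters."""
    if UUID_RE.match(value):
        return True  # standard UUID
    if "/" in value:
        return True  # path-like string
    if NUMBERED_SLUG_RE.match(value):
        return True  # numbered documentation slug
    if any(fragment in value for fragment in URL_FRAGMENTS):
        return True  # URL fragment
    if IDENTIFIER_RE.match(value):
        return True  # code identifier without digits
    return False
```

Ordering the cheap substring checks alongside the regular expressions keeps the filter easy to read; a value is discarded as soon as any single rule matches.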

Command Reference

Basic Usage

vectra-guard scan-secrets --path <directory>

Options

Flag	Description	Default
`--path`	Directory or file to scan	`.` (current directory)
`--allowlist`	Path to allowlist file	None

Exit Codes

0: No secrets detected
2: Secrets detected (use in CI to fail builds)
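
A script that wraps the scanner can branch on these codes. The sketch below is a hypothetical Python helper, not part of Vectra Guard: `run_scan` and `main` are illustrative names, and treating any code other than 0 or 2 as a scanner error is an assumption, since only those two codes are documented.

```python
import subprocess
import sys


def run_scan(command: list[str]) -> str:
    """Run a scan command and map its exit code to a short status string."""
    result = subprocess.run(command, check=False)
    if result.returncode == 0:
        return "clean"
    if result.returncode == 2:
        return "secrets-detected"
    # Assumption: any other code means the scan itself failed.
    return "error"


def main() -> int:
    """Example entry point: fail unless the scan is clean."""
    status = run_scan(["vg", "scan-secrets", "--path", "."])
    print(f"scan status: {status}", file=sys.stderr)
    return 0 if status == "clean" else 1
```

If `vg` is not on the `PATH`, `subprocess.run` raises `FileNotFoundError`, which a caller may want to handle separately from a failed scan.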

Allowlist Configuration

Create a .secrets-allowlist.txt file to exclude known test keys or false positives:

# .secrets-allowlist.txt
# Test API keys (not real secrets)
test_api_key_12345
AKIA00000000EXAMPLETEST

# Known false positives
example_token_for_documentation

Usage:

vg scan-secrets --path . --allowlist .secrets-allowlist.txt

Allowlist entries are matched exactly (after trimming whitespace). Lines starting with # are treated as comments.
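
The matching rule can be illustrated with a short Python sketch. The function names are hypothetical, and two details are assumptions: blank lines are ignored, and the `#` check is applied after trimming whitespace.

```python
def parse_allowlist(text: str) -> set[str]:
    """Collect allowlist entries, skipping comment lines and (by assumption) blank lines."""
    entries = set()
    for raw_line in text.splitlines():
        entry = raw_line.strip()
        if not entry or entry.startswith("#"):
            continue
        entries.add(entry)
    return entries


def is_allowlisted(match: str, allowlist: set[str]) -> bool:
    """Exact comparison after trimming whitespace; no prefix or substring matching."""
    return match.strip() in allowlist
```

One consequence of exact matching is that a trailing note on the same line becomes part of the entry: `test_api_key_12345  # fixture` would only match a secret with that literal text, so comments belong on their own lines.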

Output Format

When secrets are detected, findings are logged with full context:

{
  "level": "warn",
  "msg": "secret detected",
  "file": "src/config.py",
  "line": 42,
  "pattern": "GENERIC_API_KEY",
  "match": "sk_live_abc123...",
  "entropy": "4.52",
  "severity": "critical"
}

Field	Description
`file`	Path of the file where the secret was found
`line`	Line number
`pattern`	Detection pattern ID
`match`	Matched secret value (truncated in logs)
`entropy`	Shannon entropy score
`severity`	`critical`, `high`, or `medium`
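
Because findings are structured, they are straightforward to post-process. The Python sketch below assumes each finding is written as a single-line JSON object (as in the command-line examples on this page) and that the captured output may contain other lines; `parse_findings` and `count_by_severity` are illustrative names, not part of the tool.

```python
import json
from collections import Counter


def parse_findings(output: str) -> list[dict]:
    """Extract 'secret detected' records from line-delimited JSON log output."""
    findings = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON log line
        if isinstance(record, dict) and record.get("msg") == "secret detected":
            findings.append(record)
    return findings


def count_by_severity(findings: list[dict]) -> Counter:
    """Tally findings by their severity field."""
    return Counter(finding.get("severity", "unknown") for finding in findings)
```

A summary like this could be used to fail a build only on `critical` findings, although the scanner itself exits with code 2 for any detection.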

CI/CD Integration

GitHub Actions

name: Security Scan

on: [push, pull_request]

jobs:
  scan-secrets:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Install Vectra Guard
        run: |
          curl -fsSL https://raw.githubusercontent.com/xadnavyaai/vectra-guard/main/install.sh | bash
          echo "$HOME/.local/bin" >> $GITHUB_PATH
      
      - name: Scan for secrets
        run: vg scan-secrets --path .

GitLab CI

scan-secrets:
  stage: security
  image: ubuntu:latest
  before_script:
    - apt-get update && apt-get install -y curl
    - curl -fsSL https://raw.githubusercontent.com/xadnavyaai/vectra-guard/main/install.sh | bash
    - export PATH="$HOME/.local/bin:$PATH"
  script:
    - vg scan-secrets --path .
  only:
    - merge_requests
    - main

Pre-commit Hook

Add to .git/hooks/pre-commit:

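#!/bin/bash
# No "set -e" here: the script needs to inspect the scanner's exit code itself.

echo "🔍 Scanning for secrets..."
vg scan-secrets --path .
status=$?

if [ "$status" -eq 2 ]; then
  echo "❌ Secrets detected! Commit blocked."
  echo "   Review findings above or add to .secrets-allowlist.txt"
  exit 1
elif [ "$status" -ne 0 ]; then
  echo "❌ Secret scan failed with exit code $status."
  exit "$status"
fi

echo "✅ No secrets detected"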

Make it executable:

chmod +x .git/hooks/pre-commit

Pre-commit hooks can be bypassed with git commit --no-verify. For production enforcement, use CI/CD pipeline checks.

Best Practices

1. Scan Before Commit

Run secret scanning as part of your development workflow:

# Before committing changes
vg scan-secrets --path .

2. Use Environment Variables

Never hardcode secrets. Use environment variables instead:

# ❌ Bad - hardcoded secret
api_key = "sk_live_abc123"

# ✅ Good - environment variable
import os
api_key = os.getenv("API_KEY")

3. Rotate Exposed Credentials

If secrets are committed:

Rotate immediately: Treat exposed credentials as compromised
Don’t just delete: Secrets remain in Git history
Rewrite history: Use tools like BFG Repo-Cleaner or git-filter-repo to remove the secret from Git history
Force push carefully: Coordinate with team

4. Maintain Allowlists

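Keep allowlists minimal and documented. Put each note on its own comment line: entries are matched exactly, so a trailing comment on the same line would become part of the entry.

# .secrets-allowlist.txt

# Test fixtures - not real credentials
# Used in test/fixtures/config_test.py
test_api_key_12345

# Documentation examples
# README.md example
example_token_abcd1234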

5. Skip Generated Files

The scanner automatically skips common generated files and directories:

.git/, node_modules/, vendor/, .venv/, dist/, build/
Lockfiles: package-lock.json, yarn.lock, poetry.lock, go.sum
Binary files: .png, .jpg, .pdf, .zip, .exe
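
A path-level skip check along these lines might look like the following Python sketch. The sets contain only the names listed above, which the documentation describes as common examples rather than a complete list, and `should_skip` is a hypothetical helper.

```python
from pathlib import PurePosixPath

# Only the names documented above; the scanner's real lists may be longer.
SKIPPED_DIRS = {".git", "node_modules", "vendor", ".venv", "dist", "build"}
SKIPPED_LOCKFILES = {"package-lock.json", "yarn.lock", "poetry.lock", "go.sum"}
SKIPPED_EXTENSIONS = {".png", ".jpg", ".pdf", ".zip", ".exe"}


def should_skip(path: str) -> bool:
    """Return True if a relative POSIX-style path falls under one of the skip rules."""
    p = PurePosixPath(path)
    # Any parent directory component in the skip list excludes the file.
    if any(part in SKIPPED_DIRS for part in p.parts[:-1]):
        return True
    if p.name in SKIPPED_LOCKFILES:
        return True
    return p.suffix.lower() in SKIPPED_EXTENSIONS
```

Checking only the parent components means a source file named `build.py` is still scanned, while everything under a `build/` directory is not.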

Examples

Example 1: Clean Repository

$ vg scan-secrets --path .
{"level":"info","msg":"no secrets detected","path":"."}

Example 2: Detected Secrets

$ vg scan-secrets --path ./src
{"level":"warn","msg":"secret detected","file":"src/config.py","line":15,"pattern":"AWS_ACCESS_KEY_ID","match":"AKIA0123456789ABCDEF","entropy":"3.17","severity":"critical"}
{"level":"warn","msg":"secret detected","file":"src/api.js","line":42,"pattern":"GENERIC_API_KEY","match":"sk_live_abc123def456","entropy":"4.23","severity":"critical"}

Exit code: 2 (secrets detected)

Example 3: With Allowlist

# Create allowlist for test keys
$ cat > .secrets-allowlist.txt <<EOF
test_api_key_for_unit_tests
AKIA0000000000EXAMPLE
EOF

# Scan with allowlist
$ vg scan-secrets --path . --allowlist .secrets-allowlist.txt
{"level":"info","msg":"no secrets detected","path":"."}

Troubleshooting

False Positives

Problem: Legitimate code is flagged as a secret.

Solution: Confirm that the value is not actually a secret, then add it to the allowlist:

# Check the entropy and context
vg scan-secrets --path ./file.py

# If it's a false positive, add to allowlist
echo "the_flagged_value" >> .secrets-allowlist.txt

Missed Secrets

Problem: A known secret is not detected.

Possible causes:

Secret is in a skipped directory (.git/, node_modules/, etc.)
Secret is in a binary or lockfile
Entropy is below threshold (< 3.5) and no known pattern matches

Solution: Report the pattern so it can be added to detectors.

Performance on Large Repos

Problem: Scanning takes too long.

Solution: The scanner already skips vendor directories and binary files. For very large repos, scan only the directories that matter:

# Scan only relevant directories
vg scan-secrets --path ./src
vg scan-secrets --path ./config

Related Commands

vg scan-security: Scan for risky code patterns (command injection, unsafe functions, etc.)
vg audit repo: Comprehensive repository audit (secrets + security + package vulnerabilities)
vg cve scan: Scan dependencies for known vulnerabilities

Implementation Details

Scanner Architecture

The secret scanner (internal/secrets/scanner.go) uses:

Regex-based pattern matching: Known secret formats (AWS keys, private keys, etc.)
Shannon entropy calculation: Measures randomness of strings
Context detection: Looks for assignment patterns and secret-related keywords
False positive filtering: Removes UUIDs, paths, slugs, and code identifiers
Binary file detection: Skips non-UTF-8 files with NUL bytes
Smart file skipping: Ignores vendor dirs, lockfiles, and common build artifacts

Location: cmd/scan_secrets.go:14
Core logic: internal/secrets/scanner.go:80
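
The binary file check is simple enough to sketch. The real scanner is written in Go; the Python below only illustrates the NUL-byte heuristic, and both the function name and the sample size are assumptions.

```python
def looks_binary(data: bytes, sample_size: int = 8000) -> bool:
    """Treat content as binary if a NUL byte appears in the leading sample."""
    # Text files in UTF-8 or ASCII practically never contain NUL bytes,
    # so one occurrence is a cheap and reliable signal.
    return b"\x00" in data[:sample_size]
```

Inspecting only a leading sample keeps the check fast on large files, at the cost of missing binary content that starts with a long run of text.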

Security Considerations

Important Limitations

Not a silver bullet: Secret scanning detects common patterns but may miss:
- Obfuscated secrets (base64, hex-encoded)
- Secrets split across multiple lines
- API keys with low entropy
- Domain-specific secret formats
Git history: Scanning only checks current files. Secrets in Git history require specialized tools (BFG Repo-Cleaner, git-filter-repo).
Already committed secrets: If secrets were pushed to a remote repository, consider them compromised immediately, even after removal.

Defense in Depth:

Use secret scanning as one layer of security
Combine with secret management tools (Vault, AWS Secrets Manager, etc.)
Implement least-privilege access controls
Rotate credentials regularly
Monitor for unauthorized API usage
