Logging - Secure MCP Gateway

Overview

The Secure MCP Gateway implements structured logging with contextual information for comprehensive debugging, auditing, and monitoring. Logs are exported via OpenTelemetry to Loki for aggregation and analysis in Grafana.

Logging Architecture

┌─────────────────────────────────────┐
│ Secure MCP Gateway                  │
│  ├── Lazy Logger (utils.py)        │
│  ├── Structured Context             │
│  └── Log Level Filtering            │
└─────────────┬───────────────────────┘
              │
              ▼
┌─────────────────────────────────────┐
│ OpenTelemetry Provider              │
│  ├── OTLPLogExporter (gRPC/HTTP)   │
│  ├── BatchLogRecordProcessor        │
│  └── LoggerProvider                 │
└─────────────┬───────────────────────┘
              │
              ▼
┌─────────────────────────────────────┐
│ OpenTelemetry Collector             │
│  ├── OTLP Receiver                  │
│  ├── Batch Processor                │
│  └── Loki Exporter (OTLP HTTP)      │
└─────────────┬───────────────────────┘
              │
              ▼
┌─────────────────────────────────────┐
│ Loki                                │
│  ├── TSDB Storage                   │
│  ├── Label Indexing                 │
│  └── Query API                      │
└─────────────┬───────────────────────┘
              │
              ▼
┌─────────────────────────────────────┐
│ Grafana (Loki Datasource)           │
│  ├── LogQL Queries                  │
│  ├── Log Browser                    │
│  └── Live Tail                      │
└─────────────────────────────────────┘

Log Levels

The gateway supports standard Python logging levels:

DEBUG

Detailed diagnostic information for troubleshooting. Use sparingly in production. Examples:

Cache lookups
Configuration loading
Detailed request/response data

INFO

General operational events. Default level for production. Examples:

Tool execution started/completed
Authentication success
Server discovery

WARNING

Unexpected but handled situations that may require attention. Examples:

Cache misses
Slow operations (approaching timeout)
Deprecated API usage

ERROR

Error conditions that prevented an operation from completing. Examples:

Tool execution failures
Authentication failures
Guardrail API errors

CRITICAL

Severe errors requiring immediate attention. Examples:

Gateway initialization failure
Critical system resource exhaustion
Security breach attempts

Configuring Log Level

Set the log level in enkrypt_mcp_config.json:

{
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "INFO"
  }
}

Options: DEBUG, INFO, WARNING, ERROR, CRITICAL
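
As a rough illustration of how a value such as "INFO" could be turned into a Python logging level, the sketch below reads the enkrypt_log_level key from a config file. The helper name load_log_level and the fallback to INFO for missing or unknown values are assumptions made for this example; the gateway's own config loader is not shown in this document and may behave differently.

```python
import json
import logging
from pathlib import Path

VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def load_log_level(config_path, default="INFO"):
    """Return the numeric logging level named in the config file.

    Hypothetical helper: falls back to `default` when the key is missing
    or holds a value outside VALID_LEVELS.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    common = config.get("common_mcp_gateway_config", {})
    name = str(common.get("enkrypt_log_level", default)).upper()
    if name not in VALID_LEVELS:
        name = default
    # logging.DEBUG, logging.INFO, ... are plain integer constants
    return getattr(logging, name)
```

The returned integer can be passed to setLevel() on any standard library logger or handler.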

Structured Logging

Lazy Logger Pattern

The gateway uses a lazy logger to avoid circular imports during initialization.

Location: src/secure_mcp_gateway/utils.py:63

class LazyLogger:
    """Lazy logger wrapper used by application modules."""
    
    def __getattr__(self, name):
        logger = get_logger()
        if logger:
            return getattr(logger, name)
        # No-op if logger not available
        return lambda *args, **kwargs: None

logger = LazyLogger()
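
The pattern can be tried in isolation. In the sketch below, get_logger and init_logger are stand-ins invented for the example (the gateway's real get_logger is not shown here); only the LazyLogger class mirrors the excerpt above.

```python
import logging

_logger = None  # stand-in for the gateway's real logger state


def get_logger():
    """Stand-in: return the configured logger, or None before setup."""
    return _logger


def init_logger(name="demo"):
    """Stand-in for whatever initialization the application performs."""
    global _logger
    _logger = logging.getLogger(name)
    return _logger


class LazyLogger:
    """Resolves the real logger on every attribute access."""

    def __getattr__(self, name):
        logger = get_logger()
        if logger:
            return getattr(logger, name)
        # No-op callable while no logger is available
        return lambda *args, **kwargs: None


logger = LazyLogger()
```

Calling logger.info(...) before init_logger() silently does nothing; afterwards the same call reaches the real logger. One consequence of this design is that any attribute name, including a misspelled method, also resolves to the no-op until a logger exists, so typos only surface after initialization.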

Using the Logger

Basic Usage:

from secure_mcp_gateway.utils import logger

# Simple log
logger.info("Gateway started")

# With context
logger.info(
    "Tool execution completed",
    extra={
        "server_name": "github_server",
        "tool_name": "create_issue",
        "duration_ms": 250
    }
)

Log Context Structure

The gateway uses the build_log_extra() function to create structured context.

Location: src/secure_mcp_gateway/utils.py:352

from typing import Dict

def build_log_extra(
    ctx,
    custom_id,
    server_name,
    error=None,
    **kwargs
) -> Dict:
    """Build structured log context with all relevant fields."""
    extra = {
        "custom_id": custom_id,
        "server_name": server_name,
    }
    
    # Add gateway config info
    if hasattr(ctx, 'gateway_config') and ctx.gateway_config:
        extra.update({
            "project_id": ctx.gateway_config.get("project_id"),
            "project_name": ctx.gateway_config.get("project_name"),
            "user_id": ctx.gateway_config.get("user_id"),
            "email": ctx.gateway_config.get("email"),
            "mcp_config_id": ctx.gateway_config.get("mcp_config_id"),
        })
    
    # Add error if present
    if error:
        extra["error"] = str(error)
    
    # Add custom fields
    extra.update(kwargs)
    
    return extra

Example Usage:

from secure_mcp_gateway.utils import logger, build_log_extra

extra = build_log_extra(
    ctx=ctx,
    custom_id="abc123_1234567890",
    server_name="github_server",
    tool_name="create_issue",
    duration_ms=250,
    success=True
)

logger.info("Tool executed successfully", extra=extra)

Standard Context Fields

Logs include these standard fields when available:

Field	Type	Description
`custom_id`	string	Request correlation ID (34 chars + timestamp)
`server_name`	string	MCP server name
`tool_name`	string	Tool being executed
`project_id`	string	Project UUID
`project_name`	string	Project name
`user_id`	string	User UUID
`email`	string	User email (masked in sensitive contexts)
`mcp_config_id`	string	Configuration UUID
`duration_ms`	int	Operation duration in milliseconds
`success`	boolean	Operation success status
`error`	string	Error message if failed
`error_type`	string	Error classification
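
In Python's logging module, keys passed through extra become attributes on the LogRecord, and they only reach the output if a formatter or exporter reads them. The sketch below shows one way to serialize those attributes as JSON. JsonExtraFormatter is a hypothetical class written for this example; the gateway itself exports records through OpenTelemetry, which is not what is shown here.

```python
import json
import logging

# Attribute names every LogRecord carries; anything else came from `extra`.
_RESERVED = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {
    "message",
    "asctime",
}


class JsonExtraFormatter(logging.Formatter):
    """Emit level, message, and all `extra` fields as one JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        for key, value in record.__dict__.items():
            if key not in _RESERVED:
                payload[key] = value
        # default=str keeps non-JSON types (UUIDs, datetimes) from raising
        return json.dumps(payload, default=str)
```

Attaching this formatter to a logging.StreamHandler is enough to see fields such as tool_name and duration_ms appear next to the message during local debugging.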

Log Aggregation with Loki

Loki Configuration

Location: infra/loki/loki-config.yaml

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  allow_structured_metadata: true
  reject_old_samples: true
  reject_old_samples_max_age: 168h

Collector Export to Loki

Location: infra/otel_collector/otel-collector-config.yaml

exporters:
  otlphttp/loki:
    endpoint: "http://loki:3100/otlp"
    tls:
      insecure: true

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki, debug]

Accessing Loki

API: http://localhost:3100
Ready Check: http://localhost:3100/ready
Metrics: http://localhost:3100/metrics

Querying Logs in Grafana

LogQL Basics

Loki uses LogQL for querying logs:

# All logs from gateway
{service_name="secure-mcp-gateway"}

# Filter by log level
{service_name="secure-mcp-gateway"} |= "level=ERROR"

# Filter by server name
{service_name="secure-mcp-gateway"} | json | server_name="github_server"

# Filter by tool name
{service_name="secure-mcp-gateway"} | json | tool_name="create_issue"

# Search for errors
{service_name="secure-mcp-gateway"} |~ "(?i)error|exception|failed"

Advanced Queries

# Tool execution logs with duration > 1s
{service_name="secure-mcp-gateway"} 
  | json 
  | duration_ms > 1000

# Authentication failures
{service_name="secure-mcp-gateway"} 
  | json 
  | level="ERROR" 
  | message=~".*authentication.*failed.*"

# Guardrail violations
{service_name="secure-mcp-gateway"} 
  | json 
  | message=~".*guardrail.*violation.*"

# Logs for specific user
{service_name="secure-mcp-gateway"} 
  | json 
  | user_id="user-123-456"

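# Per-second rate of error lines, averaged over a 1-minute window
# (use count_over_time instead of rate for a count per minute)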
sum(rate(
  {service_name="secure-mcp-gateway"} |= "level=ERROR" [1m]
))
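
When queries like the field filters above are generated from code (for example in a script or a test), composing them with a small helper avoids quoting mistakes. The function below is a hypothetical convenience written for this example, not part of the gateway or of any Loki client library. It covers equality filters on string and numeric values only.

```python
import json


def build_logql(service_name, **field_filters):
    """Compose a LogQL query: stream selector, `| json`, then one
    equality filter per keyword argument."""
    # json.dumps yields a double-quoted string literal with escaping applied
    query = "{service_name=" + json.dumps(service_name) + "} | json"
    for field, value in field_filters.items():
        if isinstance(value, str):
            query += f" | {field}={json.dumps(value)}"
        else:
            query += f" | {field}={value}"
    return query
```

For instance, build_logql("secure-mcp-gateway", tool_name="create_issue") reproduces the tool-name query shown under LogQL Basics.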

Accessing Grafana Explore

Open Grafana: http://localhost:3000
Navigate to Explore (compass icon)
Select Loki datasource
Enter LogQL query
Click “Run query”

Live Tail

View logs in real-time:

Grafana → Explore
Select Loki
Click “Live” button
Enter query: {service_name="secure-mcp-gateway"}
Logs stream in real-time

Log Format Examples

Tool Execution Log

{
  "timestamp": "2026-03-04T07:15:23.123Z",
  "level": "INFO",
  "message": "Tool executed successfully",
  "service_name": "secure-mcp-gateway",
  "custom_id": "abc123xyz789_1709533523",
  "server_name": "github_server",
  "tool_name": "create_issue",
  "project_id": "proj-123-456",
  "project_name": "MyProject",
  "user_id": "user-789-012",
  "email": "user@example.com",
  "mcp_config_id": "config-345-678",
  "duration_ms": 250,
  "success": true
}

Guardrail Violation Log

{
  "timestamp": "2026-03-04T07:15:24.456Z",
  "level": "WARNING",
  "message": "Input guardrail violation detected",
  "service_name": "secure-mcp-gateway",
  "custom_id": "def456uvw012_1709533524",
  "server_name": "github_server",
  "tool_name": "delete_repo",
  "violation_type": "policy_violation",
  "detector": "policy_detector",
  "blocked": true,
  "project_id": "proj-123-456",
  "user_id": "user-789-012"
}

Error Log

{
  "timestamp": "2026-03-04T07:15:25.789Z",
  "level": "ERROR",
  "message": "Tool execution failed",
  "service_name": "secure-mcp-gateway",
  "custom_id": "ghi789rst345_1709533525",
  "server_name": "github_server",
  "tool_name": "create_issue",
  "error": "Connection timeout after 30s",
  "error_type": "TimeoutError",
  "duration_ms": 30001,
  "success": false,
  "project_id": "proj-123-456",
  "user_id": "user-789-012"
}

Logging Best Practices

Always Include Context

Use structured logging with contextual fields:

# Good
logger.info(
    "Tool executed",
    extra={
        "server_name": server_name,
        "tool_name": tool_name,
        "duration_ms": duration
    }
)

# Bad
logger.info(f"Tool {tool_name} on {server_name} took {duration}ms")

Structured logs enable powerful filtering and analysis.

Use Appropriate Log Levels

DEBUG: Detailed diagnostic info (cache lookups, config loading)
INFO: Normal operations (tool execution, auth success)
WARNING: Unexpected but handled (cache miss, slow operation)
ERROR: Operation failures (tool error, auth failure)
CRITICAL: Severe errors (gateway crash, security breach)

Mask Sensitive Data

Always mask sensitive information:

from secure_mcp_gateway.utils import mask_sensitive_data

# Mask before logging
safe_data = mask_sensitive_data({
    "api_key": "secret123",
    "password": "pass456"
})
logger.info("Config loaded", extra=safe_data)

The mask_sensitive_data function masks keys like: token, key, secret, password, auth, etc.
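
To make the idea concrete, here is a minimal stand-in that masks values by key name. It is not the gateway's mask_sensitive_data: the name mask_sensitive, the marker list, and the "****" placeholder are assumptions for illustration, and the real function's matching rules and output format may differ.

```python
SENSITIVE_MARKERS = ("token", "key", "secret", "password", "auth")


def mask_sensitive(data, mask="****"):
    """Return a copy of `data` with sensitive-looking values replaced.

    A key counts as sensitive when its lowercased name contains one of
    SENSITIVE_MARKERS. Nested dicts are processed recursively.
    """
    masked = {}
    for key, value in data.items():
        if any(marker in str(key).lower() for marker in SENSITIVE_MARKERS):
            masked[key] = mask
        elif isinstance(value, dict):
            masked[key] = mask_sensitive(value, mask)
        else:
            masked[key] = value
    return masked
```

Substring matching errs on the side of over-masking (a key such as "keyboard_layout" would also be hidden), which is usually the safer failure mode for logs.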

Use Correlation IDs

Always include custom_id for request tracing:

from secure_mcp_gateway.utils import generate_custom_id

custom_id = generate_custom_id()  # "abc123xyz789_1709533523"

logger.info("Request started", extra={"custom_id": custom_id})
# ... operations ...
logger.info("Request completed", extra={"custom_id": custom_id})

This enables tracking requests across all logs.
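
Passing custom_id by hand on every call is easy to forget. One alternative, sketched here with the standard library only, stores the ID in a context variable and lets a logging filter attach it to each record. The names current_custom_id and CustomIdFilter are hypothetical, and this is a different technique from the explicit extra={...} approach the gateway examples use.

```python
import contextvars
import logging

# Holds the correlation ID for the request currently being handled.
current_custom_id = contextvars.ContextVar("current_custom_id", default=None)


class CustomIdFilter(logging.Filter):
    """Attach the active custom_id to records that do not already have one."""

    def filter(self, record):
        if not hasattr(record, "custom_id"):
            record.custom_id = current_custom_id.get()
        return True  # never drops a record, only annotates it
```

A request handler would call current_custom_id.set(...) once at the start of a request and add the filter to its logger or handler. Because context variables are scoped per asyncio task and per thread, concurrent requests do not overwrite each other's IDs.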

Log at Decision Points

Log important decisions and branches:

if guardrail_result.blocked:
    logger.warning(
        "Tool call blocked by guardrail",
        extra={
            "server_name": server_name,
            "tool_name": tool_name,
            "reason": guardrail_result.reason
        }
    )
else:
    logger.info("Tool call allowed, executing...")

Include Timing Information

Log operation durations:

import time

start_time = time.perf_counter()  # monotonic clock, unaffected by system clock adjustments
# ... operation ...
duration_ms = int((time.perf_counter() - start_time) * 1000)

logger.info(
    "Operation completed",
    extra={"duration_ms": duration_ms}
)
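
The same measurement can be wrapped in a context manager so that the duration is logged even when the operation raises. log_duration is a hypothetical helper for this example; it accepts any object with an info() method, such as a standard library logger.

```python
import time
from contextlib import contextmanager


@contextmanager
def log_duration(log, message, **extra):
    """Log `message` with duration_ms and success once the block exits."""
    start = time.perf_counter()
    success = True
    try:
        yield
    except Exception:
        success = False
        raise  # the caller still sees the original exception
    finally:
        duration_ms = int((time.perf_counter() - start) * 1000)
        log.info(
            message,
            extra={**extra, "duration_ms": duration_ms, "success": success},
        )
```

Used as "with log_duration(logger, "Operation completed", tool_name=tool_name): ...", this produces the duration_ms and success fields listed under Standard Context Fields.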

Log Retention and Management

Retention Configuration

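Configure retention in Loki under limits_config. The retention_period setting is enforced by the compactor, so it takes effect only when the compactor runs with retention_enabled: true: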

limits_config:
  retention_period: 168h  # 7 days
  reject_old_samples: true
  reject_old_samples_max_age: 168h

Compaction

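The Loki compactor merges index files to save space and, when retention is enabled, deletes expired data. Configure it in loki-config.yaml:

compactor:
  working_directory: /tmp/loki/compactor
  compaction_interval: 10m
  retention_enabled: true            # required for limits_config.retention_period to take effect
  delete_request_store: filesystem   # Loki 3.x; replaces the shared_store setting removed in 3.0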

Log Volume Management

Reduce log volume:

Increase log level: Use WARNING or ERROR in production
Sample logs: Log only a percentage of requests
Filter before export: Use collector processors to filter low-value logs

Troubleshooting

Logs Not Appearing in Loki

Check Loki is running:
```
curl http://localhost:3100/ready
```
Verify collector exports to Loki:
```
docker logs otel-collector | grep loki
```

Check gateway logs are being exported:

docker logs otel-collector | grep "logs"

Test Loki API:

curl -G -s "http://localhost:3100/loki/api/v1/query" \
  --data-urlencode 'query={service_name="secure-mcp-gateway"}'
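
The same check can be scripted in Python with the standard library. The helper names loki_query_url and query_loki are made up for this example; the endpoint path and the localhost:3100 address come from the curl command above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def loki_query_url(logql, base_url="http://localhost:3100"):
    """Build the instant-query URL, URL-encoding the LogQL expression."""
    return f"{base_url}/loki/api/v1/query?{urlencode({'query': logql})}"


def query_loki(logql, base_url="http://localhost:3100", timeout=5):
    """Send the query and return the decoded JSON response body."""
    with urlopen(loki_query_url(logql, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling query_loki('{service_name="secure-mcp-gateway"}') requires a running Loki instance; a connection error at this point suggests Loki is not reachable at the given address.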

Logs Not Structured

Symptom: Logs appear as plain text instead of JSON

Cause: Context is formatted into the message string instead of being passed through the extra parameter

Solution:

# Before
logger.info(f"Tool {tool_name} executed")  # ❌

# After
logger.info("Tool executed", extra={"tool_name": tool_name})  # ✅

High Log Volume

Symptom: Excessive disk usage, slow queries

Solutions:

Increase log level to WARNING
Reduce DEBUG logs in production
Configure log sampling
Reduce retention period

Cannot Query by Field

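Symptom: LogQL queries by field don’t work

Cause: The log line has not been parsed, so its JSON fields are not available as labels

Solution: Add the json parser stage before the field filter: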

# Before
{service_name="secure-mcp-gateway"} | server_name="github"  # ❌

# After
{service_name="secure-mcp-gateway"} | json | server_name="github"  # ✅

Next Steps

Metrics

Explore Prometheus metrics and Grafana dashboards

OpenTelemetry Setup

Configure OTLP export and distributed tracing

Overview

Return to observability overview

Troubleshooting

Common issues and solutions
