Connecting to S3 Databases

DuckDB can directly query databases stored in S3, providing cloud-native analytics without downloading files.

Basic S3 Connection

{
  "mcpServers": {
    "DuckDB (S3)": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path", "s3://your-bucket/path/to/database.duckdb"
      ],
      "env": {
        "AWS_ACCESS_KEY_ID": "your_access_key",
        "AWS_SECRET_ACCESS_KEY": "your_secret_key",
        "AWS_DEFAULT_REGION": "us-east-1"
      }
    }
  }
}
S3 databases are automatically attached as read-only to prevent accidental modifications to cloud storage.

AWS Credential Configuration

Static Credentials

Provide AWS access keys directly:
{
  "env": {
    "AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE",
    "AWS_SECRET_ACCESS_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "AWS_DEFAULT_REGION": "us-east-1"
  }
}

IAM Roles and Session Tokens

For temporary credentials from IAM roles, SSO, or EC2 instance profiles:
{
  "env": {
    "AWS_SESSION_TOKEN": "FwoGZXIvYXdzEBYaD...",
    "AWS_DEFAULT_REGION": "us-west-2"
  }
}
When AWS_SESSION_TOKEN is present, DuckDB uses the credential_chain provider to automatically fetch credentials from:
  • Environment variables
  • AWS configuration files (~/.aws/credentials)
  • IAM instance profiles (EC2)
  • AWS SSO
  • IAM roles

Environment Variables Reference

Variable              | Required                   | Description
AWS_ACCESS_KEY_ID     | For static credentials     | AWS access key
AWS_SECRET_ACCESS_KEY | For static credentials     | AWS secret key
AWS_SESSION_TOKEN     | For temporary credentials  | Session token for IAM roles
AWS_DEFAULT_REGION    | Recommended                | AWS region (defaults to us-east-1)

Querying S3 Databases

Basic Queries

{
  "name": "execute_query",
  "arguments": {
    "sql": "SELECT 1 as num"
  }
}

List Tables in S3 Database

{
  "name": "list_tables",
  "arguments": {
    "database": "s3db"
  }
}

Query S3 Database Tables

SELECT 
  product_id,
  SUM(quantity) as total_quantity,
  SUM(amount) as total_revenue
FROM s3db.main.sales
WHERE sale_date >= '2024-01-01'
GROUP BY product_id
ORDER BY total_revenue DESC
LIMIT 10
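
Any SQL example on this page can be sent through the execute_query tool. For instance, the aggregation above as a tool call:
{
  "name": "execute_query",
  "arguments": {
    "sql": "SELECT product_id, SUM(quantity) AS total_quantity, SUM(amount) AS total_revenue FROM s3db.main.sales WHERE sale_date >= '2024-01-01' GROUP BY product_id ORDER BY total_revenue DESC LIMIT 10"
  }
}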

Read-Only Operations

S3 databases are attached with READ_ONLY flag to protect cloud data.

Allowed Operations

SELECT * FROM s3db.main.sales WHERE region = 'US';
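
Other read-only statements work as well, for example aggregation and schema inspection:
SELECT region, COUNT(*) AS order_count FROM s3db.main.sales GROUP BY region;
DESCRIBE s3db.main.sales;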

Blocked Operations

Write operations will fail:
{
  "name": "execute_query",
  "arguments": {
    "sql": "CREATE TABLE s3db.main.new_table (id INT)"
  }
}
INSERT, UPDATE, DELETE, and CREATE operations are not supported on S3 databases to prevent accidental modifications.

S3 Path Formats

Standard S3 Path

s3://bucket-name/path/to/database.duckdb

Examples

s3://my-analytics-bucket/database.duckdb
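
Nested prefixes work the same way, as in the data-lake configuration later on this page:
s3://company-datalake/analytics/production.duckdb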

Attaching Multiple S3 Databases

Query multiple S3 databases in a single session:

Switch Between S3 Databases

Requires the --allow-switch-databases flag:
{
  "command": "uvx",
  "args": [
    "mcp-server-motherduck",
    "--db-path", "s3://bucket1/db1.duckdb",
    "--allow-switch-databases"
  ]
}
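
With the flag enabled, switching goes through the switch_database_connection tool (see the multi-region example below). A request might look like this; the exact argument name is an assumption:
{
  "name": "switch_database_connection",
  "arguments": {
    "database": "s3://bucket2/db2.duckdb"
  }
}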

Attach Additional S3 Databases

Without switching, attach additional databases:
ATTACH 's3://bucket2/analytics.duckdb' AS analytics_db (READ_ONLY);
ATTACH 's3://bucket3/warehouse.duckdb' AS warehouse_db (READ_ONLY);
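
DuckDB’s SHOW DATABASES lists everything currently attached, which is a quick way to confirm the aliases:
SHOW DATABASES;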

Query Across Multiple S3 Databases

SELECT 
  'sales' as source,
  COUNT(*) as record_count
FROM s3db.main.transactions

UNION ALL

SELECT 
  'analytics' as source,
  COUNT(*) as record_count
FROM analytics_db.main.events

UNION ALL

SELECT 
  'warehouse' as source,
  COUNT(*) as record_count
FROM warehouse_db.main.inventory

Querying Other S3 Data Formats

Besides DuckDB databases, query raw data files directly from S3:

S3 Parquet Files

SELECT * FROM read_parquet('s3://my-bucket/data/*.parquet')
WHERE date >= '2024-01-01'
LIMIT 1000;

S3 CSV Files

SELECT * FROM read_csv_auto('s3://my-bucket/exports/data.csv')
LIMIT 100;

S3 JSON Files

SELECT * FROM read_json_auto('s3://my-bucket/logs/*.json')
WHERE event_type = 'error'
ORDER BY timestamp DESC
LIMIT 50;
Querying raw S3 files (Parquet, CSV, JSON) doesn’t require a DuckDB database file in S3; DuckDB reads them directly with its httpfs extension.
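
If files follow a Hive-style partition layout, DuckDB can prune partitions before reading. The bucket layout and partition columns below are hypothetical:
SELECT *
FROM read_parquet('s3://my-bucket/data/year=*/month=*/*.parquet', hive_partitioning = true)
WHERE year = 2024 AND month = 6
LIMIT 1000;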

Hybrid S3 + Local Queries

Combine S3 data with local processing:

Export S3 Data to Local Database

-- Create local copy of S3 data
CREATE TABLE local_sales AS 
SELECT * FROM s3db.main.sales 
WHERE sale_date >= '2024-01-01';

Join S3 and Local Data

-- Attach local database alongside S3
ATTACH '/local/path/customer_db.duckdb' AS local_db (READ_ONLY);

-- Join across S3 and local
SELECT 
  s.order_id,
  s.amount,
  c.customer_name,
  c.segment
FROM s3db.main.sales s
JOIN local_db.main.customers c ON s.customer_id = c.id
WHERE s.sale_date >= CURRENT_DATE - INTERVAL 30 DAYS;

Aggregate S3 Data Locally

-- Process large S3 dataset, store results locally  
CREATE TABLE daily_summary AS
SELECT 
  DATE_TRUNC('day', sale_timestamp) as sale_date,
  region,
  COUNT(*) as order_count,
  SUM(amount) as total_revenue
FROM s3db.main.sales
GROUP BY sale_date, region
ORDER BY sale_date DESC;
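
Because daily_summary is now local, follow-up queries avoid S3 round-trips entirely:
SELECT sale_date, SUM(total_revenue) AS revenue
FROM daily_summary
GROUP BY sale_date
ORDER BY sale_date DESC
LIMIT 7;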

Performance Considerations

Minimize Data Transfer

Use WHERE Clauses

Filter data in S3 before transferring to reduce network usage.

Aggregate in S3

Use GROUP BY and aggregations to summarize before downloading.

LIMIT Results

Always use LIMIT for exploratory queries.

Column Projection

Select only needed columns, not SELECT *.

Example: Optimized S3 Query

-- Project, filter, and limit so less data leaves S3 (instead of SELECT *)
SELECT product_id, amount, sale_date
FROM s3db.main.large_table
WHERE sale_date >= '2024-01-01'
LIMIT 1000;

Connection Configuration

The S3 connection is managed internally by the database client:
1. Create In-Memory Connection

When connecting to an S3 database, DuckDB creates an in-memory connection first.

2. Load httpfs Extension

The httpfs extension is installed and loaded for S3 support.

3. Configure Credentials

AWS credentials are configured using CREATE SECRET:
CREATE SECRET IF NOT EXISTS s3_secret (
  TYPE S3,
  KEY_ID 'your_access_key',
  SECRET 'your_secret_key',
  REGION 'us-east-1'
);

4. Attach S3 Database

The S3 database is attached as read-only:
ATTACH 's3://bucket/path/db.duckdb' AS s3db (READ_ONLY);
USE s3db;

Credential Chain Provider

When AWS_SESSION_TOKEN is present, DuckDB uses the credential_chain provider:
CREATE SECRET IF NOT EXISTS s3_secret (
  TYPE S3,
  PROVIDER credential_chain,
  REGION 'us-east-1'
);
This automatically fetches credentials from:
  1. Environment variables
  2. AWS configuration files
  3. IAM instance profiles (EC2)
  4. IAM roles (ECS, Lambda)
  5. AWS SSO sessions

Real-World Examples

Analytics on S3 Data Lake

{
  "mcpServers": {
    "S3 Data Lake": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path", "s3://company-datalake/analytics/production.duckdb",
        "--max-rows", "10000"
      ],
      "env": {
        "AWS_SESSION_TOKEN": "...",
        "AWS_DEFAULT_REGION": "us-west-2"
      }
    }
  }
}
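
With --max-rows capping result size, exploratory queries against the lake stay bounded. A sketch, assuming a hypothetical events table:
SELECT event_type, COUNT(*) AS events
FROM s3db.main.events
GROUP BY event_type
ORDER BY events DESC;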

Multi-Region S3 Databases

{
  "mcpServers": {
    "Multi-Region Analytics": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path", "s3://us-data/analytics.duckdb",
        "--allow-switch-databases"
      ],
      "env": {
        "AWS_ACCESS_KEY_ID": "...",
        "AWS_SECRET_ACCESS_KEY": "...",
        "AWS_DEFAULT_REGION": "us-east-1"
      }
    }
  }
}
Query multiple regions:
-- US data
SELECT 'US' as region, COUNT(*) FROM s3db.main.users;

-- Switch to EU database
-- Use switch_database_connection to s3://eu-data/analytics.duckdb

-- EU data
SELECT 'EU' as region, COUNT(*) FROM s3db.main.users;

Troubleshooting

Authentication Errors

Error: “Failed to attach S3 database: Unable to connect to S3”
Solution: Verify AWS credentials are set correctly:
# Test credentials
aws s3 ls s3://your-bucket/

# Check environment
echo $AWS_ACCESS_KEY_ID
echo $AWS_DEFAULT_REGION

Database Not Found

Error: “database does not exist”
Solution:
  1. Verify the S3 path is correct
  2. Check bucket permissions allow s3:GetObject
  3. Ensure the file exists:
aws s3 ls s3://bucket/path/database.duckdb

Permission Denied

Error: “Access Denied” or “403 Forbidden”
Solution: Ensure your IAM user/role has the required permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket/*",
        "arn:aws:s3:::your-bucket"
      ]
    }
  ]
}

Write Operation Blocked

Error: “Cannot execute statement of type INSERT/CREATE in read-only mode”
Solution: This is expected; S3 databases are read-only. To modify data:
  1. Copy the data to a local database (sketched below)
  2. Make modifications locally
  3. Upload the modified database back to S3 (outside of MCP)
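
A minimal sketch of steps 1 and 2, assuming a hypothetical local path and the sales table used elsewhere on this page:
-- 1. Copy the S3 table into a writable local database
ATTACH '/local/path/work.duckdb' AS work_db;
CREATE TABLE work_db.main.sales AS
SELECT * FROM s3db.main.sales;

-- 2. Modify the local copy
UPDATE work_db.main.sales SET region = 'US' WHERE region = 'USA';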

Region Mismatch

Symptom: Slow queries or connection timeouts
Solution: Ensure AWS_DEFAULT_REGION matches your bucket’s region:
{
  "env": {
    "AWS_DEFAULT_REGION": "us-west-2"
  }
}
