Connecting to S3 Databases
DuckDB can directly query databases stored in S3, providing cloud-native analytics without downloading files.
Basic S3 Connection
{
  "mcpServers": {
    "DuckDB (S3)": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path", "s3://your-bucket/path/to/database.duckdb"
      ],
      "env": {
        "AWS_ACCESS_KEY_ID": "your_access_key",
        "AWS_SECRET_ACCESS_KEY": "your_secret_key",
        "AWS_DEFAULT_REGION": "us-east-1"
      }
    }
  }
}
S3 databases are automatically attached as read-only to prevent accidental modifications to cloud storage.
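Under the hood, this corresponds to DuckDB's ATTACH with the READ_ONLY option (shown in full under Connection Configuration below):
ATTACH 's3://your-bucket/path/to/database.duckdb' AS s3db (READ_ONLY);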
AWS Credential Configuration
Static Credentials
Provide AWS access keys directly:
{
  "env": {
    "AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE",
    "AWS_SECRET_ACCESS_KEY": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "AWS_DEFAULT_REGION": "us-east-1"
  }
}
IAM Roles and Session Tokens
For temporary credentials from IAM roles, SSO, or EC2 instance profiles:
{
  "env": {
    "AWS_SESSION_TOKEN": "FwoGZXIvYXdzEBYaD...",
    "AWS_DEFAULT_REGION": "us-west-2"
  }
}
When AWS_SESSION_TOKEN is present, DuckDB uses the credential_chain provider to automatically fetch credentials from:
Environment variables
AWS configuration files (~/.aws/credentials)
IAM instance profiles (EC2)
AWS SSO
IAM roles
Environment Variables Reference
Variable | Required | Description
AWS_ACCESS_KEY_ID | For static credentials | AWS access key
AWS_SECRET_ACCESS_KEY | For static credentials | AWS secret key
AWS_SESSION_TOKEN | For temporary credentials | Session token for IAM roles
AWS_DEFAULT_REGION | Recommended | AWS region (defaults to us-east-1)
Querying S3 Databases
Basic Queries
{
  "name": "execute_query",
  "arguments": {
    "sql": "SELECT 1 as num"
  }
}
List Tables in S3 Database
{
  "name": "list_tables",
  "arguments": {
    "database": "s3db"
  }
}
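To inspect a table's columns before querying it, a DESCRIBE statement can be sent through execute_query the same way (the table name here is illustrative):
DESCRIBE s3db.main.sales;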
Query S3 Database Tables
SELECT
  product_id,
  SUM(quantity) as total_quantity,
  SUM(amount) as total_revenue
FROM s3db.main.sales
WHERE sale_date >= '2024-01-01'
GROUP BY product_id
ORDER BY total_revenue DESC
LIMIT 10;
Read-Only Operations
S3 databases are attached with the READ_ONLY flag to protect cloud data.
Allowed Operations
SELECT * FROM s3db.main.sales WHERE region = 'US';
Blocked Operations
Write operations will fail. For example, this attempted write:
{
  "name": "execute_query",
  "arguments": {
    "sql": "CREATE TABLE s3db.main.new_table (id INT)"
  }
}
returns an error such as “Cannot execute statement of type CREATE in read-only mode” (see Troubleshooting below). INSERT, UPDATE, DELETE, and CREATE operations are not supported on S3 databases, which prevents accidental modifications.
Standard S3 Path
s3://bucket-name/path/to/database.duckdb
Examples
Root of bucket:
s3://my-analytics-bucket/database.duckdb
Nested paths follow the standard form shown above.
Attaching Multiple S3 Databases
Query multiple S3 databases in a single session:
Switch Between S3 Databases
Switching requires the --allow-switch-databases flag. Start from an initial connection, then use the switch_database_connection tool to point the session at another S3 database:
{
  "command": "uvx",
  "args": [
    "mcp-server-motherduck",
    "--db-path", "s3://bucket1/db1.duckdb",
    "--allow-switch-databases"
  ]
}
Attach Additional S3 Databases
Without switching, attach additional databases:
ATTACH 's3://bucket2/analytics.duckdb' AS analytics_db (READ_ONLY);
ATTACH 's3://bucket3/warehouse.duckdb' AS warehouse_db (READ_ONLY);
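To confirm what is attached, or to drop an attachment you no longer need, the usual DuckDB statements apply (database names match the examples above):
SHOW DATABASES;
DETACH warehouse_db;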
Query Across Multiple S3 Databases
SELECT
  'sales' as source,
  COUNT(*) as record_count
FROM s3db.main.transactions
UNION ALL
SELECT
  'analytics' as source,
  COUNT(*) as record_count
FROM analytics_db.main.events
UNION ALL
SELECT
  'warehouse' as source,
  COUNT(*) as record_count
FROM warehouse_db.main.inventory;
Querying Raw S3 Files
Besides DuckDB databases, you can query raw data files directly from S3:
S3 Parquet Files
SELECT * FROM read_parquet('s3://my-bucket/data/*.parquet')
WHERE date >= '2024-01-01'
LIMIT 1000;
S3 CSV Files
SELECT * FROM read_csv_auto('s3://my-bucket/exports/data.csv')
LIMIT 100;
S3 JSON Files
SELECT * FROM read_json_auto('s3://my-bucket/logs/*.json')
WHERE event_type = 'error'
ORDER BY timestamp DESC
LIMIT 50;
Querying raw S3 files (Parquet, CSV, JSON) doesn’t require a DuckDB database file to exist in S3; DuckDB reads them directly with its httpfs extension.
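If your raw files follow Hive-style partitioning (a common data-lake layout; the bucket, key layout, and column names below are placeholders), read_parquet can expose the partition columns and skip partitions your filter excludes:
SELECT region, COUNT(*) as events
FROM read_parquet('s3://my-bucket/events/year=*/month=*/*.parquet', hive_partitioning = true)
WHERE year = 2024 AND month = 6
GROUP BY region;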
Hybrid S3 + Local Queries
Combine S3 data with local processing:
Export S3 Data to Local Database
-- Create a local copy of S3 data
-- (in the in-memory catalog, since the S3 catalog is read-only)
CREATE TABLE memory.local_sales AS
SELECT * FROM s3db.main.sales
WHERE sale_date >= '2024-01-01';
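If you would rather land the extract as a file than as a table, DuckDB's COPY ... TO works the same way (the output path is a placeholder):
COPY (
  SELECT * FROM s3db.main.sales
  WHERE sale_date >= '2024-01-01'
) TO '/tmp/sales_2024.parquet' (FORMAT PARQUET);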
Join S3 and Local Data
-- Attach local database alongside S3
ATTACH '/local/path/customer_db.duckdb' AS local_db (READ_ONLY);
-- Join across S3 and local
SELECT
  s.order_id,
  s.amount,
  c.customer_name,
  c.segment
FROM s3db.main.sales s
JOIN local_db.main.customers c ON s.customer_id = c.id
WHERE s.sale_date >= CURRENT_DATE - INTERVAL 30 DAYS;
Aggregate S3 Data Locally
-- Process a large S3 dataset, store results locally
-- (in the in-memory catalog; the S3 catalog is read-only)
CREATE TABLE memory.daily_summary AS
SELECT
  DATE_TRUNC('day', sale_timestamp) as sale_date,
  region,
  COUNT(*) as order_count,
  SUM(amount) as total_revenue
FROM s3db.main.sales
GROUP BY sale_date, region
ORDER BY sale_date DESC;
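Later queries can then read the local summary instead of going back to S3:
SELECT * FROM memory.daily_summary
WHERE sale_date >= '2024-06-01'
ORDER BY sale_date DESC;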
Minimize Data Transfer
Use WHERE clauses: filter data in S3 before transferring to reduce network usage.
Aggregate in S3: use GROUP BY and aggregations to summarize before downloading.
Limit results: always use LIMIT for exploratory queries.
Column projection: select only the columns you need, not SELECT *.
Example: Optimized S3 Query
Bad (transfers all data):
SELECT * FROM s3db.main.large_table;
Good (filtered and limited; the columns shown are illustrative):
SELECT product_id, amount
FROM s3db.main.large_table
WHERE sale_date >= '2024-01-01'
LIMIT 1000;
Connection Configuration
The S3 connection is managed internally by the database client:
Create In-Memory Connection
When connecting to an S3 database, DuckDB creates an in-memory connection first.
Load httpfs Extension
The httpfs extension is installed and loaded for S3 support.
Configure Credentials
AWS credentials are configured using CREATE SECRET:
CREATE SECRET IF NOT EXISTS s3_secret (
  TYPE S3,
  KEY_ID 'your_access_key',
  SECRET 'your_secret_key',
  REGION 'us-east-1'
);
Attach S3 Database
The S3 database is attached as read-only:
ATTACH 's3://bucket/path/db.duckdb' AS s3db (READ_ONLY);
USE s3db;
Credential Chain Provider
When AWS_SESSION_TOKEN is present, DuckDB uses the credential_chain provider:
CREATE SECRET IF NOT EXISTS s3_secret (
  TYPE S3,
  PROVIDER credential_chain,
  REGION 'us-east-1'
);
This automatically fetches credentials from:
Environment variables
AWS configuration files
IAM instance profiles (EC2)
IAM roles (ECS, Lambda)
AWS SSO sessions
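To verify which secret the session actually created, you can query DuckDB's built-in secrets catalog:
SELECT name, type, provider FROM duckdb_secrets();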
Real-World Examples
Analytics on S3 Data Lake
{
  "mcpServers": {
    "S3 Data Lake": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path", "s3://company-datalake/analytics/production.duckdb",
        "--max-rows", "10000"
      ],
      "env": {
        "AWS_SESSION_TOKEN": "...",
        "AWS_DEFAULT_REGION": "us-west-2"
      }
    }
  }
}
Multi-Region S3 Databases
{
  "mcpServers": {
    "Multi-Region Analytics": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path", "s3://us-data/analytics.duckdb",
        "--allow-switch-databases"
      ],
      "env": {
        "AWS_ACCESS_KEY_ID": "...",
        "AWS_SECRET_ACCESS_KEY": "...",
        "AWS_DEFAULT_REGION": "us-east-1"
      }
    }
  }
}
Query multiple regions:
-- US data
SELECT 'US' as region, COUNT(*) FROM s3db.main.users;
-- Switch to the EU database:
-- use switch_database_connection with s3://eu-data/analytics.duckdb
-- EU data
SELECT 'EU' as region, COUNT(*) FROM s3db.main.users;
Troubleshooting
Authentication Errors
Error: “Failed to attach S3 database: Unable to connect to S3”
Solution: Verify AWS credentials are set correctly:
# Test credentials
aws s3 ls s3://your-bucket/
# Check environment
echo $AWS_ACCESS_KEY_ID
echo $AWS_DEFAULT_REGION
Database Not Found
Error: “database does not exist”
Solution:
Verify the S3 path is correct
Check bucket permissions allow s3:GetObject
Ensure the file exists:
aws s3 ls s3://bucket/path/database.duckdb
Permission Denied
Error: “Access Denied” or “403 Forbidden”
Solution: Ensure your IAM user or role has the required permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket/*",
        "arn:aws:s3:::your-bucket"
      ]
    }
  ]
}
Write Operation Blocked
Error: “Cannot execute statement of type INSERT/CREATE in read-only mode”
Solution: This is expected; S3 databases are read-only. To modify data (as sketched below):
Copy the data to a local database
Make modifications locally
Upload the modified database back to S3 (outside of MCP)
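A minimal sketch of that round trip in SQL, assuming a writable local path (paths, table names, and values are placeholders); the final upload back to S3 happens outside the MCP session, e.g. with the AWS CLI:
-- Attach a writable local database
ATTACH '/tmp/editable.duckdb' AS editable;
-- Copy the S3 table into it
CREATE TABLE editable.sales AS SELECT * FROM s3db.main.sales;
-- Make modifications locally
UPDATE editable.sales SET region = 'US' WHERE region = 'USA';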
Region Mismatch
Error: Slow queries or connection timeouts
Solution: Ensure AWS_DEFAULT_REGION matches your bucket’s region:
{
  "env": {
    "AWS_DEFAULT_REGION": "us-west-2"
  }
}