Skip to main content

What is the Delta Sharing Reference Server?

The Delta Sharing Reference Server is a reference implementation of the Delta Sharing Protocol that enables you to share Delta Lake and Apache Parquet tables stored on cloud object storage (S3, Azure Blob Storage, ADLS Gen2, GCS, Cloudflare R2) with consumers. This lightweight server provides a REST API that clients can use to discover and access shared data without copying it. The server generates pre-signed URLs that allow clients to read data files directly from your cloud storage, ensuring efficient data transfer and minimal overhead.

When to Use the Reference Server

The reference server is ideal for:

Development & Testing

Testing Delta Sharing connectors and integrations during development

Small-Scale Sharing

Sharing datasets with a limited number of trusted recipients

Proof of Concepts

Quickly demonstrating Delta Sharing capabilities

Self-Hosted Solutions

Organizations that need full control over their sharing infrastructure
The reference server is not a production-ready, secure web server. It’s designed for development and small-scale use cases. For production deployments, we strongly recommend placing it behind a secure proxy (like NGINX) with proper authentication and authorization.

When to Use Managed Alternatives

For production workloads with enterprise requirements, consider managed Delta Sharing services:

Databricks Delta Sharing

Databricks provides a fully managed Delta Sharing service with:
  • Enterprise security: Built-in authentication, authorization, and audit logging
  • Scalability: Handles high-volume data sharing across thousands of recipients
  • Unity Catalog integration: Centralized governance and access control
  • No infrastructure management: Fully managed service with high availability
  • Fine-grained permissions: Row-level and column-level security
  • Monitoring and analytics: Track usage and access patterns

Other Vendors

Several vendors offer Delta Sharing capabilities. Check the community connectors page for a list of available providers.

Architecture Overview

The reference server operates as a stateless REST API server:
1

Client Request

A Delta Sharing client (Python, Spark, etc.) sends a request to the server for table metadata or data
2

Server Processing

The server authenticates the request, reads Delta Lake metadata from cloud storage, and applies any filters
3

Pre-signed URLs

The server generates pre-signed URLs for the data files and returns them to the client
4

Direct Download

The client uses the pre-signed URLs to download data files directly from cloud storage

Key Features

Share tables stored on AWS S3, Azure Blob Storage, Azure Data Lake Storage Gen2, Google Cloud Storage, and Cloudflare R2
Share both Delta Lake tables (with full history and metadata) and Apache Parquet tables
Enable recipients to query incremental changes to shared tables when CDF is enabled
Support for filtering data at the source to minimize data transfer
Simple token-based authentication for securing API access
Pre-built Docker images available for easy deployment

Limitations

Understanding the reference server’s limitations is crucial for determining if it’s the right choice for your use case.
  • Basic Authentication: Only supports bearer token authentication out of the box
  • No User Management: No built-in user management or fine-grained permissions
  • No Audit Logging: Limited logging capabilities for compliance requirements
  • Manual Configuration: Tables must be manually configured in YAML files
  • No High Availability: Single-server deployment with no clustering support
  • Security: Requires additional infrastructure (proxy, JWT auth) for production security

Next Steps

Install the Server

Download and set up the reference server

Configure the Server

Learn how to configure shares, schemas, and tables

Cloud Storage Setup

Configure authentication for your cloud storage provider

Secure Your Server

Set up authentication and authorization

Build docs developers (and LLMs) love