This guide will get you reading shared data in minutes using the Delta Sharing Python connector.
Prerequisites
Python 3.8+ (Python 3.6+ for delta-sharing versions < 1.1)
pip (latest version recommended)
Linux users: glibc version >= 2.31 for automatic installation
Step 1: Install Delta Sharing
Install via pip
Install the Delta Sharing Python connector: pip install delta-sharing
Troubleshooting Installation
If installation fails due to delta-kernel-rust-sharing-wrapper:
Check Python version: python --version (must be 3.8+)
Upgrade pip: pip install --upgrade pip
Check glibc version (Linux): ldd --version (must be 2.31+)
Install Rust if needed: Follow the Rust installation guide
Alternatively, install an older version without the Rust dependency: pip install delta-sharing==1.0.5
Verify Installation
Verify the installation:
import delta_sharing
print(delta_sharing.__version__)
Step 2: Get a Profile File
A profile file is a JSON file containing credentials to access a Delta Sharing Server.
Use Example Server
Download the open example profile to try Delta Sharing immediately: curl -O https://databricks-datasets-oregon.s3-us-west-2.amazonaws.com/delta-sharing/share/open-datasets.share
This profile provides access to public COVID-19 and other sample datasets.
From Your Provider
If your data provider has shared data with you, they will provide a .share profile file. Save it to your local filesystem or cloud storage.
Run Your Own Server
Set up the Delta Sharing Reference Server to share your own data. See the Delta Sharing Server documentation for setup instructions.
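Whichever route you take, the profile file itself follows the credential format defined by the Delta Sharing protocol. A minimal sketch of its shape (the endpoint and token below are placeholders, not real credentials):

```json
{
  "shareCredentialsVersion": 1,
  "endpoint": "https://sharing.example.com/delta-sharing/",
  "bearerToken": "<your-bearer-token>",
  "expirationTime": "2030-01-01T00:00:00.0Z"
}
```

expirationTime is optional; shareCredentialsVersion, endpoint, and bearerToken are the core fields the connector reads.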
Step 3: Load Data with Python
Now you can read shared tables using pandas or Apache Spark.
Pandas - Basic
import delta_sharing

# Point to the profile file
profile_file = "open-datasets.share"

# Create a SharingClient
client = delta_sharing.SharingClient(profile_file)

# List all shared tables
tables = client.list_all_tables()
print(tables)

# Create a table URL
# Format: <profile-path>#<share>.<schema>.<table>
table_url = profile_file + "#delta_sharing.default.owid-covid-data"

# Load first 10 rows as pandas DataFrame
df = delta_sharing.load_as_pandas(table_url, limit=10)
print(df)
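The table URL is simply the profile path and a dot-separated table coordinate joined by `#`. A small hypothetical helper (not part of the delta-sharing API) makes the format concrete:

```python
def split_table_url(table_url: str):
    """Split '<profile-path>#<share>.<schema>.<table>' into its parts.

    Illustrative helper only; not part of the delta-sharing package.
    """
    # The profile path may itself contain '#'-free URLs; split on the last '#'
    profile, fragment = table_url.rsplit("#", 1)
    # Share and schema names cannot contain '.', so two splits suffice
    share, schema, table = fragment.split(".", 2)
    return profile, share, schema, table

print(split_table_url("open-datasets.share#delta_sharing.default.owid-covid-data"))
# → ('open-datasets.share', 'delta_sharing', 'default', 'owid-covid-data')
```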
Exploring Available Data
Use the SharingClient to discover what data is available:
import delta_sharing

client = delta_sharing.SharingClient("open-datasets.share")

# List all shares
shares = client.list_shares()
for share in shares:
    print(f"Share: {share.name}")

# List schemas in a share (list_schemas takes a Share object)
schemas = client.list_schemas(shares[0])
for schema in schemas:
    print(f"Schema: {schema.name}")

# List tables in a schema (list_tables takes a Schema object)
tables = client.list_tables(schemas[0])
for table in tables:
    print(f"Table: {table.name}")

# List all tables across all schemas
all_tables = client.list_all_tables()
for table in all_tables:
    print(f"{table.share}.{table.schema}.{table.name}")
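For a large share, the flat list_all_tables() output is easier to scan when grouped by share and schema. A minimal sketch, using plain tuples in place of the real Table objects (which expose .share, .schema, and .name attributes); the table names here are examples:

```python
from collections import defaultdict

# Stand-in for client.list_all_tables(), flattened to (share, schema, name) tuples
all_tables = [
    ("delta_sharing", "default", "owid-covid-data"),
    ("delta_sharing", "default", "boston-housing"),
]

# Group table names under their (share, schema) pair
by_schema = defaultdict(list)
for share, schema, name in all_tables:
    by_schema[(share, schema)].append(name)

for (share, schema), names in sorted(by_schema.items()):
    print(f"{share}.{schema}: {', '.join(names)}")
```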
Advanced Features
Change Data Feed (CDF)
If the shared table has change data feed enabled (cdfEnabled=true), you can query table changes:
import delta_sharing

table_url = "profile.share#share.schema.table"

# Load changes between versions
changes = delta_sharing.load_table_changes_as_pandas(
    table_url,
    starting_version=0,
    ending_version=5,
)
print(changes)
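As a sketch of working with the result: the sample rows below are fabricated for illustration, but the _change_type and _commit_version metadata columns match what Delta's change data feed emits alongside the table's own columns.

```python
import pandas as pd

# Fabricated CDF-style result; updates appear as preimage/postimage row pairs
changes = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "_change_type": ["insert", "update_preimage", "update_postimage", "delete"],
    "_commit_version": [1, 2, 2, 3],
})

# Keep only rows as they look after each change (drop preimages and deletes)
current = changes[changes["_change_type"].isin(["insert", "update_postimage"])]
print(current["id"].tolist())  # → [1, 2]
```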
Streaming with Spark
Delta Sharing tables can be used as streaming sources:
val tablePath = "profile.share#share.schema.table"

val stream = spark.readStream
  .format("deltaSharing")
  .option("startingVersion", "1")
  .option("skipChangeCommits", "true")
  .load(tablePath)

stream.writeStream
  .format("console")
  .start()
  .awaitTermination()
Trigger.AvailableNow is not supported in Delta Sharing streaming as it requires Spark 3.3.0+, while Delta Sharing uses Spark 3.1.1.
Profile File Paths
Profile files can be stored in various locations:
Local File System
Cloud Storage (Python)
Cloud Storage (Spark)
Databricks
profile_file = "/path/to/profile.share"
The Python connector supports any URL via fsspec:

# S3
profile_file = "s3a://my-bucket/config/profile.share"

# Azure Blob Storage
profile_file = "abfs://container@account.dfs.core.windows.net/profile.share"

# Google Cloud Storage
profile_file = "gs://my-bucket/config/profile.share"
The Spark connector supports Hadoop FileSystem URLs:

# S3
profile_file = "s3a://my-bucket/config/profile.share"

# HDFS
profile_file = "hdfs://namenode:8020/user/config/profile.share"
On Databricks, use DBFS paths: profile_file = "/dbfs/mnt/config/profile.share"
Complete Example
Here’s a complete example analyzing COVID-19 data:
import delta_sharing
import pandas as pd

# Download and use the example profile
profile_file = "open-datasets.share"

# Create client
client = delta_sharing.SharingClient(profile_file)

# List available tables
print("Available tables:")
for table in client.list_all_tables():
    print(f"  - {table.share}.{table.schema}.{table.name}")

# Load COVID-19 data
table_url = profile_file + "#delta_sharing.default.owid-covid-data"
df = delta_sharing.load_as_pandas(table_url)
print(f"\nLoaded {len(df)} rows")
print(f"Columns: {', '.join(df.columns)}")

# Analyze USA data
usa = df[df["iso_code"] == "USA"].copy()
usa["date"] = pd.to_datetime(usa["date"])
usa = usa.sort_values("date")

print("\nUSA COVID-19 Statistics (Latest):")
latest = usa.iloc[-1]
print(f"  Date: {latest['date']}")
print(f"  Total Cases: {latest['total_cases']:,.0f}")
print(f"  Total Deaths: {latest['total_deaths']:,.0f}")
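From here, the same DataFrame lends itself to time-series analysis. A minimal sketch of a 7-day rolling average, using a fabricated slice in place of the real table (the date and new_cases column names do exist in the OWID COVID-19 schema):

```python
import pandas as pd

# Fabricated stand-in for the USA slice of owid-covid-data
usa = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=7),
    "new_cases": [100, 120, 90, 110, 130, 95, 105],
})

# Smooth daily noise with a 7-day rolling mean
usa["new_cases_7d_avg"] = usa["new_cases"].rolling(7).mean()
print(round(usa["new_cases_7d_avg"].iloc[-1], 2))  # → 107.14
```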
Next Steps
Now that you’ve successfully loaded shared data, explore more:
Python API Reference: Explore the full Python connector API
Spark Connector: Learn about the Apache Spark connector
Set Up a Server: Share your own Delta Lake tables
Protocol Details: Deep dive into the Delta Sharing Protocol