Vortex performs all file I/O asynchronously. Rather than committing to a single async runtime, it defines its own runtime abstraction in vortex-io. This lets different integrations — DataFusion with Tokio, DuckDB with its thread-per-core model — use the threading strategy that best fits their host engine.
Runtime Abstraction
The central type is Handle: a cloneable weak reference to an active runtime. All async work in Vortex — I/O, compute, background tasks — is spawned through a handle. The handle is stored in the session via RuntimeSession and threaded through the API alongside other session state.
Internally, a Handle wraps a Weak<dyn Executor>. The Executor trait defines three spawn methods; every spawned future must be Send + 'static. Handle::find() will auto-detect a running Tokio context if the tokio feature flag is enabled — useful when Vortex is initialized inside an existing async application.
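The handle-over-weak-executor shape can be sketched as follows. This is a minimal illustration, not the vortex-io API: names like spawn_boxed and try_spawn are invented, and closures stand in for futures to keep the sketch dependency-free.

```rust
use std::sync::{Arc, Weak};

// Illustrative sketch only: closures stand in for futures, and the real
// Executor/Handle APIs in vortex-io have different signatures.
trait Executor: Send + Sync {
    fn spawn_boxed(&self, work: Box<dyn FnOnce() + Send>);
}

// A Handle is a cheap, cloneable *weak* reference: it never keeps the
// runtime alive, so dropping the runtime invalidates all handles.
#[derive(Clone)]
struct Handle {
    executor: Weak<dyn Executor>,
}

impl Handle {
    fn try_spawn(&self, work: Box<dyn FnOnce() + Send>) -> bool {
        match self.executor.upgrade() {
            Some(exec) => {
                exec.spawn_boxed(work);
                true
            }
            None => false, // runtime already shut down
        }
    }
}

// Trivial "executor" that runs work inline on the calling thread.
struct InlineExecutor;

impl Executor for InlineExecutor {
    fn spawn_boxed(&self, work: Box<dyn FnOnce() + Send>) {
        work();
    }
}

fn main() {
    let runtime: Arc<dyn Executor> = Arc::new(InlineExecutor);
    let handle = Handle { executor: Arc::downgrade(&runtime) };
    assert!(handle.try_spawn(Box::new(|| println!("task ran"))));

    drop(runtime); // shutting the runtime down...
    assert!(!handle.try_spawn(Box::new(|| ()))); // ...invalidates every handle
}
```

The weak reference is the key design point: handles can be freely cloned into long-lived structures without preventing runtime shutdown.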
Callers spawn work through the Handle methods, each of which returns a Task<T>. Task<T> is a Future that resolves to the task’s output. Dropping a Task cancels it where possible; call .detach() to let it run in the background.
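The drop-to-cancel and detach semantics can be modelled with a toy guard. This is illustrative only; the real Task<T> is a Future and carries the task's output.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Illustrative sketch of drop-to-cancel semantics; the real Task<T> is a
// Future resolving to the task's output, not this toy guard.
struct Task {
    cancel_flag: Arc<AtomicBool>,
    detached: bool,
}

impl Task {
    // Consume the task without cancelling: it keeps running in the background.
    fn detach(mut self) {
        self.detached = true;
    }
}

impl Drop for Task {
    fn drop(&mut self) {
        if !self.detached {
            // Dropping an attached task requests cancellation.
            self.cancel_flag.store(true, Ordering::Release);
        }
    }
}

fn main() {
    let cancelled = Arc::new(AtomicBool::new(false));
    let task = Task { cancel_flag: cancelled.clone(), detached: false };
    drop(task);
    assert!(cancelled.load(Ordering::Acquire)); // dropped => cancelled

    let cancelled = Arc::new(AtomicBool::new(false));
    let task = Task { cancel_flag: cancelled.clone(), detached: false };
    task.detach();
    assert!(!cancelled.load(Ordering::Acquire)); // detached => still running
}
```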
Tokio Integration
The TokioRuntime adapter wraps a tokio::runtime::Handle and delegates all spawning to Tokio’s thread pool. It is the right choice for applications that already run inside a Tokio context, such as DataFusion or Axum servers.
When with_tokio() is called, the adapter captures the current Tokio runtime handle. If no Tokio context is active at that point, it panics. For applications that do not already use Tokio, the CurrentThreadRuntime described below is preferred.
CurrentThreadRuntime (smol)
The CurrentThreadRuntime (CRT) is built on smol and provides a more flexible threading model. Unlike Tokio, the CRT does not spawn background threads by default — it relies on the calling thread to drive the executor by calling block_on.
This design fits thread-per-core engines like DuckDB: when DuckDB calls into a Vortex scan on one of its worker threads, that thread blocks on a future and drives the entire smol executor for the duration of the call. No separate I/O thread pool is required, and the engine retains full control over its threading model.
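Stripped to its core, "the calling thread drives the executor" is a block_on loop: poll the future on the current thread, park until woken, poll again. Below is a std-only sketch of that loop; the real CRT runs smol's executor inside it via block_on(executor.run(...)).

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::{Arc, Condvar, Mutex};
use std::task::{Context, Poll, Wake, Waker};

// The calling thread polls the future itself and parks until a wakeup,
// instead of handing the future to a background thread pool.
struct Parker {
    woken: Mutex<bool>,
    cv: Condvar,
}

impl Wake for Parker {
    fn wake(self: Arc<Self>) {
        *self.woken.lock().unwrap() = true;
        self.cv.notify_one();
    }
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let parker = Arc::new(Parker { woken: Mutex::new(false), cv: Condvar::new() });
    let waker = Waker::from(parker.clone());
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
        // Park the calling thread until some completion calls wake().
        let mut woken = parker.woken.lock().unwrap();
        while !*woken {
            woken = parker.cv.wait(woken).unwrap();
        }
        *woken = false;
    }
}

fn main() {
    // A trivially ready future for illustration; real scans await I/O here.
    let answer = block_on(async { 21 * 2 });
    assert_eq!(answer, 42);
}
```

Because the worker thread owns the whole loop, the host engine decides exactly which threads run Vortex work and when.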
Worker Pool
For workloads that need background I/O progress while the calling thread is busy, the CRT can be paired with a CurrentThreadWorkerPool. Each worker thread drives the executor by calling block_on(executor.run(...)) in a loop. Workers can be scaled up and down dynamically at runtime; when the count is reduced, excess workers are signalled to shut down gracefully.
The CRT has a known pitfall: a thread busy evaluating a CPU-bound kernel is not polling the executor, which can stall in-flight I/O requests. Spawning a worker pool mitigates this, but adds threads and coordination overhead. This is an active area of design work.
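The worker-pool mechanism can be sketched with plain threads and a channel. All names here (WorkerPool, Msg) are invented, and closures stand in for executor work; in the real pool each worker runs block_on(executor.run(...)) rather than executing jobs directly.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Illustrative sketch: a shared queue of jobs plus a per-worker shutdown
// signal, mirroring the "signal excess workers to exit" behaviour.
enum Msg {
    Work(Box<dyn FnOnce() + Send>),
    Shutdown,
}

struct WorkerPool {
    tx: mpsc::Sender<Msg>,
    workers: Vec<thread::JoinHandle<()>>,
}

impl WorkerPool {
    fn new(count: usize) -> Self {
        let (tx, rx) = mpsc::channel::<Msg>();
        let rx = Arc::new(Mutex::new(rx));
        let workers = (0..count)
            .map(|_| {
                let rx = Arc::clone(&rx);
                thread::spawn(move || loop {
                    let msg = rx.lock().unwrap().recv();
                    match msg {
                        Ok(Msg::Work(job)) => job(),
                        // Each Shutdown message retires exactly one worker.
                        Ok(Msg::Shutdown) | Err(_) => break,
                    }
                })
            })
            .collect();
        WorkerPool { tx, workers }
    }

    fn shutdown(self) {
        for _ in &self.workers {
            let _ = self.tx.send(Msg::Shutdown);
        }
        for worker in self.workers {
            let _ = worker.join();
        }
    }
}

fn main() {
    let pool = WorkerPool::new(2);
    let (done_tx, done_rx) = mpsc::channel();
    for i in 0..4 {
        let done_tx = done_tx.clone();
        pool.tx
            .send(Msg::Work(Box::new(move || done_tx.send(i).unwrap())))
            .unwrap();
    }
    let mut results: Vec<i32> = (0..4).map(|_| done_rx.recv().unwrap()).collect();
    results.sort();
    assert_eq!(results, vec![0, 1, 2, 3]);
    pool.shutdown();
}
```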
VortexReadAt: Unified I/O Interface
The VortexReadAt trait is the unified interface for positional reads across all storage backends:
- ByteBuffer — in-memory reads, no I/O.
- std_file / FileReadAdapter — local disk reads dispatched via spawn_blocking to avoid blocking the async executor.
- ObjectStoreSource — object storage reads via the object_store crate (S3, GCS, Azure Blob, and more). Natively async, wrapped with async_compat for runtime compatibility.
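A hypothetical sketch of what such a positional-read trait can look like. The trait name, method name, and buffer types below are assumptions; the actual definitions in vortex-io differ.

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hypothetical shape: async positional reads over any backend.
trait ReadAt {
    async fn read_byte_range(&self, offset: u64, len: u64) -> std::io::Result<Vec<u8>>;
}

// In-memory backend: a positional read is just a slice copy, no real I/O.
struct InMemory(Vec<u8>);

impl ReadAt for InMemory {
    async fn read_byte_range(&self, offset: u64, len: u64) -> std::io::Result<Vec<u8>> {
        let start = offset as usize;
        let end = start + len as usize;
        self.0
            .get(start..end)
            .map(<[u8]>::to_vec)
            .ok_or_else(|| {
                std::io::Error::new(std::io::ErrorKind::UnexpectedEof, "read past end")
            })
    }
}

// The in-memory future completes on its first poll, so a no-op waker is
// enough to extract the result here without a full executor.
fn noop_waker() -> Waker {
    fn clone_raw(_: *const ()) -> RawWaker {
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone_raw, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let source = InMemory((0u8..16).collect());
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let Poll::Ready(result) = pin!(source.read_byte_range(4, 3)).poll(&mut cx) else {
        unreachable!("in-memory read is immediately ready");
    };
    assert_eq!(result.unwrap(), vec![4, 5, 6]);
}
```

A single async trait lets scan code stay backend-agnostic: the same read path serves memory, disk, and object storage.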
Read Coalescing
When reading columnar segments, many small reads target nearby offsets. The I/O system merges them into fewer, larger reads using CoalesceConfig:
| Backend | Concurrency | Coalesce Distance | Coalesce Max Size |
|---|---|---|---|
| In-memory | 16 | 8 KB | 8 KB |
| Local file | 32 | 1 MB | 4 MB |
| Object store | 192 | 1 MB | 16 MB |
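The merging policy can be sketched as a single pass over sorted ranges: fold a read into its predecessor when the gap between them is within the coalesce distance and the merged read stays under the max size. This illustrates the policy the table describes; it is not the actual CoalesceConfig implementation.

```rust
// Illustrative range coalescing under assumed distance/max-size knobs.
#[derive(Debug, PartialEq, Clone, Copy)]
struct ByteRange {
    start: u64,
    end: u64,
}

fn coalesce(mut ranges: Vec<ByteRange>, distance: u64, max_size: u64) -> Vec<ByteRange> {
    ranges.sort_by_key(|r| r.start);
    let mut out: Vec<ByteRange> = Vec::new();
    for r in ranges {
        match out.last_mut() {
            // Merge when the gap is within `distance` and the combined
            // read would not exceed `max_size`.
            Some(prev)
                if r.start.saturating_sub(prev.end) <= distance
                    && r.end.max(prev.end) - prev.start <= max_size =>
            {
                prev.end = prev.end.max(r.end);
            }
            _ => out.push(r),
        }
    }
    out
}

fn main() {
    let reads = vec![
        ByteRange { start: 0, end: 100 },
        ByteRange { start: 150, end: 300 },       // 50-byte gap: merged
        ByteRange { start: 10_000, end: 10_100 }, // far away: separate read
    ];
    let merged = coalesce(reads, 1024, 1 << 20);
    assert_eq!(
        merged,
        vec![
            ByteRange { start: 0, end: 300 },
            ByteRange { start: 10_000, end: 10_100 },
        ]
    );
}
```

The table's per-backend tuning falls out naturally: object storage tolerates huge merged reads because per-request latency dominates, while in-memory reads gain nothing from merging.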
Object Storage Support
The object_store feature gate enables the ObjectStoreSource adapter, which wraps any object_store::ObjectStore implementation.
WASM Support
Onwasm32-unknown-unknown targets, the standard file and thread-blocking APIs are unavailable. The wasm runtime module provides a browser-compatible executor that drives futures using JavaScript’s microtask queue. The std_file module is excluded on WASM targets via #[cfg(not(target_arch = "wasm32"))].
Instrumented Reads
InstrumentedReadAt<T> wraps any VortexReadAt implementation with metrics collection. It records read sizes (as a histogram), total bytes read (as a counter), and read durations (as a timer), and logs a summary when the wrapper is dropped. The metrics are emitted through the vortex-metrics crate and can be wired into any compatible metrics backend.
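The decorator pattern behind the wrapper can be sketched with plain atomic counters. This is illustrative only: the real InstrumentedReadAt<T> records histograms and timers through vortex-metrics, and a closure stands in for the wrapped reader.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Illustrative decorator: wrap a reader, count reads and bytes, and log a
// summary on drop, mirroring the behaviour described above.
struct Instrumented<R> {
    inner: R,
    read_count: AtomicU64,
    bytes_read: AtomicU64,
}

impl<R> Instrumented<R>
where
    R: Fn(u64, u64) -> Vec<u8>, // closure standing in for a reader
{
    fn read(&self, offset: u64, len: u64) -> Vec<u8> {
        let buf = (self.inner)(offset, len);
        self.read_count.fetch_add(1, Ordering::Relaxed);
        self.bytes_read.fetch_add(buf.len() as u64, Ordering::Relaxed);
        buf
    }
}

impl<R> Drop for Instrumented<R> {
    fn drop(&mut self) {
        // Like the real wrapper, emit a summary when dropped.
        println!(
            "reads={} bytes={}",
            self.read_count.get_mut(),
            self.bytes_read.get_mut()
        );
    }
}

fn main() {
    let data = vec![7u8; 256];
    let reader = Instrumented {
        inner: |offset: u64, len: u64| data[offset as usize..(offset + len) as usize].to_vec(),
        read_count: AtomicU64::new(0),
        bytes_read: AtomicU64::new(0),
    };
    reader.read(0, 64);
    reader.read(64, 128);
    assert_eq!(reader.read_count.load(Ordering::Relaxed), 2);
    assert_eq!(reader.bytes_read.load(Ordering::Relaxed), 192);
}
```

Because the wrapper implements the same read interface it wraps, it can be layered onto any backend without the caller noticing.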