Vortex provides a Spark DataSource V2 connector for reading and writing Vortex files. The connector is published to Maven Central as dev.vortex:vortex-spark and is built against Spark 4.x with Scala 2.13.
Installation
The connector ships as a shadow JAR that relocates its Arrow, Guava, and Protobuf dependencies to avoid classpath conflicts with Spark.
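One way the connector might be attached to a PySpark session is through Spark's standard spark.jars.packages setting. The sketch below builds the Maven coordinate from the group and artifact named above. The version is a placeholder you must supply, and whether the shadow JAR needs a classifier is not stated here, so treat the coordinate format as an assumption.

```python
def vortex_spark_coordinate(version: str) -> str:
    """Return the Maven coordinate for the connector at the given version."""
    if not version:
        raise ValueError("a connector version is required")
    # Assumption: plain group:artifact:version, with no classifier.
    return f"dev.vortex:vortex-spark:{version}"


def build_session(version: str, app_name: str = "vortex-example"):
    """Create a SparkSession that resolves the connector from Maven Central."""
    # Imported lazily so the coordinate helper works without PySpark installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder.appName(app_name)
        .config("spark.jars.packages", vortex_spark_coordinate(version))
        .getOrCreate()
    )
```

Because the connector is built against Spark 4.x with Scala 2.13, the PySpark installation would need to match that Spark line.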
Reading Vortex files
Use the vortex format to read a single file or a directory of Vortex files. When given a directory, the connector discovers the .vortex files in it and creates one read partition per file. Column pruning is pushed down: only the columns referenced by the query are read from the file.
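A minimal PySpark sketch of a read, assuming a session that already has the connector on its classpath. The helper name and its parameters are illustrative, not part of the connector.

```python
def read_vortex(spark, path, columns=None):
    """Load Vortex data from a file or directory, optionally selecting columns."""
    df = spark.read.format("vortex").load(path)
    if columns:
        # Selecting columns up front means only those columns are referenced
        # by the query, which is what column pruning relies on.
        df = df.select(*columns)
    return df
```

Passing a directory path reads every Vortex file found there, and passing a single file path reads just that file.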
Reading from S3
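A hedged sketch of an S3 read in PySpark. The URI scheme (s3 here, though some Spark deployments use s3a) and the way credentials are supplied are assumptions, since this page does not state them. The helper names are hypothetical.

```python
def s3_uri(bucket: str, prefix: str, scheme: str = "s3") -> str:
    """Build an object-store URI; the default scheme is an assumption."""
    return f"{scheme}://{bucket}/{prefix.lstrip('/')}"


def read_vortex_from_s3(spark, bucket, prefix, scheme="s3"):
    """Read Vortex files under an S3 prefix with the vortex format."""
    # Credentials are expected to come from the environment or from the
    # Spark configuration; nothing is set here.
    return spark.read.format("vortex").load(s3_uri(bucket, prefix, scheme))
```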
Writing Vortex files
Each output file is named part-{partitionId}-{taskId}.vortex.
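A short PySpark sketch of a write, plus a helper for recognizing the output files by name. The helper names are illustrative, and the parser assumes both IDs are plain decimal integers, which this page does not confirm.

```python
import re

# Matches the part-{partitionId}-{taskId}.vortex naming pattern.
_PART_NAME = re.compile(r"part-(\d+)-(\d+)\.vortex")


def write_vortex(df, path):
    """Write a DataFrame to the given path using the vortex format."""
    df.write.format("vortex").save(path)


def parse_part_name(file_name: str):
    """Return (partition_id, task_id) for an output file name, else None."""
    match = _PART_NAME.fullmatch(file_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```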
Write options
| Option | Default | Description |
|---|---|---|
| vortex.write.batch.size | 2048 | Number of rows per batch (1–65536) |
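The option can be passed through the standard DataFrameWriter options call. The sketch below validates the value before handing it to Spark. It treats the documented range as inclusive, which is an assumption, and the helper names are illustrative.

```python
BATCH_SIZE_OPTION = "vortex.write.batch.size"
MIN_BATCH_SIZE, MAX_BATCH_SIZE = 1, 65536  # assumed inclusive bounds


def batch_size_option(rows: int = 2048) -> dict:
    """Return a writer-options dict after checking the documented range."""
    if isinstance(rows, bool) or not isinstance(rows, int):
        raise TypeError("batch size must be an integer")
    if not MIN_BATCH_SIZE <= rows <= MAX_BATCH_SIZE:
        raise ValueError(
            f"batch size must be in {MIN_BATCH_SIZE}..{MAX_BATCH_SIZE}, got {rows}"
        )
    # Spark passes data source options as strings.
    return {BATCH_SIZE_OPTION: str(rows)}


def write_with_batch_size(df, path, rows: int = 2048):
    """Write with the vortex format and an explicit batch size."""
    df.write.format("vortex").options(**batch_size_option(rows)).save(path)
```

Checking the range on the client side surfaces a bad value before any Spark job is started.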
Save modes
The connector supports all standard Spark save modes: Overwrite, Append, Ignore, and ErrorIfExists.
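In PySpark, each save mode is selected with a lowercase string passed to the writer's mode call. The sketch below maps the mode names listed above to those strings. The comments describe Spark's general semantics for each mode, and the helper name is illustrative.

```python
# Spark SaveMode name -> string accepted by DataFrameWriter.mode().
SAVE_MODES = {
    "Overwrite": "overwrite",          # replace existing data at the path
    "Append": "append",                # add to existing data
    "Ignore": "ignore",                # do nothing if data already exists
    "ErrorIfExists": "errorifexists",  # fail if data already exists (Spark default)
}


def write_vortex_with_mode(df, path, save_mode: str = "ErrorIfExists"):
    """Write with the vortex format using one of the four save modes."""
    if save_mode not in SAVE_MODES:
        raise ValueError(f"unknown save mode: {save_mode!r}")
    df.write.format("vortex").mode(SAVE_MODES[save_mode]).save(path)
```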
Supported types
| Spark type | Vortex type |
|---|---|
| BooleanType | Bool |
| ByteType | Int8 / UInt8 |
| ShortType | Int16 / UInt16 |
| IntegerType | Int32 / UInt32 |
| LongType | Int64 / UInt64 |
| FloatType | Float32 |
| DoubleType | Float64 |
| StringType | Utf8 |
| BinaryType | Binary |
| DecimalType | Decimal |
| DateType | Date (days) |
| TimestampType | Timestamp (microseconds, UTC) |
| TimestampNTZType | Timestamp (microseconds, no timezone) |
| ArrayType | List |
| StructType | Struct |
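As a rough pre-flight check, the table can be encoded as a lookup and compared against a DataFrame schema before writing. This sketch only inspects top-level fields by their Spark type class name and does not recurse into arrays or structs. The helper name is illustrative, and treating any type absent from the table as unsupported is an assumption.

```python
# Spark type class name -> Vortex type, copied from the table above.
SPARK_TO_VORTEX = {
    "BooleanType": "Bool",
    "ByteType": "Int8 / UInt8",
    "ShortType": "Int16 / UInt16",
    "IntegerType": "Int32 / UInt32",
    "LongType": "Int64 / UInt64",
    "FloatType": "Float32",
    "DoubleType": "Float64",
    "StringType": "Utf8",
    "BinaryType": "Binary",
    "DecimalType": "Decimal",
    "DateType": "Date (days)",
    "TimestampType": "Timestamp (microseconds, UTC)",
    "TimestampNTZType": "Timestamp (microseconds, no timezone)",
    "ArrayType": "List",
    "StructType": "Struct",
}


def unsupported_top_level_fields(schema):
    """Return (field name, Spark type name) pairs not listed in the table."""
    missing = []
    for field in schema.fields:
        type_name = type(field.dataType).__name__
        if type_name not in SPARK_TO_VORTEX:
            missing.append((field.name, type_name))
    return missing
```

With a real DataFrame this would be called as unsupported_top_level_fields(df.schema), and an empty list means every top-level column type appears in the table.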