Vortex’s type system is built around a fixed set of primitive DType variants: booleans, integers, floats, strings, lists, structs, and so on. Extension DTypes let you layer a custom logical type on top of any physical storage dtype, without modifying the core type system. Built-in examples include vortex.uuid (backed by FixedSizeList(U8, 16)) and the datetime types (vortex.date, vortex.time, vortex.timestamp).
The extension DType model
An extension DType pairs a logical identity with a physical storage dtype. Concretely:
- The ExtId is a unique namespaced string (e.g., "vortex.uuid").
- The storage dtype is any valid DType (e.g., FixedSizeList(Primitive(U8), 16)).
- The metadata is an optional byte payload attached to every instance of the type (e.g., a UUID version constraint).
Extension-typed data is stored using the Extension array encoding, which wraps any storage array and annotates it with the extension dtype.
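The three components described above can be pictured as a plain data structure. The following is a minimal stand-alone sketch, not the vortex-array definitions: StorageDType, ExtensionDType, and uuid_like are hypothetical names introduced only for illustration, and the single-byte metadata payload is an assumption.

```rust
// Illustrative stand-in types only; these are NOT the vortex-array definitions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum StorageDType {
    U8,
    U64,
    FixedSizeList(Box<StorageDType>, usize),
}

// An extension dtype modelled as the triple described above.
#[derive(Debug, Clone, PartialEq)]
struct ExtensionDType {
    id: String,                // unique namespaced identifier
    storage: StorageDType,     // physical storage dtype
    metadata: Option<Vec<u8>>, // optional byte payload
}

// Builds a UUID-like extension dtype: 16 bytes of storage plus an
// optional version constraint kept as one metadata byte (assumed layout).
fn uuid_like(version: Option<u8>) -> ExtensionDType {
    ExtensionDType {
        id: "vortex.uuid".to_string(),
        storage: StorageDType::FixedSizeList(Box::new(StorageDType::U8), 16),
        metadata: version.map(|v| vec![v]),
    }
}
```

Keeping the logical identity separate from the storage dtype is what lets the same physical layout serve several logical types.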
The ExtVTable trait
To define a new extension type, implement the ExtVTable trait from vortex_array::dtype::extension.
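The real trait definition is not reproduced on this page, so the sketch below models its general shape with a stand-alone trait instead. ExtVTableSketch, SketchDType, MyTypeMetadata, and the scale field are hypothetical; only the method names validate_dtype, serialize_metadata, and deserialize_metadata come from this page, and their signatures here are assumptions. It shows a zero-sized vtable struct, a metadata struct, and a storage check that accepts only a single u64.

```rust
use std::fmt::Debug;

// Stand-in for a small slice of the DType enum (assumption).
#[derive(Debug, Clone, PartialEq)]
enum SketchDType {
    U64,
    Utf8,
}

// Hypothetical trait modelled on the method names mentioned on this page.
// It is NOT the ExtVTable trait from vortex-array.
trait ExtVTableSketch {
    type Metadata: Debug + Clone + PartialEq;

    fn id(&self) -> &'static str;
    fn validate_dtype(&self, metadata: &Self::Metadata, storage: &SketchDType) -> Result<(), String>;
    fn serialize_metadata(&self, metadata: &Self::Metadata) -> Vec<u8>;
    fn deserialize_metadata(&self, bytes: &[u8]) -> Result<Self::Metadata, String>;
}

// Zero-sized struct acting as the vtable.
#[derive(Debug, Clone, Copy, Default)]
struct MyType;

// Per-instance metadata (hypothetical field).
#[derive(Debug, Clone, PartialEq)]
struct MyTypeMetadata {
    scale: u8,
}

impl ExtVTableSketch for MyType {
    type Metadata = MyTypeMetadata;

    fn id(&self) -> &'static str {
        "myorg.mytype"
    }

    // Reject any storage dtype other than a single u64.
    fn validate_dtype(&self, _metadata: &MyTypeMetadata, storage: &SketchDType) -> Result<(), String> {
        match storage {
            SketchDType::U64 => Ok(()),
            other => Err(format!("myorg.mytype requires U64 storage, got {other:?}")),
        }
    }

    // One byte on the wire; changing this layout would break old data.
    fn serialize_metadata(&self, metadata: &MyTypeMetadata) -> Vec<u8> {
        vec![metadata.scale]
    }

    fn deserialize_metadata(&self, bytes: &[u8]) -> Result<MyTypeMetadata, String> {
        match bytes {
            [scale] => Ok(MyTypeMetadata { scale: *scale }),
            _ => Err(format!("expected 1 metadata byte, got {}", bytes.len())),
        }
    }
}
```

Consult vortex-array/src/dtype/extension/vtable.rs for the actual trait and its required methods before adapting this shape.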
Step-by-step: defining a custom extension type
Define your VTable struct and metadata type
Create a zero-sized struct for the vtable and a metadata struct for per-instance data.
If your type carries no metadata, use the provided EmptyMetadata helper from vortex_array::extension.
Choose a storage dtype
Pick any DType to back your type’s physical storage. The storage dtype is validated by validate_dtype whenever an ExtDType<MyType> is constructed. For example, MyType could be backed by a single u64.
Arrow interoperability
Vortex extension DTypes map to Arrow canonical extension types when the ExtId matches a registered Arrow extension name. The Arrow ExtensionType metadata round-trips through serialize_metadata / deserialize_metadata. Built-in types like vortex.uuid follow the Arrow canonical UUID extension specification exactly.
For custom types, ensure your ExtId string matches the Arrow extension name you register on the Arrow side if you need IPC interoperability.
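On the Arrow side, an extension type is carried in a field's custom metadata under the keys ARROW:extension:name and ARROW:extension:metadata. The sketch below uses only the standard library to show the name-matching requirement. The helper functions are hypothetical, and how Vortex itself populates these keys is not covered by this page.

```rust
use std::collections::HashMap;

// Field-metadata keys defined by the Arrow columnar format for extension types.
const ARROW_EXT_NAME_KEY: &str = "ARROW:extension:name";
const ARROW_EXT_METADATA_KEY: &str = "ARROW:extension:metadata";

// Hypothetical helper: builds the custom metadata an Arrow field would carry.
fn arrow_field_metadata(ext_id: &str, serialized_metadata: &str) -> HashMap<String, String> {
    HashMap::from([
        (ARROW_EXT_NAME_KEY.to_string(), ext_id.to_string()),
        (ARROW_EXT_METADATA_KEY.to_string(), serialized_metadata.to_string()),
    ])
}

// Hypothetical helper: the two sides only line up when the names are identical.
fn matches_extension(field_metadata: &HashMap<String, String>, ext_id: &str) -> bool {
    field_metadata.get(ARROW_EXT_NAME_KEY).map(String::as_str) == Some(ext_id)
}
```

A mismatch in the name string means the Arrow consumer sees only the storage type, so it is worth defining the identifier once and sharing it between both registrations.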
Serialization considerations
Extension type metadata is stored in the Vortex file format as raw bytes alongside the dtype declaration. When reading a file, Vortex looks up the extension type by its ExtId in the session registry and calls deserialize_metadata to reconstruct the typed ExtDType.
This means:
- Metadata must be stable: changing the byte format of serialize_metadata will break existing files.
- The ExtId must be globally unique: use a reverse-domain namespace (myorg.mytype) to avoid collisions with built-in or third-party types.
- Registration must happen before reading: unregistered extension types encountered in a file are deserialized as opaque ForeignExtDType values. Compute functions that depend on the typed vtable will not fire.
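The lookup-then-fallback behavior described above can be modelled with a small registry. This is a stand-alone sketch under stated assumptions: SketchRegistry, Resolved, and decode_scale are hypothetical names, and the real session registry in Vortex has its own API that is not shown here.

```rust
use std::collections::HashMap;

// A decoder turns raw metadata bytes into a typed value (a String here for brevity).
type Decoder = fn(&[u8]) -> Result<String, String>;

#[derive(Debug, PartialEq)]
enum Resolved {
    // The id was registered, so the metadata was decoded.
    Typed { id: String, metadata: String },
    // The id was unknown, so the raw bytes are kept as an opaque value.
    Foreign { id: String, raw_metadata: Vec<u8> },
}

// Hypothetical stand-in for a session registry keyed by extension id.
#[derive(Default)]
struct SketchRegistry {
    decoders: HashMap<String, Decoder>,
}

impl SketchRegistry {
    fn register(&mut self, id: &str, decoder: Decoder) {
        self.decoders.insert(id.to_string(), decoder);
    }

    // Look up the id; decode if registered, otherwise fall back to opaque.
    fn resolve(&self, id: &str, raw: &[u8]) -> Result<Resolved, String> {
        match self.decoders.get(id) {
            Some(decode) => Ok(Resolved::Typed {
                id: id.to_string(),
                metadata: decode(raw)?,
            }),
            None => Ok(Resolved::Foreign {
                id: id.to_string(),
                raw_metadata: raw.to_vec(),
            }),
        }
    }
}

// Example decoder for a one-byte metadata layout (assumed).
fn decode_scale(raw: &[u8]) -> Result<String, String> {
    match raw {
        [scale] => Ok(format!("scale={scale}")),
        _ => Err(format!("expected 1 metadata byte, got {}", raw.len())),
    }
}
```

Resolving the same id before and after calling register returns Foreign and then Typed, which is why registration has to happen before any file is opened.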
The UUID extension type as a reference
vortex-array/src/extension/uuid/ is the canonical example of a complete extension type implementation. It shows:
- Vtable struct (Uuid) with optional-version metadata (UuidMetadata)
- Serialization to/from a single byte
- validate_dtype enforcing FixedSizeList(U8, 16) storage
- unpack_native converting the 16 storage bytes into a uuid::Uuid
- Round-trip tests using rstest case parameterization
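To give a feel for what such an implementation involves, here is a stand-alone sketch of two of the pieces listed above. It does not reproduce the code in vortex-array/src/extension/uuid/. The byte encoding (0 meaning no version constraint) is an assumption, the function names are hypothetical, and the standard library is used in place of the uuid crate.

```rust
// Assumed encoding: 0 means "no version constraint"; 1..=8 is a required
// UUID version (the versions defined by RFC 9562).
fn serialize_version(version: Option<u8>) -> [u8; 1] {
    [version.unwrap_or(0)]
}

fn deserialize_version(bytes: &[u8]) -> Result<Option<u8>, String> {
    match bytes {
        [0] => Ok(None),
        [v @ 1..=8] => Ok(Some(*v)),
        [v] => Err(format!("unsupported UUID version byte {v}")),
        _ => Err(format!("expected exactly 1 byte, got {}", bytes.len())),
    }
}

// Formats the 16 storage bytes in the usual 8-4-4-4-12 hyphenated hex form.
fn format_uuid(bytes: &[u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{b:02x}")).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

// The version lives in the high four bits of byte 6.
fn uuid_version(bytes: &[u8; 16]) -> u8 {
    bytes[6] >> 4
}
```

A round-trip check over every supported metadata value, in the spirit of the parameterized tests mentioned above, guards against accidental changes to the byte format.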
Further reading
- vortex-array/src/dtype/extension/vtable.rs: full ExtVTable trait
- vortex-array/src/extension/uuid/: UUID reference implementation
- vortex-array/src/extension/datetime/: datetime extension types
- vortex-array/src/extension/mod.rs: EmptyMetadata helper
- GitHub Discussions