cudaz provides two complementary GPU buffer types.
CudaSlice(T) is a generic, comptime-typed buffer that gives you full type safety and lets the Zig compiler verify element types at build time. CudaSliceR is its runtime-typed counterpart that carries a DType tag instead of a comptime type parameter, making it useful for dynamic dispatch and generic pipelines where the element type is not known until runtime. Both types are thin wrappers around a CUdeviceptr and expose clone and free methods.
CudaSlice(T)
CudaSlice(T) is a comptime-generic struct parameterized by the element type T. It is the return type of Device.alloc, Device.allocZeros, Device.htodCopy, and Rng.genrandom.
Fields
| Field | Type | Description |
|---|---|---|
| device_ptr | cuda.CUdeviceptr | Opaque handle to the GPU memory region |
| len | usize | Number of elements (not bytes) in the buffer |
| device | Device | The Device on which this memory was allocated |
Comptime Constants
| Constant | Type | Description |
|---|---|---|
| element_type | type | Comptime constant equal to T, declared as pub const element_type: type = T and accessible as CudaSlice(T).element_type |
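The comptime constant can be illustrated with a small host-only sketch. `HostSlice` below is a hypothetical stand-in written for this page, not the cudaz type; it only reproduces the `pub const element_type: type = T` declaration and shows that the element type is resolved at build time.

```zig
const std = @import("std");

// Hypothetical host-only stand-in (not the cudaz implementation).
// It mirrors only the comptime constant and the len field.
fn HostSlice(comptime T: type) type {
    return struct {
        pub const element_type: type = T;
        len: usize,
    };
}

pub fn main() void {
    const F32Slice = HostSlice(f32);

    // Checked by the compiler; a mismatch would be a build error.
    comptime std.debug.assert(F32Slice.element_type == f32);

    // len counts elements, so the byte size is len * @sizeOf(element_type).
    const s = F32Slice{ .len = 8 };
    const bytes = s.len * @sizeOf(F32Slice.element_type);
    std.debug.print("{d} elements, {d} bytes\n", .{ s.len, bytes });
}
```

Because `element_type` is a declaration rather than a field, it costs no storage per slice and can be used anywhere a type is expected.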
clone
Copies the buffer device-to-device using cuMemcpyDtoD_v2, allocating a new CudaSlice(T) of the same length on the same device. The original slice is not freed. The caller owns the returned slice and must call free() on it.
Returns: CudaError.Error!CudaSlice(T)
free
Releases the GPU memory by calling cuMemFree_v2. Panics if the underlying CUDA call fails. Always call free (or pair it with defer) when you are done with a slice.
Example
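The ownership rules for clone and free can be sketched without a GPU. The `HostSlice` type below is a hypothetical host-memory stand-in: it uses a Zig allocator where cudaz uses cuMemcpyDtoD_v2 and cuMemFree_v2, and its `init` function is invented for the illustration. Only the pattern (clone returns an independent buffer, every buffer gets its own `defer free()`) is meant to carry over.

```zig
const std = @import("std");

// Hypothetical host-memory stand-in for a device buffer.
// It is not the cudaz API; it only models who owns what.
fn HostSlice(comptime T: type) type {
    return struct {
        const Self = @This();

        data: []T,
        allocator: std.mem.Allocator,

        // Assumed helper for this sketch: allocate and zero-fill.
        fn init(allocator: std.mem.Allocator, len: usize) !Self {
            const data = try allocator.alloc(T, len);
            @memset(data, 0);
            return Self{ .data = data, .allocator = allocator };
        }

        // Returns an independent copy; the original stays valid and owned by the caller.
        fn clone(self: Self) !Self {
            const copy = try self.allocator.dupe(T, self.data);
            return Self{ .data = copy, .allocator = self.allocator };
        }

        fn free(self: Self) void {
            self.allocator.free(self.data);
        }
    };
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    const original = try HostSlice(f32).init(allocator, 4);
    defer original.free();

    // The clone is a second allocation, so it needs its own free.
    const copy = try original.clone();
    defer copy.free();

    copy.data[0] = 1.5;
    std.debug.print("original[0]={d} copy[0]={d}\n", .{ original.data[0], copy.data[0] });
}
```

Placing `defer` directly after each successful allocation keeps the release next to the acquisition, so an early error return cannot leak either buffer.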
CudaSliceR
CudaSliceR is the runtime-typed GPU slice. Instead of a comptime T, it stores a DType value in its element_type field. This makes it suitable for scenarios where you do not know the element type at compile time, or where you want a single variable that can hold slices of different types.
Fields
| Field | Type | Description |
|---|---|---|
| device_ptr | cuda.CUdeviceptr | Opaque handle to the GPU memory region |
| len | usize | Number of elements (not bytes) in the buffer |
| device | Device | The Device on which this memory was allocated |
| element_type | DType | Runtime element type tag (.f16 or .f32) |
clone
Copies the buffer device-to-device using cuMemcpyDtoD_v2. The element_type field is preserved in the returned slice. The byte count is computed as element_type.size() * len.
Returns: CudaError.Error!CudaSliceR
free
Releases the GPU memory by calling cuMemFree_v2. Panics on error.
Example
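A runtime tag lets one variable or one array hold buffers of different element types, at the cost of checking the tag before treating the data as a concrete type. The following host-only sketch shows that check. `TaggedSlice`, `dtypeOf`, and `matches` are hypothetical names introduced here, and the local `DType` is a re-creation for the example rather than the cudaz declaration.

```zig
const std = @import("std");

// Local re-creation of the runtime tag for this sketch.
const DType = enum { f16, f32 };

// Hypothetical host-only record mirroring the len and element_type fields.
const TaggedSlice = struct {
    len: usize,
    element_type: DType,
};

// Hypothetical helper: maps a comptime type to its runtime tag.
fn dtypeOf(comptime T: type) DType {
    return switch (T) {
        f16 => .f16,
        f32 => .f32,
        else => @compileError("unsupported element type"),
    };
}

// Hypothetical check a caller could run before reading a buffer as T.
fn matches(comptime T: type, slice: TaggedSlice) bool {
    return slice.element_type == dtypeOf(T);
}

pub fn main() void {
    // One array holds slices with different element types.
    const slices = [_]TaggedSlice{
        .{ .len = 4, .element_type = .f16 },
        .{ .len = 4, .element_type = .f32 },
    };
    for (slices) |s| {
        std.debug.print("{s}: readable as f32 = {any}\n", .{ @tagName(s.element_type), matches(f32, s) });
    }
}
```

An unsupported type such as `u8` is rejected at build time by `dtypeOf`, while a mismatch between a supported type and the stored tag can only be detected at runtime.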
To bring data from a CudaSliceR back to the host, use Device.syncReclaimR. You must supply the concrete comptime type at the call site, and it must match slice.element_type.
DType
DType is an enum that represents a GPU element type at runtime. It is used by CudaSliceR, Device.allocR, and Device.allocZerosR.
Variants
| Variant | Backing value | Element size |
|---|---|---|
| DType.f16 | 0 | 2 bytes (@sizeOf(f16)) |
| DType.f32 | 1 | 4 bytes (@sizeOf(f32)) |
size
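Returns the size in bytes of a single element of the given DType.
Parameter: the DType value the method is called on, i.e. the runtime element type whose size to query.
Returns: usize (2 for .f16, 4 for .f32)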
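As a rough illustration of the variants table and of size, the enum can be re-created in a few lines. This is a sketch rather than the cudaz source: the `u8` backing type is an assumption (the page only gives the values 0 and 1), and `byteCount` is a hypothetical helper that applies the `element_type.size() * len` formula used by CudaSliceR.clone.

```zig
const std = @import("std");

// Sketch of the enum described above. The u8 backing type is an assumption.
const DType = enum(u8) {
    f16 = 0,
    f32 = 1,

    // Size in bytes of one element of this type.
    fn size(self: DType) usize {
        return switch (self) {
            .f16 => @sizeOf(f16),
            .f32 => @sizeOf(f32),
        };
    }
};

// Hypothetical helper: bytes needed for `len` elements of `dtype`.
fn byteCount(dtype: DType, len: usize) usize {
    return dtype.size() * len;
}

pub fn main() void {
    const tags = [_]DType{ .f16, .f32 };
    for (tags) |t| {
        std.debug.print("{s}: backing={d} size={d} bytes for 1024 elements={d}\n", .{
            @tagName(t),
            @intFromEnum(t),
            t.size(),
            byteCount(t, 1024),
        });
    }
}
```

Using an exhaustive `switch` means that adding a variant later produces a compile error in `size` until its element size is supplied.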