dt.Type: the current, unified type system (added in v1.0.0). Describes both the logical meaning and the storage format of data.
dt.stype: the legacy “storage type” enum. Deprecated since v1.0.0.
dt.ltype: the legacy “logical type” enum. Deprecated since v1.0.0.
dt.Type
Type describes the type of data stored in a single column. It encodes both the logical kind of data (integer, string, date, etc.) and its storage size in bits. Some types carry additional properties such as a parameterized element type (arrays, categoricals).
Type values
Primitive types
Type.void: The type of a column in which every value is NA. Occupies no storage space. Convertible to any other type.
Type.bool8: Boolean values, stored as 1 byte per element. NA values are supported. Treated as numeric (True = 1, False = 0).
Type.int8: 8-bit signed integer. Range: −127 to 127. Corresponds to int8_t in C.
Type.int16: 16-bit signed integer. Range: −32,767 to 32,767. Corresponds to int16_t in C.
Type.int32: 32-bit signed integer. Range: −2,147,483,647 to 2,147,483,647. Corresponds to int32_t in C. This is the default type when constructing a Frame from a Python list of integers.
Type.int64: 64-bit signed integer. Range: −9,223,372,036,854,775,807 to 9,223,372,036,854,775,807. Corresponds to int64_t in C.
Type.float32: 32-bit IEEE 754 floating-point number. Corresponds to float in C.
Type.float64: 64-bit IEEE 754 floating-point number. Corresponds to double in C. This is the default type when constructing a Frame from Python float values.
Type.str32: Variable-length UTF-8 strings, with 32-bit offsets. Can store up to 2 GB of character data per column. When the 2 GB limit is exceeded, the column is automatically promoted to str64. Corresponds to str in Python, pa.string() in Arrow, and dtype('object') in numpy/pandas.
Type.str64: Variable-length UTF-8 strings, with 64-bit offsets. No practical limit on column size.
Type.obj64: Stores arbitrary Python objects as 64-bit pointers. Use sparingly; most operations are not supported on object columns.
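The integer ranges listed above are symmetric and stop one value short of the full two's-complement range (for example, −127 rather than −128 for 8 bits). The sketch below reproduces those ranges in plain Python. The helper name `listed_int_range` is hypothetical, and the explanation that the lowest value is set aside (for instance as an NA marker) is an assumption, not something this page states.

```python
def listed_int_range(bits):
    """Return the (min, max) pair listed for a signed integer type of `bits` width."""
    full_min = -(2 ** (bits - 1))     # lowest two's-complement value, e.g. -128
    full_max = 2 ** (bits - 1) - 1    # highest two's-complement value, e.g. 127
    # Assumption: the lowest value is reserved (for example as an NA marker),
    # so the usable minimum sits one step above it.
    return full_min + 1, full_max


for bits in (8, 16, 32, 64):
    low, high = listed_int_range(bits)
    print(f"int{bits}: {low} to {high}")
```

The printed pairs should match the ranges given for int8, int16, int32, and int64.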
Temporal types
Type.date32: Calendar date without a time component, stored as a 32-bit signed integer counting days since 1970-01-01. Accommodates dates in the range ±5.8 million years using the proleptic Gregorian calendar. Added in v1.0.0. Corresponds to datetime.date in Python, pa.date32() in Arrow, and np.dtype('<M8[D]') in numpy.
Type.time64: A specific moment in time, stored as a 64-bit integer counting nanoseconds since 1970-01-01 00:00:00 UTC. Not leap-second aware; not timezone-aware in the current version. Added in v1.0.0. Corresponds to datetime.datetime in Python, pa.timestamp('ns') in Arrow, and dtype('datetime64[ns]') in numpy/pandas.
Compound types
Type.arr32(T): Array type with 32-bit offsets. Each element in the column is a list of values of type T. Added in v1.1.0.
Type.arr64(T): Array type with 64-bit offsets. Each element is a list of values of type T. Use for large arrays that exceed 32-bit indexing. Added in v1.1.0.
Type.cat8(T): Categorical type with 8-bit codes (up to 255 categories). Each element stores a value of type T. Added in v1.1.0.
Type.cat16(T): Categorical type with 16-bit codes (up to 65,535 categories). Added in v1.1.0.
Type.cat32(T): Categorical type with 32-bit codes. Added in v1.1.0.
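For the temporal types described earlier (date32 and time64), the stored integers can be reproduced with the standard library alone. This is a minimal sketch of the encoding as described on this page, not the library's implementation. The helper names `to_date32` and `to_time64` are hypothetical. Python's own `datetime` covers only years 1 to 9999, far less than the ±5.8 million years available to a 32-bit day counter.

```python
import datetime as dtm

EPOCH_DATE = dtm.date(1970, 1, 1)
EPOCH_UTC = dtm.datetime(1970, 1, 1, tzinfo=dtm.timezone.utc)


def to_date32(d):
    """Days since 1970-01-01 (negative for earlier dates)."""
    return (d - EPOCH_DATE).days


def to_time64(t):
    """Nanoseconds since 1970-01-01 00:00:00 UTC for a timezone-aware datetime."""
    delta = t - EPOCH_UTC
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * 1_000_000_000 + delta.microseconds * 1_000


# Span of a 32-bit day counter in average Gregorian years (365.2425 days each).
INT32_SPAN_YEARS = (2**31 - 1) / 365.2425

print(to_date32(dtm.date(2000, 1, 1)))                                # 10957
print(to_time64(dtm.datetime(2000, 1, 1, tzinfo=dtm.timezone.utc)))   # 946684800000000000
print(round(INT32_SPAN_YEARS / 1e6, 2))                               # 5.88
```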
Properties
Type.name: The canonical string name of this type.
Type.min: The smallest finite value representable by this type. Returns None for types without a defined minimum (e.g., strings, objects).
Type.max: The largest finite value representable by this type.
Type.is_boolean: True if this is a boolean type (bool8).
Type.is_integer: True if this is an integer type (int8, int16, int32, int64).
Type.is_float: True if this is a float type (float32, float64).
Type.is_numeric: True for boolean, integer, and float types (bool8, int8, int16, int32, int64, float32, float64).
Type.is_string: True for string types (str32, str64).
Type.is_temporal: True for temporal types (date32, time64).
Type.is_object: True for the object type (obj64).
Type.is_void: True if this is the void type.
Type.is_array: True for array types (arr32, arr64).
Type.is_categorical: True for categorical types (cat8, cat16, cat32).
Type.is_compound: True for compound types (arrays and categoricals).
Using types when creating frames
Pass types to the Frame constructor using the types or type parameters:
Type conversion with dt.as_type()
Cast columns to a different type using dt.as_type() or the equivalent .as_type() method on an f-expression:
dt.stype (deprecated)
stype is an enum of “storage types” describing the physical representation of column data. Most values correspond directly to C primitive types.
Values
| Value | Bits | Description | Range |
|---|---|---|---|
| stype.void | 0 | All-NA column | n/a |
| stype.bool8 | 8 | Boolean | False / True |
| stype.int8 | 8 | 8-bit signed integer | −127 to 127 |
| stype.int16 | 16 | 16-bit signed integer | −32,767 to 32,767 |
| stype.int32 | 32 | 32-bit signed integer | −2,147,483,647 to 2,147,483,647 |
| stype.int64 | 64 | 64-bit signed integer | −9.2×10¹⁸ to 9.2×10¹⁸ |
| stype.float32 | 32 | 32-bit IEEE float | ±3.4×10³⁸ |
| stype.float64 | 64 | 64-bit IEEE float | ±1.8×10³⁰⁸ |
| stype.str32 | var | Strings, 32-bit offsets (≤ 2 GB) | n/a |
| stype.str64 | var | Strings, 64-bit offsets | n/a |
| stype.obj64 | 64 | Arbitrary Python objects | n/a |
Properties
stype.ltype: The dt.ltype corresponding to this stype.
stype.ctype: The ctypes C-level type for elements in a column of this stype. For variable-width types (strings), this returns the type of the fixed-width offset component.
stype.dtype: The numpy.dtype corresponding to this stype. Requires numpy to be installed.
stype.struct: The struct module format string for this stype.
stype.min: The smallest finite value for numeric stypes. None for non-numeric types.
stype.max: The largest finite value for numeric stypes. None for non-numeric types.
Lookup
stype(x) looks up an stype by value, name, Python type, or numpy dtype:
Casting
Call an stype as a function to cast a column expression:
dt.ltype (deprecated)
ltype is an enum of “logical types” — the kind of data a column represents, independent of its physical storage format. Each ltype may correspond to multiple stypes.
Values
| Value | Description | Corresponding stypes |
|---|---|---|
| ltype.bool | Boolean | stype.bool8 |
| ltype.int | Integer | stype.int8, int16, int32, int64 |
| ltype.real | Floating-point | stype.float32, float64 |
| ltype.str | String | stype.str32, str64 |
| ltype.time | Date / time | stype.date32, time64 |
| ltype.obj | Python object | stype.obj64 |
Properties
ltype.stypes: The list of stypes that map to this ltype.
Lookup
ltype(x) resolves a value to its ltype: