Prerequisites
Install datatable before you begin, for example with pip install datatable.
Create a Frame
A Frame is the fundamental unit of analysis in datatable: a two-dimensional table of rows and columns, similar to a pandas DataFrame or SQL table.
You can create one from a Python dictionary, a numpy array, or a pandas DataFrame, optionally with explicit column types.
Load a CSV file with fread()
fread() reads CSV, text, Excel, and other formats. It automatically detects separators, headers, column types, and quoting rules. It also handles URLs, shell output, .zip archives, and glob patterns.
Select rows and columns
datatable uses DT[i, j] notation for all data access, the same indexing used in mathematics, C/C++, R, and numpy.
i is the row selector
j is the column selector
You can also update or delete subsets.
Filter rows with f-expressions
f is a “frame proxy”: a variable you import from datatable that lets you reference columns by name in expressions. Wherever it is used, f refers to the current Frame. When joining two frames, g refers to the joined (second) frame.
Group and aggregate
The by() modifier splits a Frame into groups before applying the column expression. This affects aggregation functions like sum(), mean(), min(), and sd().
You can combine by() with sort() to order results within each group.
Next steps
Core concepts: Frame
Understand the Frame object — its structure, types, and properties.
Core concepts: f-expressions
Learn the full power of f-expressions for filtering, transforming, and aggregating data.
Selecting and filtering
Deep dive into row and column selection with the DT[i, j] syntax.
Reading and writing data
Explore all input and output options including fread(), CSV, JAY, and more.