f-expressions

The datatable module exports a special symbol f that represents columns of the frame currently being operated on. You use f inside DT[i, j, by(), ...] calls to refer to columns by name, index, slice, or type — and to compose arithmetic or comparison expressions over them.

import datatable as dt
from datatable import f

The f symbol

By itself, f.price means a column named “price” in an unspecified frame. The expression becomes concrete when used inside a frame operation:

train_dt[f.price > 0, :]

Here f refers to train_dt, and the expression selects the rows where price is positive. Because f-expressions are frame-agnostic until evaluated, you can save them and reuse them across frames:

price_filter = f.price > 0

train_filtered = train_dt[price_filter, :]
test_filtered  = test_dt[price_filter, :]

Single-column selectors

Reference a column by attribute name, string key, or integer index:

f.price         # column named "price"
f["price"]      # same, using string key
f["Price ($)"]  # column names with spaces or special characters
f[3]            # column at index 3 (0-based)
f[-1]           # last column

Integer indices follow standard Python list semantics: negative indices count from the end, and out-of-range indices raise an error. The bracket form is also useful when the column name is computed at runtime:

# frame has columns "2017_01", "2017_02", ..., "2019_12"
cols = [f["%d_%02d" % (year, month)]
        for month in range(1, 13)
        for year in [2017, 2018, 2019]]
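
The order of the for clauses decides the order of the selected columns. The sketch below builds only the name strings (no frame is needed) to show what the comprehension above produces, and what a year-first variant would produce instead. The variable names are illustrative.

```python
# Same loop order as above: month is the outer loop, year the inner one.
month_major = ["%d_%02d" % (year, month)
               for month in range(1, 13)
               for year in [2017, 2018, 2019]]

# Swapping the clauses gives chronological order instead.
year_major = ["%d_%02d" % (year, month)
              for year in [2017, 2018, 2019]
              for month in range(1, 13)]

print(month_major[:4])  # ['2017_01', '2018_01', '2019_01', '2017_02']
print(year_major[:4])   # ['2017_01', '2017_02', '2017_03', '2017_04']
```

Both lists contain the same 36 names, so either form selects the same columns; only their order in the result differs.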

Multi-column selectors

When you pass a slice or a type to f[...], you get a columnset — a selection of zero or more columns:

f[:]          # all columns
f[::-1]       # all columns in reverse order
f[:5]         # first 5 columns
f[3:4]        # fourth column (slice, not a single-column selector)
f["B":"H"]    # columns from B to H, inclusive
f["C9":"C1"]  # columns C9, C8, ..., C1 (reversed name range)
f[:"C3"]      # all columns up to C3
f["C5":]      # all columns starting from C5
f[int]        # all integer columns
f[float]      # all float columns
f[dt.str32]   # all columns with stype str32
f[None]       # no columns (empty columnset)

A columnset can appear anywhere a sequence of columns is expected — in j, inside by() or sort(), or with functions like rowsum(), rowmean(), rowmin():

DT[:, dt.sum(f[:])]      # sum of every column (datatable's sum, not the Python builtin)
DT[:, f[:3] + f[-3:]]    # pairwise sum of first 3 and last 3 columns

f[9] raises an error if the frame has fewer than 10 columns. f[9:10] returns an empty columnset instead. This is consistent with Python’s slicing semantics.
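
The plain-Python behaviour that this mirrors can be checked with an ordinary list. This is only an analogy using a list of names, not a datatable call.

```python
# A stand-in for a frame with five columns.
columns = ["C0", "C1", "C2", "C3", "C4"]

# Indexing past the end raises an error...
try:
    columns[9]
    index_raised = False
except IndexError:
    index_raised = True

# ...while slicing past the end quietly yields an empty result.
empty_slice = columns[9:10]

print(index_raised)  # True
print(empty_slice)   # []
```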

Modifying columnsets

Use .extend() to add columns and .remove() to subtract them:

f[int].extend(f[float])          # all integer and float columns
f[:3].extend(f[-3:])             # first 3 and last 3 columns
f[:].remove(f[str])              # all columns except strings
f[:10].remove(f.A)               # first 10 columns without column "A"

# extend with a computed column
f[:].extend({"cost": f.price * f.quantity})

Removing a column that is not in the columnset is safe — missing columns are silently ignored. You cannot remove a transformed (computed) column.

Arithmetic and comparison expressions

f-expressions support standard arithmetic operators and comparisons. These compose into new expressions:

f.A + f.B          # sum of two columns
f.price * f.qty    # product
f.A - f.B
f.A / f.B

f.price > 0                          # boolean filter
f.score >= 0.5
(f.A > 10) & (f.B < 5)              # logical AND
(f.A > 10) | (f.B < 5)              # logical OR

Use these in the i row selector:

DT[f.price > 0, :]
DT[(f.score >= 0.5) & (f.label == "good"), :]

Use them in the j column selector to compute new columns:

DT[:, {"A": f.A, "B": f.B, "A+B": f.A + f.B, "A-B": f.A - f.B}]

Combine with aggregation functions for more complex selections:

from datatable import f, mean, sd

DT[(f.A > mean(f.B) + 2.5 * sd(f.B)) | (f.A < -mean(f.B) - sd(f.B)), :]

Normalize a column to [0, 1]:

from datatable import f, min, max

DT[:, (f.A - min(f.A)) / (max(f.A) - min(f.A))]

The g symbol

The module also exports g, a second frame proxy used when joining frames. In a DT[...] call that includes a join() clause, g refers to columns of the joined frame while f refers to the primary frame:

from datatable import f, g, join, sum

DT[:, sum(f.quantity * g.price), join(products)]

See the quick-start guide for full join examples.

DT.export_names()

The .export_names() helper returns a tuple of f-expressions, one per column, in column order. Assigning them to variables named after the columns lets you omit the f. prefix when writing complex expressions:

Id, Price, Quantity = DT.export_names()

DT[:, [Id, Price, Quantity, Price * Quantity]]

This is equivalent to:

DT[:, [f.Id, f.Price, f.Quantity, f.Price * f.Quantity]]
