FExpr is a class that encapsulates a deferred computation to be performed on a frame. Rather than producing data immediately, an FExpr records what to compute. The computation is resolved when the expression is used inside DT[i, j, ...].
FExpr objects are rarely constructed directly. They are produced by accessing columns through the f and g namespace symbols, and by combining those with operators or functions.
The f and g symbols
f refers to columns of the current frame (the frame DT being indexed in DT[...]).
g refers to columns of the joined frame when performing a join operation.
Both are instances of dt.Namespace and support the same selection syntax.
Column selection
By name or index
By slice
Integer slices follow Python conventions: the stop value is not included. String slices are inclusive on both ends.
By type
Select all columns matching a given type.
Multiple selectors
Construction
FExpr(e)
Create a new FExpr from e. This constructor is rarely called directly.
e
None | bool | int | str | float | slice | list | tuple | dict | type | stype | ltype | Generator | FExpr | Frame | range | pd.DataFrame | pd.Series | np.array
The value to wrap as an FExpr.
alias(*names)
Assign new names to the columns produced by this FExpr.
New names for the columns.
Returns an FExpr.
extend(arg)
Append another FExpr’s columns to this one, combining two column sets into a single expression. Similar to cbind().
The expression to append.
Returns an FExpr.
remove(arg)
Remove columns from this FExpr. The argument must reference columns by position, name, or type (not computed columns).
Columns to remove. Must be “by-reference” selections.
Returns an FExpr.
Arithmetic operators
All binary operators work when either or both operands are FExprs.
| Operator | Expression | Description |
|---|---|---|
| + | x + y | Addition |
| - | x - y | Subtraction |
| * | x * y | Multiplication |
| / | x / y | True division |
| // | x // y | Integer (floor) division |
| % | x % y | Modulus (remainder after integer division) |
| ** | x ** y | Exponentiation |
| +x | +x | Unary plus |
| -x | -x | Unary negation |
Bitwise operators
| Operator | Expression | Description |
|---|---|---|
| & | x & y | Bitwise AND (also used as logical AND for boolean columns) |
| \| | x \| y | Bitwise OR (also used as logical OR for boolean columns) |
| ^ | x ^ y | Bitwise XOR |
| ~ | ~x | Bitwise NOT (also logical NOT for booleans) |
| << | x << y | Left shift |
| >> | x >> y | Right shift |
Comparison operators
| Operator | Expression | Description |
|---|---|---|
| == | x == y | Equal |
| != | x != y | Not equal |
| < | x < y | Less than |
| <= | x <= y | Less than or equal |
| > | x > y | Greater than |
| >= | x >= y | Greater than or equal |
Aggregation methods
These methods on FExpr are equivalent to the corresponding top-level dt.* functions and can be applied per-group with by().
| Method | Description |
|---|---|
| .sum() | Sum of values |
| .min() | Minimum value |
| .max() | Maximum value |
| .mean() | Mean value |
| .median() | Median value |
| .sd() | Standard deviation |
| .count() | Count of non-NA values |
| .countna() | Count of NA values |
| .nunique() | Number of unique values |
| .prod() | Product of values |
| .first() | First value |
| .last() | Last value |
Row-wise methods
These methods operate across columns row-by-row, returning a single column.
| Method | Description |
|---|---|
| .rowsum() | Row-wise sum across columns |
| .rowmin() | Row-wise minimum |
| .rowmax() | Row-wise maximum |
| .rowmean() | Row-wise mean |
| .rowsd() | Row-wise standard deviation |
| .rowcount() | Count of non-NA values per row |
| .rowfirst() | First non-NA value per row |
| .rowlast() | Last non-NA value per row |
| .rowargmin() | Column index of the minimum per row |
| .rowargmax() | Column index of the maximum per row |
| .rowall() | True if all values in the row are True |
| .rowany() | True if any value in the row is True |
Cumulative methods
| Method | Description |
|---|---|
| .cumsum() | Cumulative sum |
| .cumprod() | Cumulative product |
| .cummin() | Cumulative minimum |
| .cummax() | Cumulative maximum |
String methods
len()
Length of each string in a string column.
re_match(pattern)
Check whether each string in a string column matches the given regex pattern. Returns a boolean column.
A regular expression pattern.
FExpr[slice]
Apply a string slice to each element of a string column.
Python slice applied to every string value in the column.
Returns FExpr[str].
Type casting
as_type(new_type)
Cast the columns in this expression to a new type. Equivalent to dt.as_type(expr, new_type).
The g symbol for joins
g refers to columns of the joined (right) frame in a join operation, in the same way that f refers to the left frame.
g uses the same selector syntax as f. See the column selection section above.
Miscellaneous
fillna(value)
Replace NA values in the expression with value.
shift(n=1)
Shift the column values by n rows (forward for positive, backward for negative).