join() implements a left outer join: every row in the left frame is kept, and matching values from the right (keyed) frame are looked up and appended as additional columns.
How Joining Works
Key the right frame
Set one or more columns as the key on the frame you want to join from. The key values (for a composite key, each combination of values) must be unique, so the keyed frame acts like a lookup table.
Join into the left frame
Use DT[:, :, join(right_frame)] in the square-bracket selector. datatable matches rows in the left frame against the key in the right frame by column name.
Basic Join
The key column of the right frame (X1) must exist in the left frame under the same name:
All rows from df1 are preserved. The X3 column from df2 is appended, matched on X1.
Setting a Key
Assign a string (single key) or list of strings (composite key) to frame.key:
Referencing Joined Columns with g.
Use g.<column> inside j or update() to reference columns from the joined frame:
f.X2 refers to the left frame’s X2 column and g.X3 refers to the joined frame’s X3 column. The result is written back to X2 in place.
Selecting Specific Columns from the Join
You can pick exactly which columns to include from both frames by listing f. and g. expressions in j.
Filtering Using Joined Data
Apply a row filter in i that references columns from the joined frame.
Multi-Column Keys
For composite keys, set frame.key to a list of column names.
Notes and Limitations
- datatable currently supports left outer joins only. Every row in the left frame is kept. Rows in the left frame with no match in the right frame will have NA for the joined columns.
- The join key column(s) must be present in the left frame with identical names.
- The right frame must be explicitly keyed before joining; passing an unkeyed frame raises ValueError.
- g. is only valid inside expressions that also include join().