The Selection class represents a subset of atoms in a molecular system, created through the selection language or explicit indices.

Overview

Selections are created via System.select() or System.select_indices() and are used throughout warp-md to specify which atoms to include in analyses.
from warp_md import System

system = System.from_pdb("protein.pdb")

# From selection expression
backbone = system.select("backbone")
ca_atoms = system.select("name CA")

# From explicit indices
custom = system.select_indices([0, 5, 10, 15])

Properties

indices

selection.indices -> list[int]
Get the list of atom indices in the selection.
indices (list[int]): Zero-based atom indices that match the selection criteria.
backbone = system.select("backbone")
print(f"Selected {len(backbone.indices)} atoms")
print(f"First 5 indices: {backbone.indices[:5]}")

# Use in analysis
from warp_md import RgPlan
plan = RgPlan(backbone, mass_weighted=True)
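
Because the indices are zero-based, they can be used directly to index any per-atom sequence. A minimal sketch, with made-up atom names standing in for real system data (the list and the index values are hypothetical, not warp-md output):

```python
# Hypothetical per-atom names for a two-residue fragment (not read from warp-md).
atom_names = ["N", "CA", "C", "O", "CB", "N", "CA", "C", "O"]

# Stand-in for backbone.indices: zero-based positions of the backbone atoms.
indices = [0, 1, 2, 3, 5, 6, 7, 8]

# Index any per-atom sequence with the selection's indices.
selected_names = [atom_names[i] for i in indices]
print(selected_names)
```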

Selection Language

The selection language picks atoms by atom name, residue name, residue ID, or chain, and combines these keywords with boolean operators and parentheses.

Keywords

name

Select atoms by atom name.
# Single atom name
ca = system.select("name CA")
nitrogen = system.select("name N")

# Wildcards supported in atom names
carbon = system.select("name C*")
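
The exact wildcard rules are not spelled out here. Assuming glob-style matching (an assumption, not confirmed warp-md behavior), the effect of "name C*" can be sketched with Python's standard fnmatch module. The helper match_names and the atom names are hypothetical:

```python
from fnmatch import fnmatchcase

def match_names(names, pattern):
    """Return zero-based indices of names matching a glob-style pattern."""
    return [i for i, name in enumerate(names) if fnmatchcase(name, pattern)]

# Hypothetical atom names, for illustration only.
atom_names = ["N", "CA", "C", "O", "CB", "CG", "OXT"]

# "C*" matches every name that starts with C, including plain "C".
carbon_like = match_names(atom_names, "C*")
print(carbon_like)
```

Under these semantics a pattern such as C* would also match names like CL, so combining it with "protein" or a residue filter is a reasonable precaution.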

resname

Select atoms by residue name.
# Water molecules (the residue name depends on the topology:
# SOL is the GROMACS convention, TIP3 the CHARMM one)
water = system.select("resname SOL")
water_tip3p = system.select("resname TIP3")

# Specific amino acid
alanine = system.select("resname ALA")

# Ions
sodium = system.select("resname NA")

resid

Select atoms by residue ID or range.
# Single residue
res_42 = system.select("resid 42")

# Range of residues
active_site = system.select("resid 100:120")
binding_pocket = system.select("resid 50:75")

# Reverse ranges work too
same_range = system.select("resid 120:100")
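
One way to picture the range handling is to sort the two bounds before comparing. The sketch below assumes both endpoints are inclusive, which is not stated above. resid_range and select_resid are hypothetical helpers, not part of warp-md:

```python
def resid_range(spec):
    """Parse 'a' or 'a:b' into an ordered (low, high) pair."""
    parts = spec.split(":")
    if len(parts) == 1:
        low = high = int(parts[0])
    elif len(parts) == 2:
        # Sorting makes "120:100" equivalent to "100:120".
        low, high = sorted(int(p) for p in parts)
    else:
        raise ValueError(f"invalid resid range: {spec!r}")
    return low, high

def select_resid(resids, spec):
    """Return zero-based indices whose residue ID falls in the range (inclusive, by assumption)."""
    low, high = resid_range(spec)
    return [i for i, resid in enumerate(resids) if low <= resid <= high]

# Hypothetical per-atom residue IDs.
resids = [99, 100, 110, 120, 121]
print(select_resid(resids, "100:120"))
print(select_resid(resids, "120:100"))
```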

chain

Select atoms by chain identifier.
# Single chain
chain_a = system.select("chain A")
chain_b = system.select("chain B")

# Combine with other selectors
ca_chain_a = system.select("name CA and chain A")

protein

Select all protein atoms (standard amino acid residues).
# All protein atoms
protein = system.select("protein")

# Protein in specific chain
protein_a = system.select("protein and chain A")

# Everything except protein
not_protein = system.select("not protein")
Recognized protein residues: ALA, ARG, ASN, ASP, CYS, GLN, GLU, GLY, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR, VAL, MSE, HSD, HSE, HSP
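
If the list above is exhaustive, the protein keyword amounts to a set-membership test on the residue name. A minimal sketch of that idea (is_protein is a hypothetical helper, and exact-case matching is an assumption):

```python
# Residue names copied from the list above.
PROTEIN_RESNAMES = frozenset({
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
    "MSE", "HSD", "HSE", "HSP",
})

def is_protein(resname):
    """True if the residue name is in the recognized protein set."""
    return resname in PROTEIN_RESNAMES

print(is_protein("ALA"))
print(is_protein("HIE"))
```

Names outside the list, such as the Amber histidine variant HIE, would not match in this sketch. For such topologies an explicit "resname" clause may be needed alongside "protein".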

backbone

Select protein backbone atoms (N, CA, C, O, OXT).
# All backbone atoms
backbone = system.select("backbone")

# Backbone in specific region
backbone_100_120 = system.select("backbone and resid 100:120")

# Backbone in chain A
backbone_a = system.select("backbone and chain A")

Operators

and

Logical AND - atoms must match both conditions.
# CA atoms in chain A
ca_chain_a = system.select("name CA and chain A")

# Protein backbone atoms in specific residues
backbone_region = system.select("backbone and resid 50:100")

# Multiple conditions
specific = system.select("name CA and chain A and resid 1:50")

or

Logical OR - atoms match if they satisfy either condition.
# CA or N atoms
ca_or_n = system.select("name CA or name N")

# Two different residue types
ala_or_gly = system.select("resname ALA or resname GLY")

# Multiple chains
chains_ab = system.select("chain A or chain B")

not

Logical NOT - inverts the selection.
# Everything except water
no_water = system.select("not resname SOL")

# Non-backbone protein atoms (sidechains)
sidechains = system.select("protein and not backbone")

# Complex negation
not_chain_a_ca = system.select("not (chain A and name CA)")
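
The three operators correspond to set intersection, union, and complement over atom indices. The sketch below uses small made-up index sets to show the correspondence. It illustrates the semantics only, not how warp-md evaluates expressions internally:

```python
# Hypothetical system of 8 atoms and two hand-picked index sets.
all_atoms = set(range(8))
chain_a = {0, 1, 2, 3}
name_ca = {1, 5}

ca_in_a = chain_a & name_ca        # "chain A and name CA"
ca_or_a = chain_a | name_ca        # "chain A or name CA"
not_ca_in_a = all_atoms - ca_in_a  # "not (chain A and name CA)"

# De Morgan: not (A and B) equals (not A) or (not B).
de_morgan = (all_atoms - chain_a) | (all_atoms - name_ca)

print(sorted(ca_in_a), sorted(ca_or_a), sorted(not_ca_in_a))
```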

( )

Parentheses for grouping and controlling operator precedence.
# Group conditions
grouped = system.select("(name CA or name N) and chain A")

# Precedence control
selection = system.select("(protein or resname SOL) and not chain B")

# Nested grouping
nested = system.select("((resid 1:50 or resid 100:150) and backbone) or name CA")
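
The default precedence of and, or, and not is not stated here. Assuming the conventional ordering also used by Python (not binds tightest, then and, then or), the following sketch lists the cases where dropping the parentheses from "(a or b) and c" changes the result. Read a as protein, b as resname SOL, and c as not chain B:

```python
from itertools import product

def grouped(a, b, c):
    # Explicit grouping, as in "(protein or resname SOL) and not chain B".
    return (a or b) and c

def ungrouped(a, b, c):
    # Python parses this as a or (b and c).
    return a or b and c

# Truth assignments where the two forms disagree.
differs = [
    combo
    for combo in product([False, True], repeat=3)
    if grouped(*combo) != ungrouped(*combo)
]
print(differs)
```

Both disagreeing cases have a true and c false: a protein atom in chain B is excluded by the grouped form but kept by the ungrouped one. Explicit parentheses avoid depending on the parser's default.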

Complex Selection Examples

# Protein backbone excluding chain B
backbone_no_b = system.select("backbone and not chain B")

# All CA atoms in proteins
ca_protein = system.select("protein and name CA")

# Sidechains in active site
sidechain_active = system.select(
    "protein and not backbone and resid 100:120"
)

Usage in Analysis Plans

Selections are passed to analysis plans to specify which atoms to analyze.
from warp_md import System, Trajectory, RmsdPlan

system = System.from_pdb("protein.pdb")
traj = Trajectory.open_xtc("trajectory.xtc", system)

# Backbone RMSD
backbone = system.select("backbone")
plan = RmsdPlan(backbone, reference="topology", align=True)
rmsd = plan.run(traj, system, device="auto")

print(f"Mean RMSD: {rmsd.mean():.2f} Å")

Performance Notes

Selections are cached internally by the System object. Repeated calls to system.select() with the same expression return the cached result without re-parsing or re-evaluating. The timings in the comments below are approximate.
# First call: parses and evaluates
backbone1 = system.select("backbone")  # ~1ms

# Subsequent calls: instant return from cache
backbone2 = system.select("backbone")  # ~0.001ms
backbone3 = system.select("backbone")  # ~0.001ms

# Different expression: new evaluation
ca = system.select("name CA")  # ~1ms
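
The caching behavior can be pictured as a dictionary keyed by the expression string. This is a sketch of the general memoization pattern, not warp-md's actual implementation. CachedSelector and fake_evaluate are hypothetical names:

```python
class CachedSelector:
    """Memoize selection results by expression string."""

    def __init__(self, evaluate):
        self._evaluate = evaluate  # callable: expression -> iterable of indices
        self._cache = {}

    def select(self, expression):
        # Evaluate only on a cache miss; otherwise return the stored result.
        if expression not in self._cache:
            self._cache[expression] = tuple(self._evaluate(expression))
        return self._cache[expression]

calls = []

def fake_evaluate(expression):
    # Stand-in for parsing and evaluation; records each real evaluation.
    calls.append(expression)
    return [0, 1, 2]

selector = CachedSelector(fake_evaluate)
first = selector.select("backbone")   # evaluated
second = selector.select("backbone")  # served from the cache
other = selector.select("name CA")    # different key, evaluated
print(calls)
```

In this sketch the key is the exact string, so "name CA" and "name  CA" (two spaces) would be cached separately. Whether warp-md normalizes expressions before lookup is not covered here.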

Validation

Invalid selection expressions raise informative errors:
# Unknown keyword
try:
    sel = system.select("unknown_keyword value")
except Exception as e:
    print(e)  # "unknown predicate 'unknown_keyword'"

# Syntax error
try:
    sel = system.select("name CA and")
except Exception as e:
    print(e)  # "unexpected tokens at end of selection"

# Invalid operator
try:
    sel = system.select("name CA & chain A")
except Exception as e:
    print(e)  # "unexpected character '&'"
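
When several expressions come from user input or a config file, it can help to check them all up front and report every failure at once. A minimal sketch follows. collect_errors is a hypothetical helper, and fake_select is a stand-in so the example runs without warp-md. The exception type raised by system.select() is not specified above, so the helper catches Exception:

```python
def collect_errors(select, expressions):
    """Return {expression: error message} for every expression that fails."""
    errors = {}
    for expression in expressions:
        try:
            select(expression)
        except Exception as exc:  # exact exception type is an unknown here
            errors[expression] = str(exc)
    return errors

def fake_select(expression):
    # Hypothetical stand-in: rejects empty input or a dangling operator.
    tokens = expression.split()
    if not tokens or tokens[-1] in {"and", "or", "not"}:
        raise ValueError("unexpected tokens at end of selection")
    return expression

errors = collect_errors(fake_select, ["backbone", "name CA and"])
print(errors)
```

With a real system, the same helper would be called as collect_errors(system.select, expressions).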
