The BDB-Genomics ATAC-seq pipeline is designed to scale down to workstations and laptops with limited RAM. Two complementary mechanisms control memory usage: the profile/low_resource profile caps the memory each individual rule can request, and rules/scripts/run_batched.py serialises sample processing so that only a small subset of samples is active in memory at any given time. Used together, these tools make it possible to run the full pipeline on a machine with as little as 4 GB of RAM, at the cost of longer elapsed wall time.
The low_resource Profile
The low-resource profile lives at profile/low_resource/config.yaml. It sets jobs: 2 so at most two rules run concurrently, applies explicit per-rule memory and thread caps via set-resources, and falls back to 2 GB and 1 thread for any rule not explicitly listed.
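The lookup order described here can be pictured with a small Python model. This is only a sketch: the rule names and the per-rule values are hypothetical placeholders, not the contents of profile/low_resource/config.yaml, and the middle tier (a rule's own declared value when the profile does not list it) is an assumption about how Snakemake generally resolves resources. Only the 2 GB fallback comes from the description above.

```python
# Hypothetical per-rule caps, standing in for the profile's set-resources block.
# The real rule names and values live in profile/low_resource/config.yaml.
PROFILE_SET_RESOURCES = {
    "align": {"mem_mb": 3000},
    "call_peaks": {"mem_mb": 2500},
}

# The profile-wide fallback described in the text (2 GB).
PROFILE_DEFAULT_MEM_MB = 2048


def effective_mem_mb(rule_name, rule_declared_mem_mb=None):
    """Return the memory request (MB) that would apply to one rule.

    Assumed lookup order: profile override first, then the value the rule
    declares itself, then the profile-wide default.
    """
    override = PROFILE_SET_RESOURCES.get(rule_name, {}).get("mem_mb")
    if override is not None:
        return override
    if rule_declared_mem_mb is not None:
        return rule_declared_mem_mb
    return PROFILE_DEFAULT_MEM_MB


# A listed rule is capped even if it declares far more on its own.
print(effective_mem_mb("align", rule_declared_mem_mb=16000))
# An unlisted rule with no declaration of its own gets the fallback.
print(effective_mem_mb("multiqc"))
```

The first call prints 3000 and the second prints 2048 under these placeholder values.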
Profile Configuration
Running with the Low-Resource Profile
The set-resources overrides in the low-resource profile take precedence over the (higher) values declared in config.yaml. This is intentional: the profile enforces a hard ceiling regardless of what each rule’s default resources request.
Sequential Sample Batching with run_batched.py
Even with the low-resource profile, processing all samples simultaneously can cause out-of-memory (OOM) errors on machines with ≤4 GB RAM. rules/scripts/run_batched.py solves this by reading the sample sheet, splitting it into groups of --batch-size samples, and executing Snakemake sequentially for each group. Because Snakemake resumes automatically from completed outputs, results accumulate in results/ across batches without any duplication.
How It Works
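The splitting step can be sketched in a few lines of Python. This is an illustration, not the code of rules/scripts/run_batched.py: the function names are hypothetical, and the sample sheet is assumed to be a TSV with one header row and the sample ID in the first column.

```python
import csv
from pathlib import Path


def read_sample_ids(sample_sheet):
    """Return sample IDs from a TSV.

    Assumes a header row and the sample ID in the first column;
    the real sample sheet layout may differ.
    """
    with Path(sample_sheet).open(newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    return [row[0] for row in rows[1:] if row and row[0].strip()]


def make_batches(sample_ids, batch_size):
    """Split sample IDs into consecutive groups of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        sample_ids[start:start + batch_size]
        for start in range(0, len(sample_ids), batch_size)
    ]


# Three samples with a batch size of 2 give two batches.
for number, batch in enumerate(make_batches(["s1", "s2", "s3"], 2), start=1):
    print(f"batch {number}: {', '.join(batch)}")
```

Each batch would then be handed to one Snakemake invocation, and the next batch starts only after the previous one finishes.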
Arguments
| Argument | Default | Description |
|---|---|---|
| --batch-size | 1 | Number of samples processed per Snakemake invocation |
| --cores | 2 | CPU cores allocated to each batch |
| --memory | 4000 | Memory limit in MB passed via --resources mem_mb= |
| --mode | from config | Pipeline mode: bulk or scatac |
| --config | config.yaml | Path to the main config file |
| --sample-sheet | data/fastp/samples.tsv | Path to the sample TSV |
| --conda-frontend | mamba | Conda solver: mamba or conda |
| --dry-run | flag | Print batch plan without executing |
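For readers who want to see how an interface like this is typically declared, here is a minimal argparse sketch that mirrors the table. It is not taken from run_batched.py. The help strings and the use of None to mean "take the mode from the config file" are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Run the pipeline in sequential sample batches."
    )
    parser.add_argument("--batch-size", type=int, default=1,
                        help="Samples processed per Snakemake invocation")
    parser.add_argument("--cores", type=int, default=2,
                        help="CPU cores allocated to each batch")
    parser.add_argument("--memory", type=int, default=4000,
                        help="Memory limit in MB")
    # Assumption: None stands for "use the mode from the config file".
    parser.add_argument("--mode", choices=["bulk", "scatac"], default=None,
                        help="Pipeline mode")
    parser.add_argument("--config", default="config.yaml",
                        help="Path to the main config file")
    parser.add_argument("--sample-sheet", default="data/fastp/samples.tsv",
                        help="Path to the sample TSV")
    parser.add_argument("--conda-frontend", choices=["mamba", "conda"],
                        default="mamba", help="Conda solver")
    parser.add_argument("--dry-run", action="store_true",
                        help="Print the batch plan without executing")
    return parser


# Parse an example command line instead of sys.argv.
args = build_parser().parse_args(["--batch-size", "2", "--dry-run"])
print(args.batch_size, args.cores, args.memory, args.dry_run)
```

Options that are not given on the command line keep the defaults listed in the table.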
Basic Usage
Dry Run — Preview the Batch Plan
Pass --dry-run to inspect how the sample sheet will be divided into batches before committing to a run.
Combining the Low-Resource Profile with Batching
For machines with ≤4 GB of RAM, use the low-resource profile and the batching script together. The profile caps per-rule memory; the batching script prevents multiple high-memory rules from running for different samples simultaneously. The batching script adds --profile profile/low_resource and --resources mem_mb=4000 to each Snakemake invocation. You do not need to pass the profile flag separately.
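A rough picture of the per-batch command line, written as a Python helper. The function name is hypothetical and this is not the script's actual code. How the script restricts each invocation to the samples of one batch is not described here, so that part is left out.

```python
import shlex


def snakemake_command(cores=2, memory_mb=4000,
                      profile="profile/low_resource", dry_run=False):
    """Assemble the argument list for one Snakemake invocation.

    Only the flags named in the text are included; the batch-specific
    sample selection is intentionally omitted.
    """
    command = [
        "snakemake",
        "--profile", profile,
        "--cores", str(cores),
        "--resources", f"mem_mb={memory_mb}",
    ]
    if dry_run:
        command.append("--dry-run")
    return command


# Render the default invocation as a shell-ready string.
print(shlex.join(snakemake_command()))
```

With the defaults, this prints `snakemake --profile profile/low_resource --cores 2 --resources mem_mb=4000`. Building the command as a list rather than a string lets it be passed directly to subprocess.run without shell quoting problems.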
Choosing the Right Configuration
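As a quick orientation, the guidance in this section can be condensed into a lookup on available RAM. The function is a hypothetical helper, and the cut-offs between tiers (below 8 GB, below 16 GB) are assumptions, since the section only names 4, 8, and 16 GB machines.

```python
def recommend_settings(ram_gb):
    """Map available RAM (GB) to the settings suggested in this section.

    The boundaries between tiers are assumptions; adjust them to taste.
    """
    if ram_gb < 8:
        # Smallest tier: one sample at a time through the batching script.
        return {"runner": "run_batched.py", "batch_size": 1,
                "cores": 2, "memory_mb": 4000}
    if ram_gb < 16:
        # Middle tier: low-resource profile with two samples per batch.
        return {"runner": "run_batched.py", "profile": "profile/low_resource",
                "batch_size": 2, "cores": 4}
    # Workstation tier: the default local profile, no batching.
    return {"runner": "snakemake", "profile": "profile/local"}


for ram in (4, 8, 16):
    print(ram, recommend_settings(ram))
```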
≤4 GB RAM
Use --batch-size 1 --cores 2 --memory 4000. One sample runs at a time. Expect significantly longer total run times.
8 GB RAM, 4 cores
Use --profile profile/low_resource with --batch-size 2 --cores 4. Two samples run concurrently within the per-rule memory caps.
16 GB RAM workstation
Use --profile profile/local directly. The default local profile (jobs: 8) handles up to 8 concurrent jobs without memory restrictions.
Validating setup
Run the test profile first: snakemake --profile profile/test. It applies relaxed QC thresholds designed for synthetic CI datasets that complete quickly on any hardware.
Validating Your Setup Before a Full Run
Before committing to a multi-hour run on limited hardware, generate synthetic test data and execute a dry run to confirm the configuration is valid.
Monitoring Progress on Low-Resource Machines
On machines without a job scheduler, watch Snakemake’s console output directly. The printshellcmds: true setting in the low-resource profile echoes every shell command as it runs. For longer runs, redirect output to a log file.
Benchmark files are written to benchmarks/ after each rule completes and aggregated into results/reporting/benchmark_summary.tsv at the end of the run.