The config.yaml file is the single source of truth for the BDB-Genomics ATAC-seq framework. Every tool, path, resource limit, and QC threshold is declared here; Snakemake rules are stateless wrappers that read from this config at runtime. YAML anchors (e.g., &GENOME_FA, *GENOME_FA) centralise reference file paths so that changing one entry propagates everywhere automatically. Every tool block follows a uniform schema - input → output → params → threads → resources - making it straightforward to add new stages without touching existing ones.
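Because every tool block shares one schema, it can be checked mechanically. Below is a minimal Python sketch, assuming the block has already been parsed into a dict. The `validate_tool_block` helper and the example values are hypothetical and not part of the framework.

```python
# Keys every tool block is expected to declare, per the schema described above.
REQUIRED_KEYS = ("input", "output", "params", "threads", "resources")


def validate_tool_block(name, block):
    """Return a list of problems found in one parsed tool block (hypothetical helper)."""
    problems = [f"{name}: missing '{key}'" for key in REQUIRED_KEYS if key not in block]
    threads = block.get("threads")
    if threads is not None and (not isinstance(threads, int) or threads < 1):
        problems.append(f"{name}: 'threads' must be a positive integer")
    return problems


# Illustrative block only; the real keys and values live in config.yaml.
example = {
    "input": "results/preprocessing/fastp/",
    "output": "results/alignment/bowtie2/",
    "params": {},
    "threads": 8,
}
print(validate_tool_block("bowtie2", example))
```

The example block deliberately omits `resources`, so the check reports one problem.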
Raw FASTQ paths are not declared directly in config.yaml. They are resolved dynamically from the sample sheet defined at global.samples (a TSV file with columns sample, fastq_r1, fastq_r2, replicate, condition).
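A sample sheet with these columns could be read as follows. This is a sketch using only the Python standard library. The `read_sample_sheet` helper and the demo row are assumptions for illustration, and the pipeline's own loader may work differently.

```python
import csv
import io

EXPECTED_COLUMNS = ["sample", "fastq_r1", "fastq_r2", "replicate", "condition"]


def read_sample_sheet(handle):
    """Parse a tab-separated sample sheet into a list of row dicts."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"sample sheet is missing columns: {missing}")
    return list(reader)


# Made-up row for illustration; real paths come from the file at global.samples.
demo = io.StringIO(
    "sample\tfastq_r1\tfastq_r2\treplicate\tcondition\n"
    "S1\tdata/S1_R1.fastq.gz\tdata/S1_R2.fastq.gz\t1\tcontrol\n"
)
rows = read_sample_sheet(demo)
print(rows[0]["fastq_r1"])
```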
Controls pipeline-wide behaviour, modality selection, and all shared reference file paths.
Purpose: Declares the analysis mode, sample sheet location, and YAML anchors for genome references used by every downstream tool.
Pipeline modality. Use "bulk" for standard bulk ATAC-seq or "scatac" for single-cell ATAC-seq. Can also be overridden at runtime with the ATAC_MODE environment variable.
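The override order can be sketched in Python as below. It assumes the environment variable takes precedence over `global.mode` whenever it is set and non-empty. The `resolve_mode` helper is hypothetical, and the framework's actual resolution logic may differ.

```python
import os

VALID_MODES = {"bulk", "scatac"}


def resolve_mode(config, environ=None):
    """Pick the modality, letting ATAC_MODE override global.mode when set."""
    environ = os.environ if environ is None else environ
    mode = environ.get("ATAC_MODE") or config["global"]["mode"]
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    return mode


# The environment is passed explicitly here so the example is reproducible.
print(resolve_mode({"global": {"mode": "bulk"}}, environ={"ATAC_MODE": "scatac"}))
```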
Path to the reference genome FASTA file. Anchored as &GENOME_FA; referenced by bowtie2, picard, TOBIAS, chromVAR, footprinting, and peak annotation rules.
Purpose: Paired-end alignment of trimmed reads to the reference genome (bulk ATAC-seq mode).
Input: results/preprocessing/fastp/
Output: results/alignment/bowtie2/
Purpose: Fast single-cell ATAC-seq read alignment using the Chromap aligner (--preset atac). Only active when global.mode = "scatac".
Input: results/preprocessing/fastp/
Output: results/alignment/chromap/
Purpose: Calculates mitochondrial read fractions before deduplication for QC reporting.
Input: results/post_alignment/samtools_sort/
Output: results/post_alignment/mito-ATAC/
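The mitochondrial fraction is the share of mapped reads that fall on the mitochondrial contig. The Python sketch below computes it from per-contig counts in the `samtools idxstats` column layout. The contig name `chrM` and the `mito_fraction` helper are assumptions, and the pipeline's own rule may derive the value differently.

```python
def mito_fraction(idxstats_text, mito_name="chrM"):
    """Fraction of mapped reads on the mitochondrial contig.

    Expects text in `samtools idxstats` layout:
    name, length, mapped reads, unmapped reads (tab-separated).
    """
    total = 0
    mito = 0
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        total += int(mapped)
        if name == mito_name:  # exact match, so similarly named contigs are not counted
            mito += int(mapped)
    return mito / total if total else 0.0


# Synthetic counts: 900 nuclear reads, 100 mitochondrial reads.
demo = "chr1\t248956422\t900\t0\nchrM\t16569\t100\t0\n*\t0\t0\t50\n"
print(mito_fraction(demo))
```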
Purpose: Fills in mate-score tags required by samtools markdup.
Input: results/post_alignment/samtools_sort/
Output: results/post_alignment/samtools_fixmate/
Purpose: Indexes the deduplicated BAM so Picard and downstream rules can random-access it.
Input: results/post_alignment/samtools_markdup/
Output: results/post_alignment/samtools_index/post_markdup/
Purpose: Removes mitochondrial reads from the deduplicated BAM using exact chromosome matching.
Input: results/post_alignment/samtools_markdup/
Output: results/post_alignment/remove_mito_reads/
Purpose: Filters reads by MAPQ and SAM flags, retaining only high-quality properly paired alignments.
Input: results/post_alignment/remove_mito_reads/
Output: results/post_alignment/samtools_view/
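The MAPQ and flag filter amounts to a small predicate on each read. The Python sketch below mirrors the semantics of `samtools view -q/-f/-F`. The threshold of 30 and the particular flag sets are assumed values for illustration; the real ones are whatever config.yaml declares.

```python
# SAM flag bits from the SAM specification.
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
QC_FAIL = 0x200
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800

# Assumed thresholds for illustration; the real values are set in config.yaml.
MIN_MAPQ = 30
REQUIRE = PROPER_PAIR
EXCLUDE = UNMAPPED | MATE_UNMAPPED | SECONDARY | QC_FAIL | DUPLICATE | SUPPLEMENTARY


def keep_read(flag, mapq):
    """Same decision as `samtools view -q MIN_MAPQ -f REQUIRE -F EXCLUDE`."""
    return mapq >= MIN_MAPQ and (flag & REQUIRE) == REQUIRE and (flag & EXCLUDE) == 0


print(keep_read(flag=99, mapq=42))    # proper pair, first in pair, mate on reverse strand
print(keep_read(flag=1123, mapq=42))  # the same read with the duplicate bit set
```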
Purpose: Indexes the blacklist-filtered BAM. The .bai index file is placed alongside the BAM in the same directory.
Input / Output: results/post_alignment/samtools_view/
Purpose: Applies the Tn5 transposase insertion bias shift (+4 bp on the forward strand, −5 bp on the reverse strand) using alignmentSieve.
Input: results/post_alignment/samtools_view/
Output: results/post_alignment/tn5_shift/ (shifted BAM + index)
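The offsets can be illustrated on a single read with a short Python sketch. It only shows the coordinate arithmetic, assuming 0-based half-open coordinates. It is not a reimplementation of alignmentSieve, which operates on fragments in the BAM, and the `tn5_shift` helper is hypothetical.

```python
def tn5_shift(start, end, strand, plus_offset=4, minus_offset=-5):
    """Move a read's 5' end onto the centre of the Tn5 insertion site.

    Coordinates are 0-based, half-open. A forward-strand read has its 5' end
    at `start`; a reverse-strand read has its 5' end at `end`.
    """
    if strand == "+":
        return start + plus_offset, end
    if strand == "-":
        return start, end + minus_offset
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


print(tn5_shift(100, 150, "+"))
print(tn5_shift(300, 350, "-"))
```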
Purpose: Generates comprehensive alignment statistics used by the QC gate (mapping rate, duplicate rate).
Input: results/post_alignment/remove_mito_reads/
Output: results/post_alignment/samtools_stats/
Purpose: Calculates TSS enrichment score by computing normalised signal across ±2 kb windows around annotated transcription start sites.
Input: Tn5-shifted BAM + index (results/post_alignment/tn5_shift/)
Output: results/metrics_qc/tss_enrichment/
Script: rules/scripts/tss_enrichment.R
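One common way to turn an aggregate ±2 kb profile into a score is to divide by the mean coverage of the outer flanks and read off the value at the window centre. The Python sketch below follows that convention. The 100 bp flank width and the `tss_enrichment` helper are assumptions, and rules/scripts/tss_enrichment.R may normalise differently.

```python
def tss_enrichment(profile, flank=100):
    """Score an aggregate coverage profile centred on the TSS.

    `profile` holds summed coverage per position across all TSS windows
    (for example 4001 values for a +/-2 kb window).
    """
    if len(profile) <= 2 * flank:
        raise ValueError("profile must be longer than both flanks combined")
    background = (sum(profile[:flank]) + sum(profile[-flank:])) / (2 * flank)
    if background == 0:
        raise ValueError("flank coverage is zero; cannot normalise")
    normalised = [value / background for value in profile]
    return normalised[len(profile) // 2]  # value at the window centre (the TSS)


# Synthetic profile: flat background of 10 with a peak of 80 at the centre.
demo = [10.0] * 4001
demo[2000] = 80.0
print(tss_enrichment(demo))
```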
Purpose: Collects insert-size distribution metrics used for fragment size analysis and nucleosome banding QC.
Input: results/post_alignment/samtools_markdup/
Output (metrics + histogram): results/metrics_qc/picard/CollectInsertSizeMetrics/
Purpose: Estimates library complexity and predicts yield at higher sequencing depths using a curve extrapolation model.
Input: results/post_alignment/remove_mito_reads/
Output: results/reporting_qc/preseq/
Purpose: Converts the Tn5-shifted BAM to a raw bedGraph coverage track.
Input: results/post_alignment/tn5_shift/
Output: results/visualization/bedtools_genomecov/
Purpose: Sorts the bedGraph file by chromosome and coordinate (required by bedGraphToBigWig).
Input: results/visualization/bedtools_genomecov/
Output: results/visualization/sorted_bedgraph_file/
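The required ordering is chromosome name first, then numeric start coordinate. A minimal Python sketch of that ordering on in-memory records follows. The `sort_bedgraph` helper is hypothetical, and the pipeline rule itself presumably sorts files on disk.

```python
def sort_bedgraph(lines):
    """Sort bedGraph records by chromosome name, then by start coordinate."""
    def key(line):
        chrom, start, _end, _value = line.split("\t")
        return chrom, int(start)  # start compared as a number, not as text
    return sorted(lines, key=key)


demo = [
    "chr2\t0\t50\t1.0",
    "chr10\t5\t20\t3.0",
    "chr1\t100\t200\t2.0",
    "chr1\t10\t50\t4.0",
]
for record in sort_bedgraph(demo):
    print(record)
```

Chromosome names compare as plain strings here, so `chr10` sorts before `chr2`.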
Purpose: Converts the sorted bedGraph to a binary BigWig file for genome browser visualisation.
Input: results/visualization/sorted_bedgraph_file/
Output: results/visualization/bigwig/
Purpose: Generates CPM-normalised BigWig tracks for cross-sample comparability using bamCoverage.
Input: results/post_alignment/tn5_shift/
Output: results/visualization/normalized_coverage/
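CPM scaling divides each bin's read count by the total number of mapped reads and multiplies by one million, which is what makes tracks from libraries of different depth comparable. A small Python sketch of the arithmetic, with a hypothetical `cpm` helper and made-up counts:

```python
def cpm(bin_counts, total_mapped_reads):
    """Scale raw per-bin read counts to counts per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return [count * scale for count in bin_counts]


# Three bins from a library with two million mapped reads.
print(cpm([5, 20, 0], total_mapped_reads=2_000_000))
```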
Purpose: Calls chromatin accessibility peaks from the Tn5-shifted BAM in paired-end BAM mode.
Input: results/post_alignment/tn5_shift/
Output: results/peak_calling/macs2_peakcall/
Purpose: Removes peaks overlapping ENCODE blacklist regions from the MACS2 narrowPeak output.
Input: results/peak_calling/macs2_peakcall/
Output: results/peak_calling/filtered_peaks/
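Blacklist filtering keeps only peaks that share no base with any blacklist interval, the same effect as `bedtools intersect -v`. The Python sketch below shows the overlap test on in-memory tuples. The helper names and coordinates are illustrative assumptions, and which tool the rule actually invokes is not stated here.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two 0-based, half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def filter_blacklisted(peaks, blacklist):
    """Drop every peak that overlaps a blacklist interval on the same chromosome."""
    by_chrom = {}
    for chrom, start, end in blacklist:
        by_chrom.setdefault(chrom, []).append((start, end))
    return [
        (chrom, start, end)
        for chrom, start, end in peaks
        if not any(
            overlaps(start, end, b_start, b_end)
            for b_start, b_end in by_chrom.get(chrom, [])
        )
    ]


peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
blacklist = [("chr1", 150, 400)]
print(filter_blacklisted(peaks, blacklist))
```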
Purpose: Annotates filtered peaks with genomic features (promoter, intron, exon, intergenic) using HOMER or ChIPseeker.
Input: results/peak_calling/filtered_peaks/
Output: results/peak_calling/peak_annotation/
Purpose: Performs de novo and known motif enrichment analysis in filtered peaks using HOMER.
Input: results/peak_calling/filtered_peaks/
Output: results/peak_calling/motif_analysis/
Purpose: Counts reads in consensus peak regions across all samples to build a count matrix for DESeq2 differential accessibility analysis.
Input: results/post_alignment/tn5_shift/
Output: results/peak_calling/count_peaks/
Purpose: Chromatin co-accessibility analysis using Cicero: identifies co-accessible regulatory elements and calls Cis-Co-Accessibility Networks (CCANs).
Inputs:
Purpose: Aggregates per-rule Snakemake benchmark files into a single TSV summary of wall-clock time, CPU time, and peak memory usage across all pipeline stages.
Output: results/reporting/benchmark_summary.tsv
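Snakemake benchmark files are small TSVs with columns such as `s`, `max_rss`, and `cpu_time`, one row per repeat. The Python sketch below reduces one such file to a summary row. The output field names and the choice of taking the maximum across repeats are assumptions, not the framework's documented behaviour.

```python
import csv
import io


def summarise_benchmark(rule, text):
    """Reduce one Snakemake benchmark TSV (one row per repeat) to a summary row."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    return {
        "rule": rule,
        "wall_clock_s": max(float(r["s"]) for r in rows),
        "cpu_time_s": max(float(r["cpu_time"]) for r in rows),
        "peak_rss_mb": max(float(r["max_rss"]) for r in rows),
    }


# Synthetic benchmark content with the standard Snakemake header.
demo = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\tcpu_time\n"
    "12.5\t0:00:12\t850.2\t1200.0\t800.0\t820.0\t10.0\t5.0\t95.0\t11.9\n"
)
print(summarise_benchmark("bowtie2", demo))
```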
multiqc
Purpose: Aggregates QC metrics from fastp, FastQC, Picard, samtools, preseq, Qualimap, and the QC gate into a single interactive HTML report.
Output: results/reporting/multiqc/