Chromatin accessibility and protein binding assays reveal which parts of the genome are active and which regulatory factors are engaged at a given locus. AlphaGenomeR exposes four modalities in this category — ATAC-seq, DNase-seq, ChIP-seq for transcription factors, and ChIP-seq for histone marks — all requestable in a single alphagenome_query() call.

ATAC-seq chromatin accessibility

ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) identifies regions of open chromatin where regulatory factors can bind. The predicted ATAC signal profiles chromatin accessibility at base-pair resolution across the queried interval.
Requested output token: "ATAC"
Extractor function: alphagenome_get_atac(response_body)
Returns: list($values, $metadata) — a positions × tracks numeric matrix and a track annotation data frame.
library(AlphaGenomeR)

api_key <- Sys.getenv("ALPHAGENOME_API_KEY")
region  <- "chr17:42560601-43609177"

results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),  # Lung
  requested_outputs = c("ATAC")
)

atac_data <- alphagenome_get_atac(results)

# Prediction matrix: positions x tracks
dim(atac_data$values)

# Plot accessibility signal for the first track (x axis is the row index within the interval)
plot(atac_data$values[, 1], type = "l",
     xlab = "Position in interval (row index)", ylab = "Predicted ATAC signal",
     main = "Chromatin accessibility")
ATAC-seq peaks correspond to rows (positions) in $values where the signal is locally elevated; each column is a separate track. Use $metadata to identify which cell type each track represents before interpreting accessibility profiles.
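
As a rough illustration of picking a track from the metadata and then locating peaks in it, the sketch below runs on a small mock object rather than a real API response. The `biosample_name` column is a hypothetical name chosen for the example; inspect your own `$metadata` to see which columns the package actually returns. It also assumes that metadata row i annotates column i of `$values`.

```r
# Mock stand-in for alphagenome_get_atac() output: 200 positions x 2 tracks.
bump <- function(centre) dnorm(1:200, mean = centre, sd = 5)
atac_mock <- list(
  values   = cbind(bump(50) + 0.5 * bump(150), bump(100)),
  metadata = data.frame(biosample_name = c("lung", "liver"))  # hypothetical column
)

# Pick the column whose metadata row matches the tissue of interest.
track <- which(atac_mock$metadata$biosample_name == "lung")
x <- atac_mock$values[, track]

# A row is a local peak if the slope flips from rising to falling there
# and the signal clears a floor (here 25% of the track maximum).
is_peak   <- c(FALSE, diff(sign(diff(x))) == -2, FALSE) & x > 0.25 * max(x)
peak_rows <- which(is_peak)
peak_rows
```

The 25% floor is arbitrary; on real predictions a quantile-based or biologically motivated cutoff is likely a better choice.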

DNase-seq hypersensitivity

DNase-seq measures DNase I hypersensitive sites (DHSs): regions of open chromatin that are especially sensitive to nuclease cleavage. DHSs mark active regulatory elements including promoters, enhancers, and insulators.
Requested output token: "DNASE"
Extractor function: alphagenome_get_dnase(response_body)
Returns: list($values, $metadata) — a positions × tracks numeric matrix of DNase hypersensitivity signal.
results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),
  requested_outputs = c("DNASE")
)

dnase_data <- alphagenome_get_dnase(results)

# Flag rows above the 95th percentile of track 1 as candidate hypersensitive sites
signal <- dnase_data$values[, 1]
peak_positions <- which(signal > quantile(signal, 0.95))
cat("DNase peak positions (row indices):", head(peak_positions), "\n")
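
Thresholding yields row indices, not genomic coordinates, and neighbouring rows above the cutoff usually belong to the same site. The sketch below merges contiguous rows into intervals and shifts them onto the chromosome. It assumes one row per base pair with row 1 at the interval start, which is an assumption to verify against the shape of your own `$values` matrix, and it uses a toy vector in place of a real track.

```r
region <- "chr17:42560601-43609177"

# Parse "chr:start-end" into chromosome and start coordinate.
parts <- regmatches(region, regexec("^(chr[^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
chrom        <- parts[2]
region_start <- as.integer(parts[3])

# Toy signal standing in for dnase_data$values[, 1].
x     <- c(0, 0, 5, 6, 5, 0, 0, 0, 7, 8, 0, 0)
above <- x > 1  # fixed cutoff keeps the toy example predictable

# Run-length encode the logical vector to find contiguous runs above the cutoff.
runs      <- rle(above)
run_end   <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

# Keep only the TRUE runs and convert row indices to coordinates
# (assumes row i corresponds to region_start + i - 1).
sites <- data.frame(
  chrom = chrom,
  start = region_start + run_start[runs$values] - 1L,
  end   = region_start + run_end[runs$values] - 1L
)
sites
```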

ChIP-seq transcription factor binding

ChIP-seq for transcription factors (TFs) identifies genomic locations where specific TFs are bound. Predicted ChIP-TF signal spans all TFs in the model’s training corpus, with each track corresponding to a TF–cell type combination annotated in $metadata.
Requested output token: "CHIP_TF"
Extractor function: alphagenome_get_chip_tf(response_body)
Returns: list($values, $metadata) — a positions × tracks numeric matrix of TF binding signal.
results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),
  requested_outputs = c("CHIP_TF")
)

chip_tf_data <- alphagenome_get_chip_tf(results)

# Number of TF tracks predicted
ncol(chip_tf_data$values)

# Inspect which TFs are covered
head(chip_tf_data$metadata)
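
Because each column is one TF–cell type combination, a common next step is to keep only the tracks for a factor of interest. The sketch below does this on a mock object. The column names `transcription_factor` and `biosample_name` are hypothetical, and the code assumes metadata row i describes column i of `$values`; check both against the real `$metadata` before reusing it.

```r
# Mock stand-in for alphagenome_get_chip_tf() output: 3 positions x 3 tracks.
chip_tf_mock <- list(
  values   = matrix(c(1, 2, 3,  0, 1, 0,  4, 4, 4), nrow = 3),
  metadata = data.frame(
    transcription_factor = c("CTCF", "POLR2A", "CTCF"),  # hypothetical column
    biosample_name       = c("lung", "lung", "A549")     # hypothetical column
  )
)

# Logical mask over tracks, applied to columns of values and rows of metadata.
keep        <- chip_tf_mock$metadata$transcription_factor == "CTCF"
ctcf_values <- chip_tf_mock$values[, keep, drop = FALSE]
ctcf_meta   <- chip_tf_mock$metadata[keep, , drop = FALSE]

# Sanity check: values and metadata must stay aligned after subsetting.
stopifnot(ncol(ctcf_values) == nrow(ctcf_meta))

# Row index of the strongest predicted binding signal in each retained track.
strongest <- apply(ctcf_values, 2, which.max)
strongest
```

Using `drop = FALSE` keeps the result a matrix even when only one track matches, so downstream code that indexes columns does not break.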

ChIP-seq histone marks

Histone modification ChIP-seq captures the epigenetic state of chromatin. Different marks denote distinct regulatory states: H3K27ac marks active enhancers and promoters, H3K4me3 marks active promoters, H3K4me1 marks enhancers, H3K36me3 marks transcribed gene bodies, and H3K27me3 marks Polycomb-repressed regions.
Requested output token: "CHIP_HISTONE"
Extractor function: alphagenome_get_chip_histone(response_body)
Returns: list($values, $metadata) — a positions × tracks numeric matrix of histone mark signal.
results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),
  requested_outputs = c("CHIP_HISTONE")
)

chip_histone_data <- alphagenome_get_chip_histone(results)

# View the histone marks and cell types covered
print(chip_histone_data$metadata)
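
When several tracks share the same mark, averaging them gives one profile per mark that is easier to compare. The following is a minimal sketch on mock data; `histone_mark` is a hypothetical metadata column name, and the alignment of metadata rows with `$values` columns is assumed rather than confirmed.

```r
# Mock stand-in for alphagenome_get_chip_histone() output: 4 positions x 3 tracks.
histone_mock <- list(
  values   = cbind(c(1, 2, 3, 4), c(0, 0, 8, 0), c(3, 2, 1, 0)),
  metadata = data.frame(histone_mark = c("H3K27ac", "H3K4me3", "H3K27ac"))
)

# Group column indices by mark, then average the columns within each group.
cols_by_mark <- split(seq_len(ncol(histone_mock$values)),
                      histone_mock$metadata$histone_mark)
by_mark <- sapply(cols_by_mark, function(idx) {
  rowMeans(histone_mock$values[, idx, drop = FALSE])
})

# Result: a positions x marks matrix with one named column per mark.
by_mark
```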

Combined accessibility query

Request all four modalities in a single call to get accessibility, TF binding, and histone mark predictions for the same locus.
results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),  # Lung
  requested_outputs = c("ATAC", "DNASE", "CHIP_TF", "CHIP_HISTONE")
)

atac_data        <- alphagenome_get_atac(results)
dnase_data       <- alphagenome_get_dnase(results)
chip_tf_data     <- alphagenome_get_chip_tf(results)
chip_histone_data <- alphagenome_get_chip_histone(results)

# Stack ATAC and DNase signals in two panels to cross-check open chromatin calls
par(mfrow = c(2, 1), mar = c(2, 4, 1, 1))
plot(atac_data$values[, 1],  type = "l", ylab = "ATAC")
plot(dnase_data$values[, 1], type = "l", ylab = "DNase")
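
Visual comparison can be backed by a simple number. The sketch below computes the Pearson correlation between two simulated tracks that share one peak; on real data you would pass `atac_data$values[, 1]` and `dnase_data$values[, 1]` instead, assuming both matrices cover the same positions at the same resolution.

```r
set.seed(42)

# Simulated tracks: a shared open-chromatin peak plus independent noise.
shared      <- dnorm(1:500, mean = 250, sd = 20)
atac_track  <- 100 * shared + rnorm(500, sd = 0.05)
dnase_track <-  80 * shared + rnorm(500, sd = 0.05)

# Both vectors must have the same length for a position-wise comparison.
stopifnot(length(atac_track) == length(dnase_track))

# Pearson correlation across positions; values near 1 indicate concordant signal.
rho <- cor(atac_track, dnase_track)
rho
```

A rank-based alternative is `cor(..., method = "spearman")`, which is less dominated by a few tall peaks but more sensitive to background noise.
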
Extractors return NULL for any modality not included in requested_outputs. Guard against this when writing reusable analysis functions:
atac_data <- alphagenome_get_atac(results)
if (!is.null(atac_data)) {
  # safe to use atac_data$values
}
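
For reusable functions, the NULL check can be wrapped in a small helper that fails early with a clear message. `require_modality()` below is a hypothetical helper written for this example; it is not part of AlphaGenomeR.

```r
# Hypothetical helper (not part of AlphaGenomeR): stop with a clear message
# when an extractor returned NULL, otherwise pass the data through unchanged.
require_modality <- function(data, name) {
  if (is.null(data)) {
    stop(sprintf('No %s predictions found; add "%s" to requested_outputs.',
                 name, name), call. = FALSE)
  }
  data
}

# Case 1: the modality is present (mock object in place of extractor output).
present <- list(values = matrix(0, nrow = 2, ncol = 1),
                metadata = data.frame(track = "a"))
ok <- require_modality(present, "ATAC")

# Case 2: the modality was not requested, so the extractor returned NULL.
msg <- tryCatch(require_modality(NULL, "DNASE"),
                error = function(e) conditionMessage(e))
msg
```

In an analysis function this would read `atac_data <- require_modality(alphagenome_get_atac(results), "ATAC")`.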
