AlphaGenome predicts multiple functional genomic signals from a single 1 Mb (megabase) DNA sequence in one API call. AlphaGenomeR exposes all 11 supported modalities through dedicated extractor functions, each returning a consistent list($values, $metadata) structure that integrates directly with standard R and Bioconductor workflows.

Supported modalities

Each modality is activated by passing its string token to requested_outputs in alphagenome_query(). The corresponding extractor function then pulls that modality’s data out of the response object.
| Function | Modality | requested_outputs value |
| --- | --- | --- |
| alphagenome_get_rna_seq() | RNA-seq gene expression | "RNA_SEQ" |
| alphagenome_get_atac() | ATAC-seq chromatin accessibility | "ATAC" |
| alphagenome_get_cage() | CAGE transcription start sites | "CAGE" |
| alphagenome_get_dnase() | DNase-seq hypersensitivity | "DNASE" |
| alphagenome_get_chip_tf() | ChIP-seq transcription factors | "CHIP_TF" |
| alphagenome_get_chip_histone() | ChIP-seq histone marks | "CHIP_HISTONE" |
| alphagenome_get_splice_sites() | Predicted splice sites | "SPLICE_SITES" |
| alphagenome_get_splice_junctions() | Splice junction predictions | "SPLICE_JUNCTIONS" |
| alphagenome_get_splice_usage() | Splice site usage | "SPLICE_SITE_USAGE" |
| alphagenome_get_procap() | PRO-cap capped RNA | "PROCAP" |
| alphagenome_get_contact_maps() | 3D chromatin contact maps | "CONTACT_MAPS" |
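
The table above can be captured as a lookup vector so that a request is checked before it is sent. The following is a minimal base R sketch: `modality_tokens` and `validate_outputs()` are hypothetical helpers, not part of AlphaGenomeR, and only the tokens and function names are taken from the table.

```r
# Hypothetical lookup: requested_outputs token -> extractor function name.
modality_tokens <- c(
  RNA_SEQ           = "alphagenome_get_rna_seq",
  ATAC              = "alphagenome_get_atac",
  CAGE              = "alphagenome_get_cage",
  DNASE             = "alphagenome_get_dnase",
  CHIP_TF           = "alphagenome_get_chip_tf",
  CHIP_HISTONE      = "alphagenome_get_chip_histone",
  SPLICE_SITES      = "alphagenome_get_splice_sites",
  SPLICE_JUNCTIONS  = "alphagenome_get_splice_junctions",
  SPLICE_SITE_USAGE = "alphagenome_get_splice_usage",
  PROCAP            = "alphagenome_get_procap",
  CONTACT_MAPS      = "alphagenome_get_contact_maps"
)

# Hypothetical helper: stop early on tokens that are not in the table,
# otherwise return the matching extractor names in request order.
validate_outputs <- function(requested) {
  unknown <- setdiff(requested, names(modality_tokens))
  if (length(unknown) > 0) {
    stop("Unknown requested_outputs token(s): ",
         paste(unknown, collapse = ", "))
  }
  unname(modality_tokens[requested])
}

validate_outputs(c("RNA_SEQ", "ATAC"))
```

Because the tokens are case-sensitive strings, a check like this catches typos such as "rna_seq" locally rather than after a round trip to the API.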

Requesting multiple modalities in one call

Pass a character vector to requested_outputs to retrieve several modalities simultaneously. The model runs once and returns all requested signals, so batching is more efficient than making separate calls.
library(AlphaGenomeR)

api_key <- Sys.getenv("ALPHAGENOME_API_KEY")
region  <- "chr17:42560601-43609177"

results <- alphagenome_query(
  access_token      = api_key,
  genomic_region    = region,
  ontology_terms    = c("UBERON:0002048"),  # Lung
  requested_outputs = c("RNA_SEQ", "ATAC", "CAGE")
)

rna_data  <- alphagenome_get_rna_seq(results)
atac_data <- alphagenome_get_atac(results)
cage_data <- alphagenome_get_cage(results)
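
When many modalities are requested, calling each extractor by hand becomes repetitive. The sketch below shows one possible way to apply a named list of extractors and keep only the modalities that came back. `extract_all()` is a hypothetical helper, and the mock response and stand-in extractors exist only so the example runs without an API key; they do not reflect the real layout of the response object.

```r
# Hypothetical helper: apply each extractor to the response and keep only
# the results that are not NULL (i.e. modalities that were requested).
extract_all <- function(results, extractors) {
  out <- lapply(extractors, function(f) f(results))
  Filter(Negate(is.null), out)
}

# Mock response (assumed shape, for illustration only).
mock_results <- list(
  RNA_SEQ = list(
    values   = matrix(runif(20), nrow = 10, ncol = 2),
    metadata = data.frame(track = 1:2)
  )
)

# Stand-in extractors. With the real package this list would instead hold
# functions such as alphagenome_get_rna_seq and alphagenome_get_atac.
mock_extractors <- list(
  rna  = function(x) x$RNA_SEQ,
  atac = function(x) x$ATAC  # not present in the mock, so this yields NULL
)

found <- extract_all(mock_results, mock_extractors)
names(found)
```

Only the "rna" entry survives here, since the mock response carries no ATAC data.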

Return value structure

Every extractor function returns the same two-element list when the modality was requested, or NULL when it was not included in requested_outputs.
$values: A numeric matrix with dimensions positions × tracks. Each row corresponds to a genomic position within the queried interval; each column corresponds to one experimental track (e.g., a specific cell line or tissue replicate). For contact maps, $values is a square matrix of contact frequencies.
$metadata: A data.frame describing each track in $values. Columns typically include cell type, tissue, ontology term, and experimental details drawn from the original training data.
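
To make the relationship between the two elements concrete, here is a small sketch built on mock data. It assumes one metadata row per column of $values, and the metadata column names (`tissue`, `ontology_term`) are illustrative guesses rather than the package's actual schema.

```r
set.seed(1)

# Mock extractor result: 8 positions x 3 tracks.
mock <- list(
  values = matrix(runif(24), nrow = 8, ncol = 3),
  metadata = data.frame(
    tissue        = c("lung", "lung", "liver"),          # assumed column
    ontology_term = c("UBERON:0002048", "UBERON:0002048",
                      "UBERON:0002107"),                 # assumed column
    stringsAsFactors = FALSE
  )
)

# Assumption: row i of $metadata describes column i of $values.
stopifnot(ncol(mock$values) == nrow(mock$metadata))

# Use the metadata to pick tracks, then average them at each position.
lung_cols <- which(mock$metadata$ontology_term == "UBERON:0002048")
lung_mean <- rowMeans(mock$values[, lung_cols, drop = FALSE])

length(lung_mean)  # one value per position
```

Keeping `drop = FALSE` in the subset means the code still works when only one track matches, because the result stays a matrix rather than collapsing to a vector.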
If a modality’s token was not included in requested_outputs, its extractor returns NULL. Always check the return value before accessing $values or $metadata.
atac_data <- alphagenome_get_atac(results)
if (!is.null(atac_data)) {
  plot(atac_data$values[, 1], type = "l")
}
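
In scripts where a missing modality should halt the analysis rather than be silently skipped, the NULL check can be wrapped in a small guard. This is a sketch only; `require_modality()` is a hypothetical helper and not part of AlphaGenomeR.

```r
# Hypothetical helper: return the extractor result unchanged, or stop with a
# message naming the token that was missing from requested_outputs.
require_modality <- function(data, token) {
  if (is.null(data)) {
    stop("No data for '", token,
         "': add it to requested_outputs and re-run the query.")
  }
  data
}

# Stand-in for an extractor result when the modality was not requested.
atac_missing <- NULL

msg <- tryCatch(
  require_modality(atac_missing, "ATAC"),
  error = function(e) conditionMessage(e)
)
msg
```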
All extractor functions accept the raw response_body object returned by alphagenome_query(). Pass the full response — not an individual field — to each extractor.

Modality pages

Gene expression

RNA-seq, CAGE, and PRO-cap predictions for transcription and promoter activity.

Chromatin accessibility

ATAC-seq, DNase-seq, and ChIP-seq predictions for open chromatin and TF binding.

Splicing

Splice site positions, junction usage, and alternative splicing from sequence alone.

3D genome

Hi-C-like chromatin contact maps predicting 3D genome architecture.
