alphagenome_query() is the primary entry point for the AlphaGenomeR package. It authenticates with the Google DeepMind AlphaGenome API, maps your genomic region and requested output modalities to the underlying Python client, executes the prediction, and returns the raw response as an R list. Pass that list to any of the extractor functions (e.g., alphagenome_get_rna_seq()) to pull out individual modality data.

Function signature

alphagenome_query(
  access_token,
  genomic_region,
  organism = "HOMO_SAPIENS",
  requested_outputs = c("RNA_SEQ", "ATAC", "CAGE"),
  ontology_terms = NULL
)

Parameters

access_token
character
required
API key for the AlphaGenome API. Obtain this from your Google DeepMind API account. The key is passed directly to the underlying Python alphagenome client via ag_dna$create(api_key = access_token).
genomic_region
character
required
Genomic region to query in "chr:start-end" format. The chromosome, start position, and end position are parsed and used to construct an Interval object for the Python client.
Example: "chr17:42560601-43609177"
Note: The AlphaGenome model typically requires an interval of approximately 1 Mb (in the example above, end minus start is 1,048,576). Regions that are too short or too long may be rejected by the API.
organism
character
default: "HOMO_SAPIENS"
Organism identifier string. Must match a valid Organism enum value in the alphagenome Python package. The default is human.
Common values: "HOMO_SAPIENS", "MUS_MUSCULUS"
requested_outputs
character[]
default: c("RNA_SEQ", "ATAC", "CAGE")
Character vector of modality names to request from the API. Each value is mapped to an OutputType enum in the Python client. Only modalities listed here will be present in the response; all others return NULL from their extractor functions.
Valid values:
Value                 Extractor function
"RNA_SEQ"             alphagenome_get_rna_seq()
"ATAC"                alphagenome_get_atac()
"CAGE"                alphagenome_get_cage()
"DNASE"               alphagenome_get_dnase()
"CHIP_TF"             alphagenome_get_chip_tf()
"CHIP_HISTONE"        alphagenome_get_chip_histone()
"SPLICE_SITES"        alphagenome_get_splice_sites()
"SPLICE_JUNCTIONS"    alphagenome_get_splice_junctions()
"PROCAP"              alphagenome_get_procap()
"CONTACT_MAPS"        alphagenome_get_contact_maps()
ontology_terms
character[]
Optional character vector of UBERON or CL ontology terms for tissue or cell type filtering (e.g., "UBERON:0002048" for lung). When NULL (the default), no tissue filtering is applied and the API returns predictions across all available tissue contexts.
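
Before calling the API, it can help to check a region string locally. The following is a minimal sketch of parsing the "chr:start-end" format in base R. parse_region() is a hypothetical helper, not part of AlphaGenomeR, and computing the width as end minus start is an assumption about the interval convention.

```r
# Hypothetical helper (not part of AlphaGenomeR): split a "chr:start-end"
# string into its components and report the interval width.
parse_region <- function(genomic_region) {
  pattern <- "^([^:]+):([0-9]+)-([0-9]+)$"
  parts <- regmatches(genomic_region, regexec(pattern, genomic_region))[[1]]
  # A successful match yields the full string plus three capture groups.
  if (length(parts) != 4) {
    stop("genomic_region must be in 'chr:start-end' format.")
  }
  start <- as.numeric(parts[3])
  end <- as.numeric(parts[4])
  # Assumption: width is end - start.
  list(chromosome = parts[2], start = start, end = end, width = end - start)
}

region <- parse_region("chr17:42560601-43609177")
region$width  # 1048576
```

A check like this catches malformed strings before any network call is made; whether a given width is accepted is still decided by the API.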

Return value

response
list
An R list produced by converting the Python prediction object via reticulate::py_to_r(). The list contains named slots corresponding to each requested output modality. Use the modality-specific extractor functions to access structured $values and $metadata from each slot.

Errors

The function will stop with an informative message under these conditions:
  • Missing access_token: "API key is not provided."
  • Missing genomic_region: "Genomic region is not provided."
  • Malformed genomic_region: "genomic_region must be in 'chr:start-end' format."
  • Invalid organism: the message lists the available Organism enum values from the Python package
  • Invalid entry in requested_outputs: the message lists the available OutputType enum values
  • alphagenome Python package not installed: "The 'alphagenome' Python package is not installed. Please run: pip install alphagenome"
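
Because these conditions are raised with stop(), they can be caught with base R's tryCatch(). The sketch below shows one way to do that. fake_query() is a stand-in defined here only so the example runs without the package or network access; treating an empty string as a missing key is an assumption of the stand-in, not documented behavior of alphagenome_query(). safe_query() is likewise a hypothetical wrapper.

```r
# Stand-in for alphagenome_query(), for illustration only. It raises one of
# the documented messages when the key is missing or empty (assumption).
fake_query <- function(access_token, genomic_region) {
  if (missing(access_token) || !nzchar(access_token)) {
    stop("API key is not provided.")
  }
  list()
}

# Hypothetical wrapper: report the error message and return NULL on failure.
safe_query <- function(...) {
  tryCatch(
    fake_query(...),
    error = function(e) {
      message("AlphaGenome query failed: ", conditionMessage(e))
      NULL
    }
  )
}

result <- safe_query(
  access_token   = "",
  genomic_region = "chr17:42560601-43609177"
)
is.null(result)  # TRUE
```

In real use, replace fake_query() with alphagenome_query(). Returning NULL on failure lets a batch loop over many regions continue after one bad input.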

Example

library(AlphaGenomeR)

api_key <- Sys.getenv("ALPHAGENOME_API_KEY")

# Query RNA-seq, ATAC, and CAGE for a 1 Mb region on chr17
# filtered to lung tissue (UBERON:0002048)
results <- alphagenome_query(
  access_token      = api_key,
  genomic_region    = "chr17:42560601-43609177",
  organism          = "HOMO_SAPIENS",
  requested_outputs = c("RNA_SEQ", "ATAC", "CAGE"),
  ontology_terms    = c("UBERON:0002048")
)

# Extract each modality
rna_data    <- alphagenome_get_rna_seq(results)
atac_data   <- alphagenome_get_atac(results)
cage_data   <- alphagenome_get_cage(results)

# Inspect the RNA-seq output
dim(rna_data$values)    # numeric matrix: positions x tracks
head(rna_data$metadata) # data.frame with track annotations

Tip: Store your API key in an environment variable (e.g., via .Renviron) rather than hard-coding it in scripts. Use Sys.getenv("ALPHAGENOME_API_KEY") to retrieve it at runtime.
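
Sys.getenv() returns an empty string when a variable is unset, so a missing key can otherwise go unnoticed until the query fails. A minimal sketch of an early check follows. get_api_key() is a hypothetical helper, not part of AlphaGenomeR.

```r
# Hypothetical helper: read the key from the environment and fail early
# with a clear message when it is unset or empty.
get_api_key <- function(var = "ALPHAGENOME_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set or is empty.")
  }
  key
}

# Usage (assumes the variable is defined, e.g. in .Renviron):
# api_key <- get_api_key()
```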
