AlphaGenome predicts splicing patterns purely from DNA sequence, with no RNA-seq input required. Three complementary splicing modalities are available: the positions of donor and acceptor splice sites, predicted junction usage frequencies, and per-site splice usage. Together they provide a detailed view of how a locus is spliced across cell types and conditions.

Splice sites

The splice sites modality predicts the positions of 5’ donor and 3’ acceptor sites across the queried interval. Each position in $values receives a score reflecting the model’s confidence that a canonical splice signal exists at that base.

Requested output token: "SPLICE_SITES"
Extractor function: alphagenome_get_splice_sites(response_body)
Returns: list($values, $metadata), where $values is a positions × tracks numeric matrix of splice site scores.

library(AlphaGenomeR)

api_key <- Sys.getenv("ALPHAGENOME_API_KEY")
region  <- "chr17:42560601-43609177"

results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),  # Lung
  requested_outputs = c("SPLICE_SITES")
)

splice_sites_data <- alphagenome_get_splice_sites(results)

# Dimensions: positions x tracks
dim(splice_sites_data$values)

# High-confidence splice site positions
strong_sites <- which(splice_sites_data$values[, 1] > 0.5)
cat("Predicted splice sites:", length(strong_sites), "\n")
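
The indices returned by which() are row offsets into the matrix, not genomic coordinates. The sketch below converts them to positions, assuming that row i of $values corresponds to the base at region start + i - 1. That layout is an assumption this page does not confirm, so check it against the package reference before relying on it. A small simulated matrix stands in for a real response, and the name splice_sites_demo is hypothetical.

```r
# Simulated stand-in for splice_sites_data (made-up values, not model output)
set.seed(1)
region <- "chr17:42560601-42560700"
values <- matrix(runif(100 * 2, min = 0, max = 0.4), nrow = 100, ncol = 2)
values[c(10, 55), 1] <- c(0.9, 0.8)  # plant two strong sites in track 1
splice_sites_demo <- list(values = values)

# Split "chrom:start-end" into its components
parts <- regmatches(region, regexec("^([^:]+):([0-9]+)-([0-9]+)$", region))[[1]]
chrom <- parts[2]
region_start <- as.integer(parts[3])

# ASSUMPTION: row i maps to genomic position region_start + i - 1
strong_rows <- which(splice_sites_demo$values[, 1] > 0.5)
strong_sites <- data.frame(
  chrom    = chrom,
  position = region_start + strong_rows - 1L,
  score    = splice_sites_demo$values[strong_rows, 1]
)
print(strong_sites)
```

If the service turns out to use a different coordinate convention, only the offset in the position line needs to change.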

Splice junctions

Splice junctions represent pairs of donor and acceptor sites that are used together, defining the introns that are removed from a pre-mRNA. Predicted junction usage reflects how frequently each donor–acceptor pair is spliced across the cell types in the model.

Requested output token: "SPLICE_JUNCTIONS"
Extractor function: alphagenome_get_splice_junctions(response_body)
Returns: list($values, $metadata), where $values is a positions × tracks numeric matrix of junction usage scores.

results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),
  requested_outputs = c("SPLICE_JUNCTIONS")
)

splice_junctions_data <- alphagenome_get_splice_junctions(results)

# Inspect metadata for junction annotations
head(splice_junctions_data$metadata)
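
To shortlist the strongest predictions for one track, the rows of $values can be ranked by score. This is a minimal sketch using base R only. The matrix is simulated, and the names junction_values, track, and top_n are illustrative and not part of the package.

```r
# Simulated stand-in for splice_junctions_data$values (made-up values)
set.seed(2)
junction_values <- matrix(runif(50 * 3, min = 0, max = 0.5), nrow = 50, ncol = 3)
junction_values[7, 2] <- 0.95  # plant one clearly dominant row in track 2

track <- 2  # column to rank
top_n <- 5  # number of rows to keep

# Row indices of the top_n highest scores in the chosen track
ord <- order(junction_values[, track], decreasing = TRUE)[seq_len(top_n)]
top_rows <- data.frame(row = ord, score = junction_values[ord, track])
print(top_rows)
```

The row column can then be used to index back into the full matrix when comparing the same rows across other tracks.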

Splice site usage

Splice site usage captures the relative frequency with which individual donor and acceptor sites are utilized. While splice junctions describe pairs of sites, splice usage focuses on each site independently, which is useful for detecting alternative 5’ or 3’ splice site switching events.

Extractor function: alphagenome_get_splice_usage(response_body)
Returns: list($values, $metadata) or NULL, where $values is a positions × tracks numeric matrix of per-site usage fractions.

# Request splicing outputs; splice site usage is populated alongside them
results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),
  requested_outputs = c("SPLICE_SITES", "SPLICE_JUNCTIONS")
)

splice_usage_data <- alphagenome_get_splice_usage(results)

if (!is.null(splice_usage_data)) {
  # Distribution of site usage values
  hist(splice_usage_data$values[, 1],
       xlab = "Splice site usage", main = "Splice usage distribution")
}

Note: alphagenome_get_splice_usage() reads the splice_site_usage attribute from the Python response directly, so it does not require a separate requested_outputs token. The function returns NULL if splice site usage data is not present in the response.
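
Because the extractor may return NULL, downstream helpers need to handle that case explicitly. The sketch below shows one way to do so while flagging sites whose usage differs between two tracks, which is the kind of comparison that points to splice site switching. It assumes a response with at least two tracks (for example, two ontology terms). The function find_usage_shifts and the min_delta threshold are hypothetical and not part of AlphaGenomeR.

```r
# Hypothetical helper: rows whose usage differs between two tracks
find_usage_shifts <- function(usage, track_a = 1, track_b = 2, min_delta = 0.3) {
  if (is.null(usage)) return(NULL)  # pass through the extractor's NULL result
  v <- usage$values
  if (ncol(v) < 2) stop("need at least two tracks to compare")
  delta <- v[, track_b] - v[, track_a]
  rows <- which(abs(delta) >= min_delta)
  data.frame(row = rows, delta = delta[rows])
}

# Simulated usage fractions: rows are positions, columns are two tracks
usage_demo <- list(values = cbind(
  c(0.9, 0.1, 0.5, 0.0),
  c(0.2, 0.8, 0.5, 0.0)
))

shifts <- find_usage_shifts(usage_demo)
print(shifts)

# A missing usage attribute yields NULL rather than an error
is.null(find_usage_shifts(NULL))
```

A positive delta means the site is used more in track_b than in track_a. The 0.3 cutoff is arbitrary and should be tuned to the data.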

Querying all three splicing outputs

Request the three splicing modalities together for a complete picture of splice site architecture at your locus.

1. Query the API with splicing tokens

results <- alphagenome_query(
  access_token     = api_key,
  genomic_region   = region,
  ontology_terms   = c("UBERON:0002048"),  # Lung
  requested_outputs = c("SPLICE_SITES", "SPLICE_JUNCTIONS")
)

Splice site usage data is returned alongside the other splicing outputs, so no additional token is needed.

2. Extract each splicing modality

splice_sites_data     <- alphagenome_get_splice_sites(results)
splice_junctions_data <- alphagenome_get_splice_junctions(results)
splice_usage_data     <- alphagenome_get_splice_usage(results)

3. Visualize splice site scores

par(mfrow = c(2, 1), mar = c(2, 4, 2, 1))
plot(splice_sites_data$values[, 1],
     type = "l", ylab = "Splice site score", main = "Splice sites")
plot(splice_junctions_data$values[, 1],
     type = "l", ylab = "Junction usage", main = "Splice junctions")
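
The two-panel plot can be extended to include splice site usage when it is available. The sketch below draws one panel per non-NULL modality and restores the previous graphics settings afterwards. The helper plot_splicing_panels is hypothetical, and the data are simulated in place of real extractor output.

```r
# Hypothetical helper: one line plot per non-NULL modality (first track only)
plot_splicing_panels <- function(modalities) {
  present <- Filter(Negate(is.null), modalities)
  if (length(present) == 0) return(invisible(0L))
  old_par <- par(mfrow = c(length(present), 1), mar = c(2, 4, 2, 1))
  on.exit(par(old_par))  # restore the caller's layout when done
  for (name in names(present)) {
    plot(present[[name]]$values[, 1], type = "l", ylab = "Score", main = name)
  }
  invisible(length(present))
}

# Simulated extractor outputs; usage is NULL to mimic a response without it
set.seed(3)
demo <- list(
  "Splice sites"      = list(values = matrix(runif(200), ncol = 1)),
  "Splice junctions"  = list(values = matrix(runif(200), ncol = 1)),
  "Splice site usage" = NULL
)

n_panels <- plot_splicing_panels(demo)
cat("Panels drawn:", n_panels, "\n")
```

With real data, the list would hold splice_sites_data, splice_junctions_data, and splice_usage_data under the same names.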

Tip: Combine splicing predictions with predicted RNA-seq coverage by adding "RNA_SEQ" to requested_outputs. Overlaying the expression signal on junction usage helps identify which alternative splice forms are expressed in your tissue of interest.

requested_outputs = c("RNA_SEQ", "SPLICE_SITES", "SPLICE_JUNCTIONS")
