EPUB CFI: canonical fragment identifiers for e-books

Page numbers are meaningless for reflowable text: the same paragraph lands on different pages depending on font size, screen width, and user preferences. EPUB CFIs (Canonical Fragment Identifiers) solve this by addressing content as a path through the document tree rather than by its position on a rendered page. A CFI like epubcfi(/6/4!/4/2/1:42) encodes the exact node and character offset within the spine item, so a bookmark remains valid across re-renders, devices, and reading apps. foliate-js uses CFIs as its primary location format: the relocate event exposes a cfi property, and goTo() accepts CFI strings directly.

What a CFI looks like

A CFI is always wrapped in epubcfi(...). The path before the ! (the indirect reference) identifies the spine item in the package document. The path after it identifies the node within that spine item’s HTML:

epubcfi(/6/4!/4/2/1:42)
         ↑ ↑  ↑ ↑ ↑ ↑
         │ │  └─┴─┴─┴─ path within the HTML document
         │ └── spine item at index 4 (the 2nd item, using even indices)
         └──── spine element at /6 in the package document
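
To make the two halves concrete, here is a minimal sketch that splits a CFI string at the ! and reports which spine item it points to. splitCFI is a hypothetical helper written for this page, not a foliate-js export, and it assumes a single indirection step and no escaped characters inside assertions.

```javascript
// Hypothetical helper for illustration; not part of foliate-js.
// Assumes exactly one "!" and no escaped characters in the CFI.
function splitCFI(cfi) {
  const match = /^epubcfi\((.*)\)$/.exec(cfi)
  if (!match) throw new Error('expected a string wrapped in epubcfi(...)')

  const [packagePath, documentPath = ''] = match[1].split('!')

  // The last step of the package path selects the spine item.
  // Elements sit at even indices, so step /4 is the 2nd child element.
  const steps = packagePath.split('/').filter(Boolean).map(Number)
  const spineStep = steps[steps.length - 1]

  return { packagePath, documentPath, spineItemNumber: spineStep / 2 }
}

console.log(splitCFI('epubcfi(/6/4!/4/2/1:42)'))
// { packagePath: '/6/4', documentPath: '/4/2/1:42', spineItemNumber: 2 }
```

For real CFIs, prefer the library's parse function, which also copes with assertions and escapes.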

Range CFIs have a comma-separated form for marking a span of text:

epubcfi(/6/4!/2,/2,/4)

How foliate-js represents parsed CFIs

The epubcfi.js module parses CFI strings into plain JavaScript values, not class instances. There are two structural forms.

The “part” object

The atomic unit is a part object, corresponding to one step and its optional offset in the CFI:

{
  "index": 4,
  "id": "chapter-start",
  "offset": 42,
  "temporal": 3.5,
  "spatial": [10, 20],
  "text": ["before", "after"],
  "side": "a"
}

In practice, most parts carry only index; the example above sets every field at once purely to show the available keys. The other fields correspond to optional CFI constructs:

Field	CFI construct	Description
`index`	`/n`	Child node index (always even for elements, odd for text nodes)
`id`	`[id]`	ID assertion for the element at this step
`offset`	`:n`	Character offset within a text node
`temporal`	`~n`	Temporal offset (for audio/video media)
`spatial`	`@x:y`	Spatial offset (x and y coordinates)
`text`	`[before,after]`	Text location assertion
`side`	`;s=a` or `;s=b`	Side bias for ambiguous positions
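
The even/odd rule for index can be sketched as follows. This is an illustrative helper (stepIndexOf is a made-up name, not a foliate-js export) that works on any objects with a DOM-style nodeType, so it runs outside a browser. It assumes the sibling list contains only elements and text nodes.

```javascript
// Hypothetical helper for illustration; not part of foliate-js.
// `siblings` is the list of child nodes of one parent, containing only
// elements (nodeType 1) and text nodes (nodeType 3).
function stepIndexOf(siblings, target) {
  let elementsBefore = 0
  for (const node of siblings) {
    if (node === target) {
      return node.nodeType === 1
        ? (elementsBefore + 1) * 2 // elements: 2, 4, 6, ...
        : elementsBefore * 2 + 1   // text: 1, 3, 5, ...
    }
    if (node.nodeType === 1) elementsBefore++
  }
  return -1 // target is not among the siblings
}

const intro = { nodeType: 3 }, p = { nodeType: 1 }
const middle = { nodeType: 3 }, more = { nodeType: 3 }, div = { nodeType: 1 }
const siblings = [intro, p, middle, more, div]

console.log(siblings.map(node => stepIndexOf(siblings, node)))
// [ 1, 2, 3, 3, 4 ]
```

Note that the two adjacent text nodes share the odd index 3: a CFI treats consecutive text between two elements as a single chunk, and the character offset counts across it.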

Collapsed (non-range) CFIs

A collapsed CFI is an array of arrays of parts. Each inner array is one full path segment separated by !. For example, /6/4!/4 parses into:

[
  [
    { "index": 6 },
    { "index": 4 }
  ],
  [
    { "index": 4 }
  ]
]

The first element [{index:6},{index:4}] is the path through the package document (spine element, then spine item). The second element [{index:4}] is the path within the HTML document.
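
Because the parsed form is plain arrays and objects, it is easy to walk. The sketch below turns a collapsed parsed CFI back into a string. It is a simplified stand-in for the library's own stringify logic: the function names are hypothetical, it handles only index, id, and offset, and it does not escape special characters in IDs.

```javascript
// Simplified sketch; not the library's implementation.
// Handles only `index`, `id` and `offset`, with no escaping of IDs.
const partToString = ({ index, id, offset }) =>
  `/${index}` +
  (id != null ? `[${id}]` : '') +
  (offset != null ? `:${offset}` : '')

// Each inner array is one path; paths are joined with "!".
const collapsedToString = paths =>
  `epubcfi(${paths.map(parts => parts.map(partToString).join('')).join('!')})`

console.log(collapsedToString([
  [{ index: 6 }, { index: 4 }],
  [{ index: 4 }],
]))
// epubcfi(/6/4!/4)
```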

Range CFIs

A range CFI parses into an object with parent, start, and end properties, each being an array of arrays of parts (the same type as a collapsed CFI):

{
  "parent": [
    [
      { "index": 6 },
      { "index": 4 }
    ],
    [
      { "index": 2 }
    ]
  ],
  "start": [
    [
      { "index": 2 }
    ]
  ],
  "end": [
    [
      { "index": 4 }
    ]
  ]
}

This represents /6/4!/2,/2,/4: the shared parent path, followed by the start and end paths relative to that parent.
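
To recover the two endpoints as ordinary collapsed CFIs, the relative start and end paths can be appended to the last path of parent. The following is a sketch of that idea using a hypothetical expandRange helper; it is not presented as the library's own code.

```javascript
// Hypothetical helper for illustration; not part of foliate-js.
// Turns { parent, start, end } into two collapsed (array-of-arrays) CFIs.
function expandRange({ parent, start, end }) {
  const head = parent.slice(0, -1)          // paths before the last "!"
  const shared = parent[parent.length - 1]  // steps common to both endpoints
  const attach = tail => [...head, [...shared, ...tail[0]], ...tail.slice(1)]
  return { start: attach(start), end: attach(end) }
}

const expanded = expandRange({
  parent: [[{ index: 6 }, { index: 4 }], [{ index: 2 }]],
  start: [[{ index: 2 }]],
  end: [[{ index: 4 }]],
})

console.log(JSON.stringify(expanded.start))
// [[{"index":6},{"index":4}],[{"index":2},{"index":2}]]   i.e. /6/4!/2/2
console.log(JSON.stringify(expanded.end))
// [[{"index":6},{"index":4}],[{"index":2},{"index":4}]]   i.e. /6/4!/2/4
```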

Key exports from `epubcfi.js`

`isCFI`

A RegExp that tests whether a string is wrapped in epubcfi(...):

import * as CFI from './foliate-js/epubcfi.js'

CFI.isCFI.test('epubcfi(/6/4!/4)')  // true
CFI.isCFI.test('/6/4!/4')           // false

`parse(cfi)`

Parses a CFI string into the nested array (for collapsed CFIs) or { parent, start, end } object (for range CFIs) described above. Accepts both bare CFI paths and the full epubcfi(...) form:

const parsed = CFI.parse('epubcfi(/6/4!/4/2/1:42)')
// [
//   [{ index: 6 }, { index: 4 }],
//   [{ index: 4 }, { index: 2 }, { index: 1, offset: 42 }]
// ]

const rangeParsed = CFI.parse('epubcfi(/6/4!/2,/2,/4)')
// { parent: [...], start: [...], end: [...] }

The parser is a state machine, not a regex, and correctly handles escape sequences within assertions.
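
To illustrate why a state machine suits this job, here is a deliberately small sketch that parses a single path and honors the CFI escape character ^ inside square brackets. It is not the library's parser: parsePath is a hypothetical name, every bracketed assertion is treated as an ID, and only /n steps and :n offsets are recognized.

```javascript
// Simplified sketch; not the parser used by foliate-js.
// Parses one path such as "/4[ch^]1]/2/1:42" into an array of parts.
function parsePath(path) {
  const parts = []
  let current = null  // the part being built
  let state = null    // 'index' | 'assertion' | 'offset' | null
  let buffer = ''
  let escaped = false

  // Store the buffered text on the current part, according to the state.
  const commit = () => {
    if (state === 'index') current.index = Number(buffer)
    else if (state === 'assertion') current.id = buffer
    else if (state === 'offset') current.offset = Number(buffer)
    buffer = ''
  }

  for (const ch of path) {
    if (state === 'assertion') {
      if (escaped) { buffer += ch; escaped = false } // take the character literally
      else if (ch === '^') escaped = true
      else if (ch === ']') { commit(); state = null }
      else buffer += ch
      continue
    }
    if (ch === '/') {
      commit()
      current = {}
      parts.push(current)
      state = 'index'
    } else if (ch === '[') {
      commit()
      state = 'assertion'
    } else if (ch === ':') {
      commit()
      state = 'offset'
    } else if (ch >= '0' && ch <= '9' && (state === 'index' || state === 'offset')) {
      buffer += ch
    } else {
      throw new Error(`unexpected character "${ch}"`)
    }
  }
  commit()
  return parts
}

console.log(parsePath('/4[ch^]1]/2/1:42'))
// [ { index: 4, id: 'ch]1' }, { index: 2 }, { index: 1, offset: 42 } ]
```

A single regular expression would struggle with the escaped ] in the ID, whereas the escaped flag handles it in one extra branch.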

`fromRange(range, filter?)`

Creates a CFI string from a DOM Range:

const selection = window.getSelection()
if (selection.rangeCount > 0) {
  const range = selection.getRangeAt(0)
  const cfi = CFI.fromRange(range)
  console.log(cfi) // e.g., "epubcfi(/4/2,/1:12,/1:30)"
}

For a collapsed range (a cursor position), returns a non-range CFI. For a non-collapsed range (a selection), returns a range CFI.

`toRange(doc, cfi, filter?)`

Resolves a CFI string to a DOM Range within the given document:

const range = CFI.toRange(doc, 'epubcfi(/4/2/1:12)')
if (range) {
  const selection = doc.defaultView.getSelection()
  selection.removeAllRanges()
  selection.addRange(range)
}

`joinIndir(...cfis)`

Joins CFI parts with the indirect reference operator (!). Used by view.js to combine a section’s base CFI with a within-document CFI:

const baseCFI = section.cfi           // e.g., "epubcfi(/6/4)"
const localCFI = CFI.fromRange(range) // e.g., "epubcfi(/4/2/1:12)"

const fullCFI = CFI.joinIndir(baseCFI, localCFI)
// "epubcfi(/6/4!/4/2/1:12)"
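
The behavior described above amounts to unwrapping each argument, joining with !, and wrapping the result again. A standalone sketch of that idea follows; the names are hypothetical and the code is a model of the described behavior, not a copy of the library source.

```javascript
// Sketch of the described behavior; names are hypothetical.
// Strip the epubcfi(...) wrapper if present, otherwise keep the string as is.
const unwrapCFI = cfi => /^epubcfi\((.*)\)$/.exec(cfi)?.[1] ?? cfi

// Join the bare paths with "!" and wrap the result once.
const joinWithIndirection = (...cfis) =>
  `epubcfi(${cfis.map(unwrapCFI).join('!')})`

console.log(joinWithIndirection('epubcfi(/6/4)', 'epubcfi(/4/2/1:12)'))
// epubcfi(/6/4!/4/2/1:12)
```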

`fake.fromIndex(index)` and `fake.toIndex(parts)`

For book formats that have no real package document (MOBI, FictionBook, plain HTML), there is no genuine base CFI for each spine item. The fake helpers create and interpret index-based substitutes:

CFI.fake.fromIndex(0) // "epubcfi(/6/2)"
CFI.fake.fromIndex(1) // "epubcfi(/6/4)"
CFI.fake.fromIndex(2) // "epubcfi(/6/6)"

// Reverse: extract section index from a parsed base CFI part
const parts = CFI.parse('epubcfi(/6/4!/4/2/1:12)')
const index = CFI.fake.toIndex(parts.shift()) // 1

The formula is index → (index + 1) * 2, following the EPUB CFI convention that spine items appear at even indices under the spine element at /6.
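
The formula and its inverse can be checked with a few lines of standalone code. The function names below are hypothetical stand-ins for the fake helpers, written only to show the arithmetic.

```javascript
// Standalone sketch of the arithmetic; names are hypothetical.
// Section index -> fake base CFI: the n-th spine item sits at step (n + 1) * 2.
const fakeCFIFromIndex = index => `epubcfi(/6/${(index + 1) * 2})`

// Spine step -> section index: the inverse of the formula above.
const indexFromSpineStep = step => step / 2 - 1

for (const index of [0, 1, 2]) {
  const step = (index + 1) * 2
  console.log(index, fakeCFIFromIndex(index), indexFromSpineStep(step))
}
// 0 epubcfi(/6/2) 0
// 1 epubcfi(/6/4) 1
// 2 epubcfi(/6/6) 2
```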

Filtering injected nodes

If you inject your own elements into the section document (for search highlights, annotation markers, etc.), those nodes change the child indices that CFI steps count, so CFIs created before and after injection will not match unless the injected nodes are ignored. Both fromRange and toRange accept an optional filter function that works like the filter callback of a TreeWalker:

import * as CFI from './foliate-js/epubcfi.js'

const filter = node => {
  if (node.nodeType !== 1) return NodeFilter.FILTER_ACCEPT
  if (node.matches('.annotation-marker')) return NodeFilter.FILTER_REJECT
  if (node.matches('.injected-wrapper')) return NodeFilter.FILTER_SKIP
  return NodeFilter.FILTER_ACCEPT
}

// Creating a CFI from a selection — injected nodes are ignored
const cfi = CFI.fromRange(range, filter)

// Resolving a stored CFI — injected nodes are skipped during traversal
const resolvedRange = CFI.toRange(doc, cfi, filter)

The three return values behave exactly as in TreeWalker:

NodeFilter.FILTER_ACCEPT — include this node in the CFI path
NodeFilter.FILTER_REJECT — exclude this node and all its descendants
NodeFilter.FILTER_SKIP — exclude this node but still traverse its children (the node is treated as transparent)
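
The effect of the three values on the list of children that CFI steps count can be modeled without a DOM. The sketch below uses plain objects in place of nodes and a hypothetical countedChildren helper; the constants are defined locally with the same numeric values as NodeFilter so that it also runs in Node.js. It illustrates the semantics only and is not the library's code.

```javascript
// Numeric values match the DOM's NodeFilter constants, defined locally so
// the sketch also runs where NodeFilter is not a global (e.g. Node.js).
const FILTER_ACCEPT = 1, FILTER_REJECT = 2, FILTER_SKIP = 3

// Hypothetical helper: the children a CFI step would count under `node`.
function countedChildren(node, filter) {
  return Array.from(node.childNodes).flatMap(child => {
    const verdict = filter(child)
    if (verdict === FILTER_REJECT) return []                           // drop the subtree
    if (verdict === FILTER_SKIP) return countedChildren(child, filter) // splice children in
    return [child]
  })
}

// Plain objects standing in for DOM nodes (nodeType 1 = element, 3 = text).
const text = data => ({ nodeType: 3, data, childNodes: [] })
const element = (className, ...childNodes) => ({ nodeType: 1, className, childNodes })

const paragraph = element('',
  text('Hello '),
  element('injected-wrapper', text('wor')),
  text('ld'),
  element('annotation-marker', text('*')),
)

const filter = node => {
  if (node.nodeType !== 1) return FILTER_ACCEPT
  if (node.className === 'annotation-marker') return FILTER_REJECT
  if (node.className === 'injected-wrapper') return FILTER_SKIP
  return FILTER_ACCEPT
}

console.log(countedChildren(paragraph, filter).map(n => n.data))
// [ 'Hello ', 'wor', 'ld' ]
```

The wrapper disappears but its text survives, while the marker and its content vanish entirely, which is the list the paragraph had before anything was injected.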

Comparing and sorting CFIs

epubcfi.js also exports a compare(a, b) function suitable for sorting arrays of CFI strings:

import { compare } from './foliate-js/epubcfi.js'

const cfis = [
  'epubcfi(/6/6!/4/2/1:10)',
  'epubcfi(/6/2!/4/2/1:5)',
  'epubcfi(/6/4!/4/2/1:80)',
]

cfis.sort(compare)
// [
//   'epubcfi(/6/2!/4/2/1:5)',
//   'epubcfi(/6/4!/4/2/1:80)',
//   'epubcfi(/6/6!/4/2/1:10)',
// ]

compare accepts either CFI strings or pre-parsed values and returns -1, 0, or 1.
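
The ordering itself is essentially a step-by-step numeric comparison. The following sketch shows that idea for the simplest case only; it is not the library's algorithm, the function names are hypothetical, and it assumes collapsed CFIs made of /n steps, ! and an optional trailing :offset, with no assertions.

```javascript
// Simplified sketch, not the library's algorithm. Assumes collapsed CFIs
// made only of /n steps, "!" and an optional trailing :offset.
function parseSimple(cfi) {
  const inner = cfi.replace(/^epubcfi\(/, '').replace(/\)$/, '')
  const [path, offset = '0'] = inner.split(':')
  const steps = path.split(/[\/!]/).filter(Boolean).map(Number)
  return { steps, offset: Number(offset) }
}

function compareSimple(a, b) {
  const x = parseSimple(a), y = parseSimple(b)
  const length = Math.max(x.steps.length, y.steps.length)
  for (let i = 0; i < length; i++) {
    // A missing step means the shorter path is an ancestor, which sorts first.
    const p = x.steps[i] ?? -1
    const q = y.steps[i] ?? -1
    if (p !== q) return p < q ? -1 : 1
  }
  // Same node: fall back to the character offset.
  return Math.sign(x.offset - y.offset)
}

console.log(compareSimple('epubcfi(/6/2!/4/2/1:5)', 'epubcfi(/6/4!/4/2/1:80)')) // -1
console.log(compareSimple('epubcfi(/6/4!/4/2/1:80)', 'epubcfi(/6/4!/4/2/1:5)')) // 1
```

For stored bookmarks, use the exported compare, which also understands range CFIs and the optional constructs that this sketch ignores.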

Environment compatibility

The parsing and comparison code in epubcfi.js does not depend on browser-only globals. The parse, compare, joinIndir, and fake functions work in any JavaScript environment (Node.js, Deno, a service worker, or a browser), which makes them suitable for server-side bookmark sorting or validation. fromRange and toRange require a DOM Document and Range, so they are browser-only unless you supply a compatible DOM implementation.

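// Node.js: parsing and sorting work without a DOM
import * as CFI from './foliate-js/epubcfi.js'

// Hypothetical stored annotations; in practice these come from your own storage
const annotations = [
  { cfi: 'epubcfi(/6/4!/4/2/1:80)' },
  { cfi: 'epubcfi(/6/2!/4/2/1:5)' },
]

const sorted = annotations
  .map(a => a.cfi)
  .sort(CFI.compare)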

Most other foliate-js modules depend on Blob, TextDecoder, DOMParser, URL, and related globals. Only epubcfi.js is truly environment-agnostic for its core parsing and comparison features.

Current limitations

Spatial offsets (@x:y) and temporal offsets (~n) are parsed and round-tripped correctly by parse and the internal stringify logic, but the renderer does not yet use them for positioning. CFIs containing these constructs will resolve to the nearest node rather than the exact spatial or temporal position.
