The TTS module in foliate-js converts e-book content to SSML (Speech Synthesis Markup Language) documents. It does not produce audio itself. Instead, its methods return complete <speak> XML strings that you pass to the browser’s Web Speech API, a cloud synthesizer, or any other speech engine. The module understands the structure of the document well enough to walk through it block by block, maintain language markup, and preserve phoneme annotations written with ssml:ph and ssml:alphabet attributes.
Initializing TTS for the current section
Call view.initTTS() to prepare the TTS engine for whichever section is currently loaded. It accepts the following parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| granularity | 'word' \| 'sentence' | 'word' | How the text is segmented. 'word' moves word by word; 'sentence' moves sentence by sentence. |
| highlight | (range: Range) => void | scroll to range | Callback invoked with a DOM Range each time the current word or sentence changes. Use this to visually highlight the spoken text. |
view.initTTS() creates a TTS instance and assigns it to view.tts. If TTS is already initialized for the current section (i.e., the section has not changed), calling it again is a no-op.
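As a minimal sketch of the call described above: the helper below initializes sentence-level TTS and highlights each spoken range by selecting it. The helper names `selectRange` and `initSentenceTTS` are hypothetical, and the sketch assumes the two parameters are positional in the order the table lists them. It awaits the call in case initialization is asynchronous, which is harmless if it is not.

```javascript
// Hypothetical highlight callback: selects the spoken range so the
// browser's selection styling acts as the highlight.
function selectRange(range) {
    const win = range.startContainer.ownerDocument.defaultView
    const selection = win.getSelection()
    selection.removeAllRanges()
    selection.addRange(range)
}

// ASSUMPTION: initTTS takes (granularity, highlight) positionally.
async function initSentenceTTS(view, highlight = selectRange) {
    await view.initTTS('sentence', highlight)
    return view.tts
}
```

Omitting the second argument to view.initTTS() keeps the default behavior of scrolling to the range.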
Call view.initTTS() again after the user navigates to a new section. The TTS instance is tied to a single section document.
The TTS class methods
After calling view.initTTS(), use view.tts to control playback:
next(paused)
Advances to the next block and returns an SSML string for that block. Returns undefined when the end of the section is reached. Pass true for paused to trigger the highlight callback for the block without synthesizing.
prev()
Moves to the previous block and returns its SSML string.
start()
Returns the SSML string for the first block (or resumes from the last known mark position). Call this when starting playback from the beginning of the section.
resume()
Returns the SSML string for the current block, starting from the last mark that was set. Use this after a pause to resume mid-block.
from(range)
Returns the SSML string for the block that contains the given DOM Range, starting from the position within that block that best matches the range. Useful for starting TTS from a user’s text selection.
setMark(mark)
Called by your speech engine’s boundary/mark event handler to tell the TTS module which <mark> element was just reached. This keeps the highlight callback in sync with the synthesizer’s progress. mark is the name string from the SSML <mark> element.
The SSML output format
Each call to next(), prev(), start(), or resume() returns a serialized XML string whose root is a <speak> element. Within it, each <mark> element names a word or sentence boundary (depending on granularity). The marks correspond directly to the DOM ranges tracked internally, so setMark(name) can fire the highlight callback with the matching range.
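To make the shape of the output concrete, here is an illustrative sketch. The SSML string is a hypothetical sample, not captured output: the actual mark names, namespace, and attributes are whatever the module generates. The `markNames` helper is likewise hypothetical and only shows how the mark names that setMark(name) expects could be listed.

```javascript
// Hypothetical SSML for one short block (illustrative only).
const sampleSSML = `<speak xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en">
  <mark name="0"/>Call <mark name="1"/>me <mark name="2"/><emphasis>Ishmael</emphasis>.
</speak>`

// Lists the mark names in document order.
function markNames(ssml) {
    return [...ssml.matchAll(/<mark\b[^>]*\bname="([^"]*)"/g)].map(match => match[1])
}

console.log(markNames(sampleSSML))
```

A speech engine that reports mark events would hand one of these names back to view.tts.setMark() as it reaches each boundary.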
The module also preserves EPUB content semantics:
- <em> and <strong> become SSML <emphasis>
- <br> becomes SSML <break>
- lang attributes become SSML xml:lang
- ssml:ph attributes become SSML <phoneme ph="...">
- ssml:alphabet attributes set the alphabet attribute on <phoneme> elements
There is no support for PLS lexicons or CSS Speech properties.
Example: integrating with the Web Speech API
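The following is a rough sketch rather than a reference integration. It assumes the browser's speechSynthesis reads plain text instead of interpreting SSML, so each block is reduced to text before speaking. `ssmlToText` and `speakSection` are hypothetical helper names, and the `synth` and `Utterance` options exist only so the browser globals can be swapped out.

```javascript
// Rough SSML-to-text conversion: strips tags and decodes the basic XML
// entities. In a browser, parsing with DOMParser would be more robust.
function ssmlToText(ssml) {
    return ssml
        .replace(/<[^>]+>/g, '')
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&quot;/g, '"')
        .replace(/&apos;/g, "'")
        .replace(/&amp;/g, '&')
        .replace(/\s+/g, ' ')
        .trim()
}

// Speaks the section block by block, requesting the next block whenever
// the current utterance ends.
function speakSection(tts, {
    synth = globalThis.speechSynthesis,
    Utterance = globalThis.SpeechSynthesisUtterance,
} = {}) {
    const speak = ssml => {
        if (!ssml) return // undefined: end of the section
        const utterance = new Utterance(ssmlToText(ssml))
        utterance.onend = () => speak(tts.next())
        synth.speak(utterance)
    }
    speak(tts.start())
}
```

In a page this would be called as speakSection(view.tts) after view.initTTS(). Because the marks are stripped along with the other tags, this sketch never calls setMark(); an engine that reports mark names should forward them to view.tts.setMark() to keep the highlight in sync.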
Using a cloud TTS service with full SSML
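A cloud synthesizer that accepts SSML can receive each block unchanged. The sketch below is an assumption-heavy outline: `endpoint` stands for a hypothetical HTTP service that takes raw SSML in the request body and responds with encoded audio, and `synthesizeBlock`, `readSection`, and `play` are hypothetical names. Real services differ in authentication, request format, and how they report mark timings.

```javascript
// Sends one SSML block to the synthesizer and resolves with the audio bytes.
// ASSUMPTION: `endpoint` is a hypothetical service accepting raw SSML.
async function synthesizeBlock(ssml, { endpoint, fetchImpl = globalThis.fetch }) {
    const response = await fetchImpl(endpoint, {
        method: 'POST',
        headers: { 'Content-Type': 'application/ssml+xml' },
        body: ssml,
    })
    if (!response.ok) throw new Error(`Synthesis failed: ${response.status}`)
    return response.arrayBuffer()
}

// Walks the section, handing each block's audio to `play`, which should
// resolve once playback of that block has finished.
async function readSection(tts, play, options) {
    for (let ssml = tts.start(); ssml; ssml = tts.next()) {
        await play(await synthesizeBlock(ssml, options))
    }
}
```

If the service returns timing information for <mark> elements, the `play` callback is the natural place to call view.tts.setMark() as each mark is reached.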
Navigating between sections
view.tts is scoped to the currently loaded section. When the reader moves to a new section, re-initialize:
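One way to automate the re-initialization is sketched below. It assumes the view dispatches a 'load' event whenever a new section document is loaded; confirm the event name against the view's documentation before relying on it. `keepTTSInSync` is a hypothetical helper name.

```javascript
// Re-initializes TTS every time a new section loads.
// ASSUMPTION: `view` is an EventTarget that fires 'load' per section.
function keepTTSInSync(view, granularity = 'word', highlight) {
    const onLoad = () => {
        // initTTS may or may not return a promise; handle both cases.
        Promise.resolve(view.initTTS(granularity, highlight))
            .catch(error => console.error('TTS initialization failed', error))
    }
    view.addEventListener('load', onLoad)
    // Returns a function that stops the automatic re-initialization.
    return () => view.removeEventListener('load', onLoad)
}
```

After re-initializing, playback for the new section begins again with view.tts.start().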