PuppeteerProvider drives a real headless Chromium browser through the Puppeteer library. Unlike FetchProvider, it executes JavaScript on every page it visits, which means single-page applications, lazy-loaded content, and client-side-rendered portfolios are all resolved correctly. A single browser process is launched on first use and reused across multiple getPageContent calls; each URL gets its own fresh page that is closed immediately after.
Installation
Puppeteer is listed as a direct dependency of @clyrisai/gitresolve and is downloaded automatically when the package is installed.
If BROWSER_PROVIDER=puppeteer gitresolve reports that the provider is unavailable, install Puppeteer globally alongside it (for example, with npm install -g puppeteer).
Usage
Direct instantiation
Always call provider.cleanup() in a finally block so the browser process is terminated even if an error occurs.
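The cleanup rule above can be wrapped in a small helper. This is a minimal sketch, not the library's own code: PageProvider is a hypothetical interface that mirrors the two methods named on this page (getPageContent and cleanup), withProvider is a hypothetical helper, and StubProvider stands in for PuppeteerProvider so the example runs without launching a browser.

```typescript
// Hypothetical interface mirroring the two methods this page documents.
// Whether cleanup() is async in the real provider is an assumption;
// awaiting it is harmless either way.
interface PageProvider {
  getPageContent(url: string): Promise<string>;
  cleanup(): Promise<void> | void;
}

// Runs `work` and always calls cleanup(), even when `work` throws.
async function withProvider<T>(
  provider: PageProvider,
  work: (p: PageProvider) => Promise<T>,
): Promise<T> {
  try {
    return await work(provider);
  } finally {
    await provider.cleanup();
  }
}

// Stand-in for PuppeteerProvider (illustration only, no browser involved).
class StubProvider implements PageProvider {
  cleanedUp = false;

  async getPageContent(url: string): Promise<string> {
    if (!url.startsWith("http")) {
      throw new Error(`Unsupported URL: ${url}`);
    }
    return `<html><body>${url}</body></html>`;
  }

  async cleanup(): Promise<void> {
    this.cleanedUp = true;
  }
}
```

With the real provider, the same helper would presumably be called as withProvider(new PuppeteerProvider(), (p) => p.getPageContent(url)), so no call site can forget the finally block.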
Via the factory
CLI
To use PuppeteerProvider from the command line, set the BROWSER_PROVIDER environment variable to puppeteer when invoking gitresolve.
How it works
Lazy browser launch
The first call to getPageContent triggers ensureBrowser(), which imports Puppeteer dynamically and launches Chromium with the flags --no-sandbox and --disable-setuid-sandbox. Subsequent calls reuse the same browser instance.
New page per URL
For every getPageContent call, a fresh browser page (browser.newPage()) is created. This prevents cookies, local storage, and cached state from leaking between requests.
Navigation and content extraction
Puppeteer navigates to the URL using page.goto(url, { waitUntil, timeout }) and then calls page.content() to retrieve the fully rendered HTML, including all content injected by JavaScript.
Page teardown
The page is closed in a finally block after each call, regardless of whether navigation succeeded or threw an error.
Launch arguments
Chromium is always started with the following flags:
| Flag | Purpose |
|---|---|
| --no-sandbox | Required in Docker containers and most CI environments where the kernel sandbox is restricted |
| --disable-setuid-sandbox | Disables the setuid sandbox as a complementary measure for the same environments |
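The lifecycle described under "How it works" can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the actual PuppeteerProvider source: LazyBrowserProvider, PageLike, BrowserLike, and Launcher are hypothetical names, and the browser launcher is injected so the sketch runs without Puppeteer installed. Only the calls named in this section (newPage, goto, content, close) are modelled.

```typescript
// Structural types covering only the Puppeteer calls named in this section.
interface PageLike {
  goto(url: string, options: { waitUntil: string; timeout: number }): Promise<unknown>;
  content(): Promise<string>;
  close(): Promise<void>;
}
interface BrowserLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
// Hypothetical injection point; the real provider imports Puppeteer itself.
type Launcher = (args: string[]) => Promise<BrowserLike>;

const LAUNCH_ARGS = ["--no-sandbox", "--disable-setuid-sandbox"];

class LazyBrowserProvider {
  private browserPromise: Promise<BrowserLike> | null = null;
  private readonly launch: Launcher;
  private readonly options = { waitUntil: "networkidle2", timeout: 30000 };

  constructor(launch: Launcher) {
    this.launch = launch;
  }

  // Launches on first use; later calls reuse the same (memoized) browser.
  private ensureBrowser(): Promise<BrowserLike> {
    if (this.browserPromise === null) {
      this.browserPromise = this.launch(LAUNCH_ARGS);
    }
    return this.browserPromise;
  }

  async getPageContent(url: string): Promise<string> {
    const browser = await this.ensureBrowser();
    const page = await browser.newPage(); // fresh page per URL
    try {
      await page.goto(url, this.options);
      return await page.content(); // rendered HTML after JavaScript ran
    } finally {
      await page.close(); // runs whether navigation succeeded or threw
    }
  }

  async cleanup(): Promise<void> {
    if (this.browserPromise !== null) {
      const browser = await this.browserPromise;
      this.browserPromise = null;
      await browser.close();
    }
  }
}
```

With Puppeteer installed, a launcher along the lines of async (args) => (await import("puppeteer")).default.launch({ args }) would connect this sketch to a real Chromium process. Memoizing the launch promise rather than the browser object is a design choice of this sketch: it keeps two concurrent first calls from starting two browsers.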
Options
| Option | Type | Default | Description |
|---|---|---|---|
| timeout | number | 30000 | Maximum milliseconds to wait for the page to reach the waitUntil state before throwing a timeout error. |
| waitUntil | string | 'networkidle2' | Defines when navigation is considered complete. See the table below. |
waitUntil values
| Value | When navigation completes |
|---|---|
| 'load' | The load event fires: the page and its dependent resources (scripts, stylesheets, images) have loaded. Faster than the network-idle values, but may miss late-rendered content. |
| 'domcontentloaded' | The DOMContentLoaded event fires: the DOM is parsed, but external resources may still be loading. Fires earliest of the four values. |
| 'networkidle0' | No more than 0 network connections for at least 500 ms. Safest for heavily async apps, but slowest. |
| 'networkidle2' | No more than 2 network connections for at least 500 ms. Good balance of completeness and speed; this is the default. |
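One way to picture how the two options and their defaults combine is the sketch below. The names WaitUntil, ProviderOptions, DEFAULT_OPTIONS, and resolveOptions are hypothetical and chosen for illustration; the default values come from the Options table, while the range check on timeout is an addition of this sketch rather than documented library behaviour.

```typescript
// The four values listed in the waitUntil table.
type WaitUntil = "load" | "domcontentloaded" | "networkidle0" | "networkidle2";

interface ProviderOptions {
  timeout: number; // milliseconds
  waitUntil: WaitUntil;
}

// Defaults as documented in the Options table.
const DEFAULT_OPTIONS: ProviderOptions = {
  timeout: 30000,
  waitUntil: "networkidle2",
};

// Merges caller overrides over the defaults.
// The validation step is an assumption added for this example.
function resolveOptions(overrides: Partial<ProviderOptions> = {}): ProviderOptions {
  const merged = { ...DEFAULT_OPTIONS, ...overrides };
  if (!Number.isFinite(merged.timeout) || merged.timeout < 0) {
    throw new RangeError(`timeout must be a non-negative number, got ${merged.timeout}`);
  }
  return merged;
}

// A mostly static page: stop at DOMContentLoaded and fail fast.
const fastOptions = resolveOptions({ waitUntil: "domcontentloaded", timeout: 10000 });

// A heavily asynchronous app: wait for the network to go fully quiet.
const thoroughOptions = resolveOptions({ waitUntil: "networkidle0" });
```

Typing waitUntil as a union of the four string literals lets the compiler reject misspelled values before any browser is launched.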