Spinney: Observable Web Scraper for Node.js

Spinney is a TypeScript-first web scraping library that turns a single URL into a live stream of crawled pages. Built on top of RxJS, it models each scrape as an Observable — you subscribe, receive discovered URLs and parsed content as events, then unsubscribe when done. Spinney automatically reads and respects robots.txt rules, follows XML sitemaps, deduplicates visited links, and retries failed requests with exponential backoff.

Installation

Add Spinney to your project with npm, yarn, or pnpm in seconds.

Quickstart

Build your first scraper and start streaming crawled URLs in under 5 minutes.

API Reference

Full reference for the Spinney class, constructor options, and all public methods.

Guides

Learn how Spinney handles robots.txt, sitemaps, error handling, and configuration.

Why Spinney?

Spinney removes the boilerplate from web crawling by giving you a reactive, event-driven interface instead of nested callbacks. Feed it a URL and subscribe to its output: every crawled page emits a next event, parse events fire per element, and complete fires when the entire site graph is exhausted.
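To make the event sequence concrete, here is a minimal sketch of the next/complete contract over an in-memory link graph. It does not use Spinney or RxJS: the Observer type, LinkGraph type, and crawlGraph function are hypothetical stand-ins, and the real library may order or batch events differently.

```typescript
// Hypothetical stand-in types for illustration; not Spinney's real API.
type Observer = {
  next: (url: string) => void;
  complete: () => void;
};

type LinkGraph = Record<string, string[]>;

// Walks an in-memory link graph breadth-first, calling next() once per page
// and complete() once no unvisited pages remain.
function crawlGraph(graph: LinkGraph, start: string, observer: Observer): void {
  const visited = new Set<string>([start]);
  const queue: string[] = [start];

  while (queue.length > 0) {
    const url = queue.shift() as string;
    observer.next(url);
    for (const link of graph[url] ?? []) {
      if (!visited.has(link)) {
        visited.add(link);
        queue.push(link);
      }
    }
  }
  observer.complete();
}

// A tiny made-up site: keys are pages, values are the links found on them.
const site: LinkGraph = {
  '/': ['/about', '/blog'],
  '/about': ['/'],
  '/blog': ['/blog/post-1', '/about'],
  '/blog/post-1': [],
};

const events: string[] = [];
crawlGraph(site, '/', {
  next: (url) => events.push(`next ${url}`),
  complete: () => events.push('complete'),
});
console.log(events);
```

Each page appears in exactly one next event even though /about is linked twice, and complete arrives last.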

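Observable API

Each scrape is an RxJS Observable: subscribe to receive crawled URLs and parsed content, and unsubscribe to stop.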

robots.txt Aware

Automatically fetches and enforces Disallow rules before crawling begins.
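As a rough illustration of what enforcing Disallow rules involves, the sketch below parses the rules that apply to all user agents and checks a path against them by prefix. It is a simplified assumption, not Spinney's implementation: parseDisallowRules and isPathAllowed are hypothetical names, and wildcards, Allow rules, and agent-specific groups are ignored.

```typescript
// Collects Disallow path prefixes from groups addressed to all agents ("*").
// Simplification: each User-agent line starts a new group.
function parseDisallowRules(robotsTxt: string): string[] {
  const rules: string[] = [];
  let inWildcardGroup = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop trailing comments
    const separator = line.indexOf(':');
    if (separator === -1) continue; // blank or malformed line

    const field = line.slice(0, separator).trim().toLowerCase();
    const value = line.slice(separator + 1).trim();

    if (field === 'user-agent') {
      inWildcardGroup = value === '*';
    } else if (field === 'disallow' && inWildcardGroup && value !== '') {
      rules.push(value);
    }
  }
  return rules;
}

// A path is allowed when no Disallow prefix matches its start.
function isPathAllowed(path: string, disallowRules: string[]): boolean {
  return !disallowRules.some((prefix) => path.startsWith(prefix));
}

const robotsTxt = `
User-agent: *
Disallow: /admin/
Disallow: /tmp/   # scratch space

User-agent: otherbot
Disallow: /
`;

const disallowRules = parseDisallowRules(robotsTxt);
console.log(disallowRules);
console.log(isPathAllowed('/admin/users', disallowRules));
console.log(isPathAllowed('/blog/', disallowRules));
```

A crawler following this approach would run the check before queueing each URL, so disallowed pages are never requested.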

Sitemap Support

Detects XML sitemaps and uses them as the crawl seed list when available.
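For illustration, turning a sitemap into a seed list can be as small as the sketch below. It assumes a plain urlset document and uses a regular expression instead of a real XML parser; extractSitemapUrls is a hypothetical helper, not part of Spinney.

```typescript
// Pulls every <loc> value out of a sitemap document.
// A regex is enough for this sketch; production code should use an XML parser.
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  let match: RegExpExecArray | null;

  while ((match = locPattern.exec(xml)) !== null) {
    // Sitemaps escape "&" as "&amp;", so undo that before using the URL.
    urls.push(match[1].replace(/&amp;/g, '&'));
  }
  return urls;
}

const sitemapXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc> https://example.com/search?q=a&amp;page=2 </loc></url>
</urlset>`;

const seedUrls = extractSitemapUrls(sitemapXml);
console.log(seedUrls);
```

The resulting array can serve as the initial crawl queue in place of the single start URL.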

Auto Retry

Failed requests retry up to 5 times with incrementally increasing timeouts.
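One way to picture the retry behavior is the sketch below. The limit of 5 retries comes from the description above. The doubling schedule follows the exponential backoff mentioned in the introduction, and the 100 ms base delay is an assumption. backoffDelays and withRetry are hypothetical helpers, and Spinney's actual timing may differ.

```typescript
// Delay before each retry: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelays(maxRetries: number, baseMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => baseMs * 2 ** attempt);
}

// Runs task(), retrying on rejection until it succeeds or retries run out.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 5,
  baseMs = 100, // assumed base delay, not taken from Spinney
): Promise<T> {
  const delays = backoffDelays(maxRetries, baseMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // retries exhausted, surface the error
      await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}

const retrySchedule = backoffDelays(5, 100);
console.log(retrySchedule);
```

With these assumed values a request that keeps failing is attempted six times in total (one initial try plus five retries) before the error reaches the subscriber.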

Deduplication

Tracks every visited URL in a Set — no page is fetched or emitted twice.
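A Set-based visited list might look like the following sketch. Resolving relative links and stripping the fragment before the lookup are assumptions added here so that equivalent links collapse to one key; the source only states that a Set is used. VisitedUrls and markIfNew are hypothetical names.

```typescript
class VisitedUrls {
  private readonly seen = new Set<string>();

  // Returns true the first time a URL is offered and false on every repeat.
  markIfNew(href: string, base: string): boolean {
    const url = new URL(href, base); // resolve relative links against the page
    url.hash = ''; // fragments point into the same page
    const key = url.toString();
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }
}

const visitedUrls = new VisitedUrls();
const pageUrl = 'https://example.com/';
const foundLinks = ['/about', '/about#team', 'https://example.com/about', '/blog'];

// Keep only links that have not been seen before.
const freshLinks = foundLinks.filter((href) => visitedUrls.markIfNew(href, pageUrl));
console.log(freshLinks);
```

Only /about and /blog survive the filter, because the other two entries resolve to a URL that is already in the Set.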

TypeScript First

Full type definitions ship with the package. No @types package needed.

Quick Look

quickstart.ts

import Spinney from 'spinney';

const scraper = new Spinney('https://example.com/');

const subscription = scraper.subscribe({
  next(url) {
    console.log('Crawled:', url);
  },
  onattribute(name, value) {
    if (name === 'href') console.log('Found link:', value);
  },
  error(err) {
    console.error('Scrape error:', err);
  },
  complete() {
    console.log('Crawl complete');
  },
});

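// Stop at any time. In a real script, call this later (for example from a
// timer or inside next()); calling it right after subscribe(), as shown here,
// would likely cancel the crawl before any page is emitted.
subscription.unsubscribe();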

Install the package

Add Spinney to your project using your preferred package manager.

Create a Spinney instance

Pass your target URL to new Spinney(). Optionally provide Options and Axios config.

Subscribe to events

Call .subscribe() with next, error, and complete callbacks, plus HTML parser callbacks such as onattribute.

Unsubscribe when done

Hold the returned Subscription and call .unsubscribe() to stop crawling at any time.

