BCycle Map is built entirely on Cloudflare primitives: three Workers handle polling, HTTP serving, and daily smoke testing; Cloudflare KV gives the live map sub-second reads of the latest station snapshot; Cloudflare R2 stores a growing parquet archive for historical analysis; GitHub Actions runs the compaction job (every 3 hours) that seals KV buffers into R2 parquet; and a React + Vite frontend renders everything in the browser. No dedicated servers, no managed databases, no egress fees.
System Overview
Hot Path vs. Cold Path
The architecture is deliberately split into two asymmetric read paths based on what each view needs from the data.

Hot Path — Live Map
The live map needs the current state of every station, as fresh as possible, served in milliseconds. KV is the right store for exactly this pattern: one key, one value, one fast read.

The read-api Worker responds with Cache-Control: max-age=60 so the Cloudflare edge caches responses between frontend polling cycles, keeping per-request KV reads to a minimum.
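As a rough sketch of the hot path, the snippet below reads a single key and attaches the cache header. The KVLike interface, the buildCurrentResponse and handleCurrent helpers, and the 404 behaviour are assumptions made for illustration; only the system:<id>:latest key pattern and the max-age=60 value come from this page.

```typescript
// Minimal stand-in for a KV binding; only get() is needed here (assumed shape).
interface KVLike {
  get(key: string): Promise<string | null>;
}

// Plain description of an HTTP response, so the sketch runs outside a Worker.
interface SimpleResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Pure helper: turn the raw KV value into a response description.
function buildCurrentResponse(raw: string | null): SimpleResponse {
  if (raw === null) {
    // Hypothetical behaviour: no snapshot has been written for this system yet.
    return {
      status: 404,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ error: "no snapshot" }),
    };
  }
  return {
    status: 200,
    headers: {
      "Content-Type": "application/json",
      // Lets the edge reuse the response between frontend polling cycles.
      "Cache-Control": "max-age=60",
    },
    body: raw,
  };
}

// One key, one read: system:<id>:latest.
async function handleCurrent(kv: KVLike, systemId: string): Promise<SimpleResponse> {
  const raw = await kv.get(`system:${systemId}:latest`);
  return buildCurrentResponse(raw);
}
```

Keeping the response construction in a pure function makes the caching behaviour easy to unit test without a real KV namespace.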
Cold Path — Explore View
The Explore view needs weeks or months of data for trend analysis. Sending that through a Worker API would be slow and expensive. Instead, the browser fetches parquet files from R2 directly and runs SQL queries locally using DuckDB-WASM, with no server-side database at all.

Typical Profiles — Station Details View
The Station Details view shows a per-station typical availability chart (bikes by hour of day, optionally split by day of week). These profiles are too expensive to compute on-the-fly from raw parquet on every request, so the compute-popularity GitHub Action pre-computes them and writes one JSON file per station to R2. The read-api Worker serves them through a dedicated endpoint, with responses cached for five minutes (Cache-Control: max-age=300).
The day-of-week filter activates only once a system has at least 21 days of history (daysCovered >= 21). Below that threshold, all-days averages are shown because per-day-of-week samples are too sparse to be meaningful.

Three Workers
Each Worker has a dedicated wrangler*.toml config, its own name, and its own trigger type. They share the same KV namespace binding and R2 bucket binding so they can read and write each other’s data.
| Worker | Config file | Trigger | Job |
|---|---|---|---|
| bcycle-map-poller | wrangler.toml | Cron every 5 minutes (*/5 * * * *) | Fetch GBFS feeds → normalize() → write KV latest + KV buffer |
| bcycle-map-read-api | wrangler.read-api.toml | HTTP (fetch handler) | Serve KV snapshots, R2 activity logs, parquet partition lists, trip inference, analytics |
| bcycle-map-smoke | wrangler.smoke.toml | Daily cron at 09:00 UTC (0 9 * * *) | Fetch the real GBFS feed, run normalize(), file a GitHub Issue if the shape check fails |
The smoke Worker is a canary for upstream schema changes. If BCycle silently changes their GBFS payload shape, normalize() will throw, the smoke Worker catches it, and a labeled GitHub Issue is filed automatically, before the prod poller silently drops cycles.

Storage
Cloudflare KV
KV is the hot-path store. Two key patterns are maintained per system:

| Key pattern | Contents | Written by | Read by |
|---|---|---|---|
| system:<id>:latest | Full KVValue JSON: system, snapshot_ts, all station snapshots, max_bikes_ever, recent24h sparkline data, last_total_changed_ts | Poller (every 5 min) | Read-api Worker (live map) |
| system:<id>:buffer:<YYYY-MM-DD-HH> | Array of BufferEntry objects: lightweight per-tick records (station IDs + availability counts only) accumulated throughout the hour | Poller (every 5 min, append) | GitHub Actions compaction (read → seal → delete) |
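To make the two key patterns concrete, here is a minimal sketch of how such keys could be built. The helper names latestKey and bufferKey are hypothetical, and bucketing the buffer by UTC hour is an assumption (the page only states UTC explicitly for the parquet partitions).

```typescript
// Hypothetical helpers; only the key patterns themselves come from the table above.
function latestKey(systemId: string): string {
  return `system:${systemId}:latest`;
}

// Buffer keys follow system:<id>:buffer:<YYYY-MM-DD-HH>.
// Assumption: the hour bucket is expressed in UTC.
function bufferKey(systemId: string, at: Date): string {
  // toISOString() is always UTC, e.g. "2026-05-13T14:07:00.000Z".
  const iso = at.toISOString();
  const day = iso.slice(0, 10); // "2026-05-13"
  const hour = iso.slice(11, 13); // "14"
  return `system:${systemId}:buffer:${day}-${hour}`;
}
```

Every poll within the same hour maps to the same buffer key, which is what lets the poller append ticks to one value until compaction seals it.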
Cloudflare R2
R2 is the cold-path store. Objects are organized into two categories.

Parquet partitions (written by GitHub Actions compaction): files are stored at hive-partitioned paths keyed by date and hour. For example, gbfs/bcycle_santabarbara/station_status/dt=2026-05-13/14.parquet holds all snapshots from the 14:00 UTC hour on 2026-05-13. Each parquet file contains flattened station rows with the columns snapshot_ts, station_id, num_bikes_available, num_docks_available, bikes_electric, bikes_classic, bikes_smart, is_installed, is_renting, is_returning, and last_reported.
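The partition layout can be illustrated with a small path builder and parser. This is a sketch only: the PartitionRef type and both function names are hypothetical, and the layout is inferred from the single example path given above.

```typescript
// Hypothetical description of one hourly partition.
interface PartitionRef {
  systemId: string;
  date: string; // "YYYY-MM-DD"
  hour: string; // "00".."23", UTC
}

// Build the object key for a partition, following the example path's layout.
function partitionPath(ref: PartitionRef): string {
  return `gbfs/${ref.systemId}/station_status/dt=${ref.date}/${ref.hour}.parquet`;
}

const PARTITION_RE =
  /^gbfs\/([^/]+)\/station_status\/dt=(\d{4}-\d{2}-\d{2})\/(\d{2})\.parquet$/;

// Recover the system, date, and hour from a key; returns null for other objects.
function parsePartitionPath(path: string): PartitionRef | null {
  const m = PARTITION_RE.exec(path);
  if (m === null) return null;
  return { systemId: m[1], date: m[2], hour: m[3] };
}
```

A parser like this lets a client select only the partitions for a date range before fetching any parquet data.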
Operational objects (written by the poller and compute scripts):
| R2 key | Contents | Written by |
|---|---|---|
| gbfs/<id>/activity.json | ActivityLog: departure/arrival events and inferred trips, capped to the 50 most recent entries | Poller |
| gbfs/<id>/travel-times.json | Station-to-station travel-time matrix used for greedy trip inference | compute-routes npm script (run via GitHub Action) |
| gbfs/<id>/typicals/<station_id>.json | Pre-computed typical availability profiles (by hour, optionally by day-of-week) | compute-popularity npm script (run via GitHub Action) |
| gbfs/systems-index.json | System metadata list served by GET /api/systems | corridors GitHub Action |
Compaction
The intra-hour KV buffer is compact by design: each BufferEntry stores only the dynamic fields (availability counts and flags), not the full station metadata. On each run (every 3 hours), the GitHub Actions compaction workflow:
- Lists all system:<id>:buffer:<YYYY-MM-DD-HH> keys older than the current hour
- Reads each buffer from KV via the Cloudflare KV REST API
- Joins the dynamic buffer entries against the latest station metadata to reconstruct full rows
- Encodes rows as columnar parquet using parquet-wasm + apache-arrow in Node
- Writes the sealed parquet file to R2 at the hive-partitioned path
- Deletes the KV buffer key
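Two of these steps, selecting sealable buffer keys and joining entries against station metadata, can be sketched as pure functions. The shapes below are assumptions: the real BufferEntry and metadata types are not shown on this page, and the name field stands in for whatever static metadata the real join adds. The KV REST calls, parquet encoding, and R2 upload are deliberately left out.

```typescript
// Assumed shapes for illustration; the project's real types may differ.
interface StationTick {
  station_id: string;
  num_bikes_available: number;
  num_docks_available: number;
}
interface BufferEntry {
  snapshot_ts: number;
  stations: StationTick[];
}
interface StationMeta {
  station_id: string;
  name: string; // hypothetical static field
}
interface Row extends StationTick {
  snapshot_ts: number;
  name: string;
}

// Keep only buffer keys whose hour is strictly before the current UTC hour.
function sealableKeys(keys: string[], now: Date): string[] {
  // "2026-05-13T14:05:00.000Z" -> "2026-05-13-14"
  const currentHour = now.toISOString().slice(0, 13).replace("T", "-");
  return keys.filter((key) => {
    const hour = key.split(":buffer:")[1];
    // Zero-padded fixed-width timestamps compare correctly as strings.
    return hour !== undefined && hour < currentHour;
  });
}

// Join dynamic ticks against static metadata to reconstruct full rows.
function joinRows(entries: BufferEntry[], meta: StationMeta[]): Row[] {
  const byId = new Map<string, StationMeta>();
  for (const m of meta) byId.set(m.station_id, m);

  const rows: Row[] = [];
  for (const entry of entries) {
    for (const tick of entry.stations) {
      const m = byId.get(tick.station_id);
      if (m === undefined) continue; // unknown station: skipped in this sketch
      rows.push({ snapshot_ts: entry.snapshot_ts, name: m.name, ...tick });
    }
  }
  return rows;
}
```

Excluding the current hour matters because the poller is still appending to that buffer; sealing it early would split one hour across two files.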
Compaction intentionally runs in GitHub Actions, not in a Worker. The parquet-wasm + apache-arrow dependency bundle exceeds the Cloudflare Workers 1 MiB script size limit. Running compaction in GitHub Actions sidesteps this constraint without requiring a Workers Paid plan, and GitHub Actions free-tier minutes are more than sufficient for the 3-hour cadence (5 */3 * * * cron).

The normalize() Anti-Corruption Layer
All GBFS version-specific parsing lives inside normalize() in src/shared/normalize.ts. The three entry points — normalizeStationInformation(), normalizeStationStatus(), and normalizeSystemInformation() — accept raw GBFS JSON and return the project’s internal typed shapes (StationStatic[], StationDynamic[], SystemInfo). Everything downstream — the KV writer, the parquet encoder, the frontend, the tests — works exclusively on these internal types.
This means:
- Adding GBFS v2.x support later means adding normalizers inside normalize.ts. Nothing else changes.
- The smoke Worker catches upstream schema changes by running normalize() against the live feed daily.
- Test fixtures are captured real GBFS responses. The unit tests exercise normalize() directly against those fixtures, giving high confidence that the anti-corruption layer holds.
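The idea of an anti-corruption layer can be illustrated with a heavily simplified normalizer. This is not the code in src/shared/normalize.ts: the function name normalizeStationStatusSketch and the reduced StationDynamic shape are assumptions, and only a few standard GBFS station_status fields are handled.

```typescript
// Reduced internal shape; the real StationDynamic type may carry more fields.
interface StationDynamic {
  station_id: string;
  num_bikes_available: number;
  num_docks_available: number;
  is_renting: boolean;
}

// Hypothetical simplified normalizer for a GBFS station_status payload.
// Throws on unexpected shapes so that a caller (such as a smoke test) can react.
function normalizeStationStatusSketch(raw: unknown): StationDynamic[] {
  const stations = (raw as { data?: { stations?: unknown } } | null)?.data?.stations;
  if (!Array.isArray(stations)) {
    throw new Error("station_status: data.stations is not an array");
  }
  return stations.map((item: unknown, i: number): StationDynamic => {
    const s = (item ?? {}) as Record<string, unknown>;
    const { station_id, num_bikes_available, num_docks_available, is_renting } = s;
    if (
      typeof station_id !== "string" ||
      typeof num_bikes_available !== "number" ||
      typeof num_docks_available !== "number"
    ) {
      throw new Error(`station_status: unexpected station shape at index ${i}`);
    }
    return {
      station_id,
      num_bikes_available,
      num_docks_available,
      // Older GBFS feeds encode booleans as 1/0, newer ones as true/false.
      is_renting: is_renting === true || is_renting === 1,
    };
  });
}
```

Because version quirks such as the boolean encoding are absorbed here, downstream code only ever sees one typed shape.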
Poll Cadence and Free-Tier Budget
The poller cron expression is */5 * * * *, that is, every 5 minutes. This cadence is intentional: at 288 polls per day, it leaves headroom for wrangler tail sessions and the occasional manual workflow trigger without ever threatening the free-tier daily cap for a single active system.
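The budget arithmetic can be checked with a few lines. The figures here are assumptions rather than values stated on this page: two KV writes per poll (one for the latest key, one buffer append) and a daily KV write cap of 1,000, which should be verified against Cloudflare's current free-tier limits.

```typescript
// KV writes issued per day by the poller for a given cadence.
function dailyKvWrites(intervalMinutes: number, writesPerPoll: number, systems: number): number {
  const pollsPerDay = (24 * 60) / intervalMinutes;
  return pollsPerDay * writesPerPoll * systems;
}

// Assumption: one write to the latest key plus one buffer append per poll.
const WRITES_PER_POLL = 2;
// Assumption: free-tier KV daily write cap; check the current Cloudflare limits.
const ASSUMED_DAILY_WRITE_CAP = 1000;

const used = dailyKvWrites(5, WRITES_PER_POLL, 1); // 288 polls * 2 writes = 576
const headroom = ASSUMED_DAILY_WRITE_CAP - used; // 424 under the assumed cap
```

Under the same assumptions, a second active system would need 1,152 writes per day, which may explain why the budget is framed around a single active system.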
Related Documentation
API Reference
Every endpoint the read-api Worker exposes: /current, /activity, /trips, /snapshots, /partitions, and more.

Managing Systems
How to add, enable, or disable a GBFS system in systems.json and what downstream jobs need to re-run.

Compaction Pipeline
Deep-dive into the GitHub Actions workflow that seals KV buffers into hive-partitioned R2 parquet.

Quickstart
Clone, test, and deploy BCycle Map from scratch in a single guided walkthrough.