BCycle Map offloads everything that would exceed Workers’ 1 MiB script-size limit or its 10 ms CPU budget to GitHub Actions. That includes sealing KV snapshot buffers into R2 parquet, computing analytics rollups (leaderboards, popularity, route cache, travel-time matrix, typical availability profiles), deploying Workers on push to main, and backfilling historical data. This page lists every workflow, its schedule, what it does, and the repository secrets it needs.

Workflow reference

| Workflow | File | Schedule / Trigger | What it does |
| --- | --- | --- | --- |
| parquet-compact | compact.yml | Every 3 h at :05 | Seals finished KV hour-buffers into R2 parquet partitions |
| deploy-workers | deploy-workers.yml | Push to main / manual dispatch | Deploys changed Workers via Wrangler (per-worker change detection) |
| corridors | corridors.yml | Daily 09:42 UTC / manual | Computes per-system corridor artifacts and writes systems-index.json to R2 |
| leaderboards | leaderboards.yml | Daily 08:17 UTC / manual | Scans full parquet archive and writes top-station and top-route leaderboards |
| popularity | popularity.yml | Every 4 h / manual | Computes station popularity rollup from R2 parquet |
| routes | routes.yml | Daily 14:30 UTC / manual | Runs route-cache check; optionally calls Google Maps to compute/refresh GeoJSON routes |
| travel-times | travel-times.yml | Daily 14:15 UTC / manual | Runs travel-time matrix check; optionally calls Google Maps to compute/refresh the matrix |
| typicals | typicals.yml | Every 6 h at :25 / manual | Computes typical hourly availability profiles per station |
| backfill-activity | backfill-activity.yml | Manual only | Replays compacted R2 parquet through activity-detection logic to backfill the activity log |
| backfill-first-seen | backfill-first-seen.yml | Manual only | One-shot: backdates first_seen_ts from the earliest parquet partition so legacy stations don’t show the “NEW” badge |
| backfill-recent24h | backfill-recent24h.yml | Manual only | One-shot: populates the rolling 24 h bikes_min/bikes_max window in KV from the last 23 hours of parquet partitions |
| deploy-pages | deploy-pages.yml | Push to main / manual dispatch | Builds the Vite frontend and deploys it to GitHub Pages |
| migrate-activity-to-r2 | migrate-activity-to-r2.yml | Manual only | One-shot: copies the activity log JSON from KV to R2 |
| seed-test-trip | seed-test-trip.yml | Manual only | Synthesizes a single test Trip in the activity log paired with an existing poller arrival |

Workflow details

parquet-compact

The compaction workflow is the bridge between the live KV buffer and the durable R2 parquet archive. The poller writes one system:<id>:buffer:<yyyy-mm-dd-hh> KV key per system per UTC hour. When that hour ends (plus a 5-minute grace period), scripts/compact.ts reads each finished buffer via the KV REST API, converts the JSON snapshots to a columnar parquet file using parquet-wasm + apache-arrow, and uploads it to R2 at gbfs/<system_id>/station_status/dt=<yyyy-mm-dd>/<hh>.parquet. The KV buffer key is then deleted to reclaim write quota.
compact.yml runs every 3 hours rather than every hour. GitHub Actions scheduled crons are best-effort and can be delayed or skipped under load, so the schedule cannot be relied on to fire at any particular hour. The compact script is idempotent and seals every finished hour in a single pass, so each run also picks up any hours that earlier runs missed, without data loss.
Trigger compaction manually at any time:
gh workflow run compact.yml
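The key layout and grace period described above can be sketched as follows. This is an illustrative sketch, not the contents of scripts/compact.ts: the function names (parseBufferKey, isSealable, r2ObjectKey) and the BufferKey type are hypothetical, and only the key formats and the 5-minute grace period come from this page.

```typescript
// Hypothetical helpers; only the key formats and grace period are from the docs.
const HOUR_MS = 60 * 60 * 1000;
const GRACE_MS = 5 * 60 * 1000;

interface BufferKey {
  systemId: string;
  date: string; // yyyy-mm-dd (UTC)
  hour: string; // hh (UTC)
}

// Parses a KV key of the form system:<id>:buffer:<yyyy-mm-dd-hh>.
function parseBufferKey(key: string): BufferKey | null {
  const m = /^system:(.+):buffer:(\d{4}-\d{2}-\d{2})-(\d{2})$/.exec(key);
  if (!m) return null;
  return { systemId: m[1], date: m[2], hour: m[3] };
}

// A buffer is sealable once its UTC hour has ended and the grace period has passed.
function isSealable(k: BufferKey, nowMs: number): boolean {
  const hourStartMs = Date.parse(`${k.date}T${k.hour}:00:00Z`);
  return nowMs >= hourStartMs + HOUR_MS + GRACE_MS;
}

// Builds the R2 object key for the sealed parquet partition.
function r2ObjectKey(k: BufferKey): string {
  return `gbfs/${k.systemId}/station_status/dt=${k.date}/${k.hour}.parquet`;
}
```

Because sealability depends only on the key and the current time, a run that starts hours late would still select every finished buffer that remains in KV, which is what lets a 3-hour cadence absorb skipped runs.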

deploy-workers

See the Workers deployment page for full details on how per-Worker change detection works and how to target a specific Worker from the dispatch input.

corridors

Runs scripts/compute-corridors.ts, which reads station geometry from R2 parquet, groups stations into geographic corridors, and writes the results plus a cross-system gbfs/systems-index.json to R2. The read API’s GET /api/systems endpoint serves this file directly, so the network picker in the frontend reflects it on the next page load.
Run corridors manually after adding or disabling a system in systems.json:
gh workflow run corridors.yml

leaderboards

Runs scripts/compute-leaderboards.ts daily at 08:17 UTC, offset from the 4-hourly popularity run so the two jobs do not contend for the same R2 read bandwidth. It scans the full parquet archive (all-time) to compute top-N stations and routes, then writes the results to R2. The job timeout is 45 minutes, which accommodates a full year of hourly partitions on a cold runner cache.
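As a rough illustration of a top-N rollup, the sketch below counts trips per station and per route and keeps the N largest. The Trip shape, the choice to rank stations by departures, and the function names are assumptions made for the example; the actual ranking logic in scripts/compute-leaderboards.ts may differ.

```typescript
// Hypothetical trip record; field names are assumptions for this sketch.
interface Trip {
  fromStation: string;
  toStation: string;
}

// Returns the n highest counts, breaking ties alphabetically for stable output.
function topN(counts: Map<string, number>, n: number): Array<[string, number]> {
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n);
}

// Assumption: stations are ranked by departures, routes by origin->destination pair.
function computeLeaderboards(trips: Trip[], n: number) {
  const stations = new Map<string, number>();
  const routes = new Map<string, number>();
  for (const t of trips) {
    stations.set(t.fromStation, (stations.get(t.fromStation) ?? 0) + 1);
    const route = `${t.fromStation}->${t.toStation}`;
    routes.set(route, (routes.get(route) ?? 0) + 1);
  }
  return { topStations: topN(stations, n), topRoutes: topN(routes, n) };
}
```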

popularity

Runs scripts/compute-popularity.ts every 4 hours. It computes a per-station popularity score (recent trip frequency) and writes a rollup to R2 that powers the /route average-trip-duration badge in the frontend.

routes

Runs scripts/compute-routes.ts daily at 14:30 UTC, in check mode by default. Check mode is a free diff (it makes no Google Maps calls) that detects whether any stations have been added, moved, or removed since the last compute. If changes are found, the workflow files (or updates) a GitHub Issue labelled routes prompting a manual recompute. Manual dispatch accepts a mode input:
| Mode | What it does |
| --- | --- |
| check | Free diff only: detects changes and files an issue |
| compute | Incremental Google Maps calls for changed station pairs only |
| compute-full | Full rebuild of the GeoJSON route cache for all pairs |
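A check-mode diff of this kind could look roughly like the sketch below, which compares the station list from the last compute with the current one. The Station shape, the diffStations name, and the coordinate tolerance are assumptions for illustration; how scripts/compute-routes.ts actually decides that a station has moved is not specified on this page.

```typescript
// Hypothetical station shape for this sketch.
interface Station {
  id: string;
  lat: number;
  lon: number;
}

interface StationDiff {
  added: string[];
  removed: string[];
  moved: string[];
}

// Compares two station lists by id; a station counts as "moved" when either
// coordinate changes by more than epsilon degrees (an assumed tolerance).
function diffStations(prev: Station[], curr: Station[], epsilon = 1e-5): StationDiff {
  const prevById = new Map<string, Station>();
  for (const s of prev) prevById.set(s.id, s);
  const currIds = new Set<string>();
  const diff: StationDiff = { added: [], removed: [], moved: [] };

  for (const s of curr) {
    currIds.add(s.id);
    const old = prevById.get(s.id);
    if (!old) {
      diff.added.push(s.id);
    } else if (Math.abs(old.lat - s.lat) > epsilon || Math.abs(old.lon - s.lon) > epsilon) {
      diff.moved.push(s.id);
    }
  }
  for (const s of prev) {
    if (!currIds.has(s.id)) diff.removed.push(s.id);
  }
  return diff;
}
```

Under this sketch, an incremental compute would only need to refresh station pairs that involve an id listed in added or moved, and drop pairs that involve an id in removed.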

travel-times

Runs scripts/compute-travel-times.ts daily at 14:15 UTC — 15 minutes before routes so the two jobs don’t overlap on R2 reads. Same check / compute / compute-full mode pattern as routes. On station changes, it files or updates a travel-times-labelled issue.

typicals

Runs scripts/compute-typicals.ts every 6 hours at :25 past the hour (offset from the compact run at :05 and the popularity run on the hour). It reads the R2 parquet archive and computes, per station, a typical hourly availability profile broken down by day-of-week once enough history exists (≥ 21 days). The read API’s GET /api/systems/:id/stations/:sid/recent endpoint reads these pre-computed profiles directly, keeping the request within the 10 ms CPU budget.
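One way to picture a typical profile is a 7 x 24 grid of averages, indexed by day-of-week and hour. The sketch below assumes the mean is the "typical" value, that buckets use UTC, and that "enough history" means at least 21 distinct UTC days of samples; the Sample shape and function name are hypothetical, and scripts/compute-typicals.ts may use a different statistic or time zone.

```typescript
// Hypothetical sample shape: ts is Unix seconds (UTC), bikes is bikes available.
interface Sample {
  ts: number;
  bikes: number;
}

const MIN_HISTORY_DAYS = 21;

// Returns profile[dayOfWeek][hour] = mean bikes available (NaN for empty buckets),
// or null when the samples span fewer than MIN_HISTORY_DAYS distinct UTC days.
function typicalProfile(samples: Sample[]): number[][] | null {
  const days = new Set<number>();
  for (const s of samples) days.add(Math.floor(s.ts / 86400));
  if (days.size < MIN_HISTORY_DAYS) return null;

  const sum = Array.from({ length: 7 }, () => new Array<number>(24).fill(0));
  const count = Array.from({ length: 7 }, () => new Array<number>(24).fill(0));
  for (const s of samples) {
    const d = new Date(s.ts * 1000);
    sum[d.getUTCDay()][d.getUTCHours()] += s.bikes;
    count[d.getUTCDay()][d.getUTCHours()] += 1;
  }
  return sum.map((row, dow) =>
    row.map((total, hour) => (count[dow][hour] > 0 ? total / count[dow][hour] : NaN)),
  );
}
```

Precomputing a grid like this means the read path only has to look up one row for the requested day, which is consistent with keeping the request inside the Worker CPU budget.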

backfill-activity

A manual-only, idempotent workflow that replays the last N hours of compacted parquet through the same event-detection and trip-pairing logic the live poller uses, then writes the results back to the KV activity log. Accepts an hours_back input (default 3). Re-running is safe.
gh workflow run backfill-activity.yml -f hours_back=6

backfill-first-seen

A one-shot, manual-only helper that scans the earliest available parquet partition to find original station records, then backdates each station’s first_seen_ts in KV so that stations present from the beginning do not display the “NEW” badge. Re-running is safe — it overwrites KV with the same values.
gh workflow run backfill-first-seen.yml

backfill-recent24h

A one-shot, manual-only helper that populates the rolling 24 h bikes_min/bikes_max window in KV by reading the last 23 hours of compacted parquet partitions. Useful after a cold start or gap in compaction. Re-running overwrites the KV window with freshly computed values.
gh workflow run backfill-recent24h.yml
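Conceptually, the rolling window reduces per-hour extremes to a single min/max pair. The sketch below assumes each compacted hour has already been summarized into an HourlyStat; that type, the rollingWindow name, and the exact window boundaries are assumptions, while bikes_min and bikes_max are the field names used on this page.

```typescript
// Hypothetical per-hour summary derived from one parquet partition.
interface HourlyStat {
  hourStartMs: number; // start of the UTC hour, in epoch milliseconds
  bikesMin: number;
  bikesMax: number;
}

// Reduces the hours that fall inside the window to one bikes_min/bikes_max pair.
// Returns null when no hour falls inside the window.
function rollingWindow(
  hours: HourlyStat[],
  nowMs: number,
  windowHours = 24,
): { bikes_min: number; bikes_max: number } | null {
  const cutoffMs = nowMs - windowHours * 60 * 60 * 1000;
  const recent = hours.filter((h) => h.hourStartMs >= cutoffMs && h.hourStartMs <= nowMs);
  if (recent.length === 0) return null;
  return {
    bikes_min: Math.min(...recent.map((h) => h.bikesMin)),
    bikes_max: Math.max(...recent.map((h) => h.bikesMax)),
  };
}
```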

deploy-pages

Builds the Vite frontend with VITE_API_BASE pointed at the deployed read-API Worker and deploys the resulting dist/ directory to GitHub Pages. Triggers automatically on pushes to main that touch frontend files, or can be dispatched manually. Uses a public/CNAME check to determine the base path (/ for custom domains, /bcycle-map/ for the default GitHub Pages URL).
gh workflow run deploy-pages.yml

migrate-activity-to-r2

A one-shot, manual-only migration that copies the activity log JSON object from KV to R2. The KV value is left in place for manual cleanup. Idempotent — re-running overwrites the R2 object with whatever is currently in KV.
gh workflow run migrate-activity-to-r2.yml

seed-test-trip

A manual-only helper that synthesizes a single test Trip record in the activity log, pairing it with an arrival the poller has already captured. Useful for verifying the activity-log pipeline end-to-end without waiting for a real trip to be detected.
gh workflow run seed-test-trip.yml

Required GitHub repository secrets

All workflows that access Cloudflare resources or write to R2 require the following secrets. Set them in Settings → Secrets and variables → Actions:
| Secret | How to obtain it |
| --- | --- |
| CF_ACCOUNT_ID | npx wrangler whoami, which prints your account ID |
| CLOUDFLARE_API_TOKEN | Cloudflare → My Profile → API Tokens → Edit Cloudflare Workers template |
| CF_KV_API_TOKEN | Cloudflare → My Profile → API Tokens → Edit Cloudflare Workers KV Storage template |
| CF_KV_NAMESPACE_ID | The production KV namespace id captured when running npx wrangler kv namespace create GBFS_KV |
| R2_ACCESS_KEY_ID | Cloudflare → R2 → Manage R2 API Tokens → Create Token (Object Read & Write on bcycle-map-archive) |
| R2_SECRET_ACCESS_KEY | Shown once when the R2 token is created; copy it immediately |
| R2_BUCKET | bcycle-map-archive |
R2_SECRET_ACCESS_KEY is shown only once at token creation time. If you navigate away without copying it, you must rotate the token and update the secret.
CLOUDFLARE_API_TOKEN is used only by deploy-workers.yml. CF_KV_API_TOKEN is separate and scoped to KV read/write, used by compact.yml, backfill-activity.yml, and leaderboards.yml. This separation limits the blast radius if either token is compromised.
The routes and travel-times workflows also use GOOGLE_MAPS_API_KEY when running in compute or compute-full mode. This secret is optional for check mode and is not required for any other workflow.
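A script run from these workflows could fail fast when a secret is missing, rather than partway through an R2 or KV call. The sketch below is one possible preflight check; the missingSecrets helper and the choice of which secrets a given script requires are assumptions, and only the secret names come from the table above.

```typescript
// Assumed default set for a script that touches both KV and R2; adjust per workflow.
const KV_AND_R2_SECRETS = [
  "CF_ACCOUNT_ID",
  "CF_KV_API_TOKEN",
  "CF_KV_NAMESPACE_ID",
  "R2_ACCESS_KEY_ID",
  "R2_SECRET_ACCESS_KEY",
  "R2_BUCKET",
] as const;

// Returns the names of required variables that are unset or blank in env.
function missingSecrets(
  env: Record<string, string | undefined>,
  required: readonly string[] = KV_AND_R2_SECRETS,
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node script this would typically be called as missingSecrets(process.env) at startup, exiting with an error that lists the missing names. For routes and travel-times in compute or compute-full mode, GOOGLE_MAPS_API_KEY could be appended to the required list.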
