BCycle Map offloads everything that would exceed Workers’ 1 MiB script-size limit or its 10 ms CPU budget to GitHub Actions. That includes sealing KV snapshot buffers into R2 parquet, computing analytics rollups (leaderboards, popularity, route cache, travel-time matrix, typical availability profiles), deploying Workers on push to main, and backfilling historical data. This page lists every workflow, its schedule, what it does, and the repository secrets it needs.
Workflow reference
| Workflow | File | Schedule / Trigger | What it does |
|---|---|---|---|
| parquet-compact | compact.yml | Every 3 h at :05 / manual | Seals finished KV hour-buffers into R2 parquet partitions |
| deploy-workers | deploy-workers.yml | Push to main / manual dispatch | Deploys changed Workers via Wrangler (per-worker change detection) |
| corridors | corridors.yml | Daily 09:42 UTC / manual | Computes per-system corridor artifacts and writes systems-index.json to R2 |
| leaderboards | leaderboards.yml | Daily 08:17 UTC / manual | Scans full parquet archive and writes top-station and top-route leaderboards |
| popularity | popularity.yml | Every 4 h / manual | Computes station popularity rollup from R2 parquet |
| routes | routes.yml | Daily 14:30 UTC / manual | Runs route-cache check; optionally calls Google Maps to compute/refresh GeoJSON routes |
| travel-times | travel-times.yml | Daily 14:15 UTC / manual | Runs travel-time matrix check; optionally calls Google Maps to compute/refresh the matrix |
| typicals | typicals.yml | Every 6 h at :25 / manual | Computes typical hourly availability profiles per station |
| backfill-activity | backfill-activity.yml | Manual only | Replays compacted R2 parquet through activity-detection logic to backfill the activity log |
| backfill-first-seen | backfill-first-seen.yml | Manual only | One-shot: backdates first_seen_ts from the earliest parquet partition so legacy stations don’t show the “NEW” badge |
| backfill-recent24h | backfill-recent24h.yml | Manual only | One-shot: populates the rolling 24 h bikes_min/bikes_max window in KV from the last 23 hours of parquet partitions |
| deploy-pages | deploy-pages.yml | Push to main / manual dispatch | Builds the Vite frontend and deploys it to GitHub Pages |
| migrate-activity-to-r2 | migrate-activity-to-r2.yml | Manual only | One-shot: copies the activity log JSON from KV to R2 |
| seed-test-trip | seed-test-trip.yml | Manual only | Synthesizes a single test Trip in the activity log paired with an existing poller arrival |
Workflow details
parquet-compact
The compaction workflow is the bridge between the live KV buffer and the durable R2 parquet archive. The poller writes one system:<id>:buffer:<yyyy-mm-dd-hh> KV key per system per UTC hour. When that hour ends (plus a 5-minute grace period), scripts/compact.ts reads each finished buffer via the KV REST API, converts the JSON snapshots to a columnar parquet file using parquet-wasm + apache-arrow, and uploads it to R2 at gbfs/<system_id>/station_status/dt=<yyyy-mm-dd>/<hh>.parquet. The KV buffer key is then deleted to reclaim write quota.
compact.yml runs every 3 hours rather than every hour because GitHub Actions scheduled crons are best-effort and may be skipped under load. The compact script is idempotent and seals every finished hour in a single pass, so each run also picks up any hours that a skipped run left behind, without data loss.
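The naming scheme and the grace-period rule described above can be sketched as a few pure helpers. This is a minimal illustration, not the contents of scripts/compact.ts: the function names, the `demo` system id, and the exact comparison used for the grace period are assumptions; only the key format, the R2 path format, and the 5-minute grace period come from the text.

```typescript
// Hypothetical helpers; only the key/path formats and the 5-minute grace
// period are taken from the documentation above.
const HOUR_MS = 60 * 60 * 1000;
const GRACE_MS = 5 * 60 * 1000;

const pad2 = (n: number): string => String(n).padStart(2, "0");

// Split a UTC hour start into the date and hour strings used in keys and paths.
function utcParts(hourStart: Date): { date: string; hh: string } {
  const date = [
    String(hourStart.getUTCFullYear()),
    pad2(hourStart.getUTCMonth() + 1),
    pad2(hourStart.getUTCDate()),
  ].join("-");
  return { date, hh: pad2(hourStart.getUTCHours()) };
}

// KV key the poller writes for one system and one UTC hour.
function bufferKey(systemId: string, hourStart: Date): string {
  const { date, hh } = utcParts(hourStart);
  return `system:${systemId}:buffer:${date}-${hh}`;
}

// R2 object path the sealed parquet file is uploaded to.
function parquetPath(systemId: string, hourStart: Date): string {
  const { date, hh } = utcParts(hourStart);
  return `gbfs/${systemId}/station_status/dt=${date}/${hh}.parquet`;
}

// Assumed rule: an hour can be sealed once it has ended and the grace
// period has elapsed.
function isSealable(hourStart: Date, now: Date): boolean {
  return now.getTime() >= hourStart.getTime() + HOUR_MS + GRACE_MS;
}
```

For example, for the hour starting 2025-01-05 07:00 UTC and a system id of `demo`, `bufferKey` yields `system:demo:buffer:2025-01-05-07` and `parquetPath` yields `gbfs/demo/station_status/dt=2025-01-05/07.parquet`.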
Trigger compaction manually at any time:
gh workflow run compact.yml
deploy-workers
See the Workers deployment page for full details on how per-Worker change detection works and how to target a specific Worker from the dispatch input.
corridors
Runs scripts/compute-corridors.ts, which reads station geometry from R2 parquet, groups stations into geographic corridors, and writes the results plus a cross-system gbfs/systems-index.json to R2. The read API’s GET /api/systems endpoint serves this file directly, so the network picker in the frontend reflects it on the next page load.
Run corridors manually after adding or disabling a system in systems.json:
gh workflow run corridors.yml
leaderboards
Runs scripts/compute-leaderboards.ts daily at 08:17 UTC — offset from the 4-hourly popularity run to avoid contending for the same R2 read bandwidth. It scans the full parquet archive (all-time) to compute top-N stations and routes, then writes the results to R2. Timeout is set to 45 minutes to accommodate a full year of hourly partitions on a cold runner cache.
popularity
Runs scripts/compute-popularity.ts every 4 hours. It computes a per-station popularity score (recent trip frequency) and writes a rollup to R2 that powers the /route average-trip-duration badge in the frontend.
routes
Runs scripts/compute-routes.ts daily at 14:30 UTC in check mode by default. Check mode is a free diff (it makes no Google Maps calls) that detects whether any stations have been added, moved, or removed since the last compute. If changes are found, the workflow files (or updates) a GitHub Issue labelled routes prompting a manual recompute.
Manual dispatch accepts a mode input:
| Mode | What it does |
|---|---|
| check | Free diff only: detects changes, files an issue |
| compute | Incremental Google Maps calls for changed station pairs only |
| compute-full | Full rebuild of the GeoJSON route cache for all pairs |
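The diff that check mode performs can be pictured roughly as follows. This is a sketch under stated assumptions rather than the logic in scripts/compute-routes.ts: the `Station` shape, the `diffStations` name, and the coordinate tolerance `epsilon` used to decide that a station has moved are all hypothetical.

```typescript
// Hypothetical station shape; the real script may track more fields.
interface Station {
  id: string;
  lat: number;
  lon: number;
}

interface StationDiff {
  added: string[];
  removed: string[];
  moved: string[];
}

// Compare the station list from the last compute with the current one.
// `epsilon` (in degrees) is an assumed tolerance for treating a station as moved.
function diffStations(
  previous: Station[],
  current: Station[],
  epsilon = 1e-5,
): StationDiff {
  const prevById = new Map<string, Station>(
    previous.map((s): [string, Station] => [s.id, s]),
  );
  const currentIds = new Set<string>(current.map((s) => s.id));

  const added: string[] = [];
  const moved: string[] = [];
  for (const s of current) {
    const old = prevById.get(s.id);
    if (!old) {
      added.push(s.id);
    } else if (
      Math.abs(old.lat - s.lat) > epsilon ||
      Math.abs(old.lon - s.lon) > epsilon
    ) {
      moved.push(s.id);
    }
  }
  const removed = previous
    .filter((s) => !currentIds.has(s.id))
    .map((s) => s.id);

  return { added, removed, moved };
}

// True when the diff would justify filing or updating an issue.
function hasChanges(d: StationDiff): boolean {
  return d.added.length + d.removed.length + d.moved.length > 0;
}
```

In compute mode, a diff like this would also identify which station pairs need fresh Google Maps calls, namely any pair that involves an added or moved station.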
travel-times
Runs scripts/compute-travel-times.ts daily at 14:15 UTC — 15 minutes before routes so the two jobs don’t overlap on R2 reads. Same check / compute / compute-full mode pattern as routes. On station changes, it files or updates a travel-times-labelled issue.
typicals
Runs scripts/compute-typicals.ts every 6 hours at :25 past the hour (offset from the compact run at :05 and the popularity run on the hour). It reads the R2 parquet archive and computes, per station, a typical hourly availability profile broken down by day-of-week once enough history exists (≥ 21 days). The read API’s GET /api/systems/:id/stations/:sid/recent endpoint reads these pre-computed profiles directly, keeping the request within the 10 ms CPU budget.
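A typical-availability profile of this kind can be sketched as a mean of bikes available per hour of day, with the day-of-week breakdown gated on history length. The `Sample` shape, the use of UTC hours, a plain mean as the "typical" value, and measuring history as the span between the earliest and latest sample are all assumptions made for illustration; only the 21-day threshold comes from the text.

```typescript
// Hypothetical input: one observation of bikes available at a station.
interface Sample {
  ts: number; // Unix seconds (assumed)
  bikes: number;
}

// 24 entries, one per UTC hour; null where no samples exist.
type HourlyProfile = (number | null)[];

interface Typical {
  hourly: HourlyProfile;
  // Index 0-6 = Sunday-Saturday (UTC); null until enough history exists.
  byDayOfWeek: HourlyProfile[] | null;
}

const MIN_HISTORY_DAYS = 21;
const DAY_S = 86400;

// Mean bikes available for each hour of the day.
function meanByHour(samples: Sample[]): HourlyProfile {
  const sums = new Array<number>(24).fill(0);
  const counts = new Array<number>(24).fill(0);
  for (const s of samples) {
    const hour = new Date(s.ts * 1000).getUTCHours();
    sums[hour] += s.bikes;
    counts[hour] += 1;
  }
  return sums.map((sum, h) => (counts[h] > 0 ? sum / counts[h] : null));
}

function typicalProfile(samples: Sample[]): Typical {
  const hourly = meanByHour(samples);
  if (samples.length === 0) return { hourly, byDayOfWeek: null };

  let minTs = samples[0].ts;
  let maxTs = samples[0].ts;
  for (const s of samples) {
    minTs = Math.min(minTs, s.ts);
    maxTs = Math.max(maxTs, s.ts);
  }
  if ((maxTs - minTs) / DAY_S < MIN_HISTORY_DAYS) {
    return { hourly, byDayOfWeek: null };
  }

  const byDayOfWeek = [0, 1, 2, 3, 4, 5, 6].map((dow) =>
    meanByHour(samples.filter((s) => new Date(s.ts * 1000).getUTCDay() === dow)),
  );
  return { hourly, byDayOfWeek };
}
```

Because the profile is a small fixed-size table, the read API only has to look up a precomputed value per request instead of scanning parquet.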
backfill-activity
A manual-only, idempotent workflow that replays the last N hours of compacted parquet through the same event-detection and trip-pairing logic the live poller uses, then writes the results back to the KV activity log. Accepts an hours_back input (default 3). Re-running is safe.
gh workflow run backfill-activity.yml -f hours_back=6
backfill-first-seen
A one-shot, manual-only helper that scans the earliest available parquet partition to find original station records, then backdates each station’s first_seen_ts in KV so that stations present from the beginning do not display the “NEW” badge. Re-running is safe — it overwrites KV with the same values.
gh workflow run backfill-first-seen.yml
backfill-recent24h
A one-shot, manual-only helper that populates the rolling 24 h bikes_min/bikes_max window in KV by reading the last 23 hours of compacted parquet partitions. Useful after a cold start or gap in compaction. Re-running overwrites the KV window with freshly computed values.
gh workflow run backfill-recent24h.yml
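The window this helper rebuilds amounts to a per-station minimum and maximum over recent snapshots. The following is a minimal sketch of that reduction, assuming a hypothetical `Snapshot` shape and a caller-supplied window length; how the real workflow reads parquet and writes KV is not shown.

```typescript
// Hypothetical row shape read back from compacted parquet.
interface Snapshot {
  stationId: string;
  ts: number; // Unix seconds (assumed)
  bikes: number;
}

interface BikeRange {
  bikes_min: number;
  bikes_max: number;
}

// Reduce snapshots inside [now - windowHours, now] to a min/max per station.
function rollingRange(
  snapshots: Snapshot[],
  now: number,
  windowHours: number,
): Map<string, BikeRange> {
  const cutoff = now - windowHours * 3600;
  const ranges = new Map<string, BikeRange>();
  for (const s of snapshots) {
    if (s.ts < cutoff || s.ts > now) continue; // outside the window
    const range = ranges.get(s.stationId);
    if (!range) {
      ranges.set(s.stationId, { bikes_min: s.bikes, bikes_max: s.bikes });
    } else {
      range.bikes_min = Math.min(range.bikes_min, s.bikes);
      range.bikes_max = Math.max(range.bikes_max, s.bikes);
    }
  }
  return ranges;
}
```

Since the result depends only on the snapshots currently in the window, recomputing it gives the same overwrite-on-rerun behaviour described above.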
deploy-pages
Builds the Vite frontend with VITE_API_BASE pointed at the deployed read-API Worker and deploys the resulting dist/ directory to GitHub Pages. Triggers automatically on pushes to main that touch frontend files, or can be dispatched manually. Uses a public/CNAME check to determine the base path (/ for custom domains, /bcycle-map/ for the default GitHub Pages URL).
gh workflow run deploy-pages.yml
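The base-path decision can be expressed as a one-line rule. This is only a sketch: `resolveBasePath` is a hypothetical name, and the way the workflow actually performs the public/CNAME check is not shown here.

```typescript
// Hypothetical helper: pick the base path from whether public/CNAME exists.
// A custom domain serves from the root; the default GitHub Pages URL serves
// from a subpath named after the repository.
function resolveBasePath(hasCname: boolean, repoName = "bcycle-map"): string {
  return hasCname ? "/" : `/${repoName}/`;
}
```

In a Node build script, the `hasCname` flag could come from `existsSync("public/CNAME")` in `node:fs`, and the result would be passed to Vite as its `base` option.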
migrate-activity-to-r2
A one-shot, manual-only migration that copies the activity log JSON object from KV to R2. The KV value is left in place for manual cleanup. Idempotent — re-running overwrites the R2 object with whatever is currently in KV.
gh workflow run migrate-activity-to-r2.yml
seed-test-trip
A manual-only helper that synthesizes a single test Trip record in the activity log, pairing it with an arrival the poller has already captured. Useful for verifying the activity-log pipeline end-to-end without waiting for a real trip to be detected.
gh workflow run seed-test-trip.yml
Required GitHub repository secrets
Workflows that access Cloudflare resources or write to R2 draw on the secrets below; as noted after the table, not every workflow uses every secret. Set them in Settings → Secrets and variables → Actions:
| Secret | How to obtain it |
|---|---|
| CF_ACCOUNT_ID | npx wrangler whoami (prints your account ID) |
| CLOUDFLARE_API_TOKEN | Cloudflare → My Profile → API Tokens → Edit Cloudflare Workers template |
| CF_KV_API_TOKEN | Cloudflare → My Profile → API Tokens → Edit Cloudflare Workers KV Storage template |
| CF_KV_NAMESPACE_ID | The production KV namespace id captured when running npx wrangler kv namespace create GBFS_KV |
| R2_ACCESS_KEY_ID | Cloudflare → R2 → Manage R2 API Tokens → Create Token (Object Read & Write on bcycle-map-archive) |
| R2_SECRET_ACCESS_KEY | Shown once when the R2 token is created; copy it immediately |
| R2_BUCKET | bcycle-map-archive |
R2_SECRET_ACCESS_KEY is shown only once at token creation time. If you navigate away without copying it, you must rotate the token and update the secret.
CLOUDFLARE_API_TOKEN is used only by deploy-workers.yml. CF_KV_API_TOKEN is separate and scoped to KV read/write, used by compact.yml, backfill-activity.yml, and leaderboards.yml. This separation limits the blast radius if either token is compromised.
The routes and travel-times workflows also use GOOGLE_MAPS_API_KEY when running in compute or compute-full mode. This secret is optional for check mode and is not required for any other workflow.
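A script could fail fast when a mode needs a secret that is not set. The sketch below illustrates the rule stated above for GOOGLE_MAPS_API_KEY; the `missingSecrets` helper and its `baseline` parameter are hypothetical and are not taken from the repository's scripts.

```typescript
type Mode = "check" | "compute" | "compute-full";

// Return the names of required secrets that are absent or empty.
// `baseline` lists secrets the caller needs in every mode (caller-supplied,
// since the set differs per workflow).
function missingSecrets(
  mode: Mode,
  env: Record<string, string | undefined>,
  baseline: string[] = [],
): string[] {
  // GOOGLE_MAPS_API_KEY is only needed when Google Maps is actually called.
  const required =
    mode === "check" ? baseline : [...baseline, "GOOGLE_MAPS_API_KEY"];
  return required.filter((name) => !env[name]);
}
```

In a Node script this might be called as `missingSecrets(mode, process.env, ["R2_BUCKET"])`, exiting with an error message if the returned list is non-empty.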