Overview
Dubly automatically tracks every click on your short links, capturing detailed analytics data including IP address, user agent, referer, geolocation, browser information, device type, and operating system.
Data Collection Pipeline
Non-blocking Architecture
Analytics collection is designed to never slow down redirects. When a click occurs, the event is pushed to an in-memory buffer and the handler returns immediately.
If the analytics buffer is full, events are silently dropped rather than slowing down the redirect. This ensures short links remain fast even under extreme load.
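The non-blocking push and drop-when-full behavior can be sketched in Go with a buffered channel and a select that has a default branch. This is a minimal illustration, not Dubly's actual code: ClickEvent, Collector, NewCollector, and Push are hypothetical names chosen for the example.

```go
package main

// ClickEvent is a hypothetical, trimmed-down click record used for illustration.
type ClickEvent struct {
	LinkID    int64
	IP        string
	UserAgent string
}

// Collector holds the in-memory buffer. The names here are assumptions,
// not Dubly's actual types.
type Collector struct {
	events chan ClickEvent
}

// NewCollector creates a collector whose buffer holds up to size events.
func NewCollector(size int) *Collector {
	return &Collector{events: make(chan ClickEvent, size)}
}

// Push enqueues an event without blocking. It reports false when the
// buffer is full and the event was dropped.
func (c *Collector) Push(e ClickEvent) bool {
	select {
	case c.events <- e:
		return true
	default:
		// Buffer full: drop the event so the redirect is never delayed.
		return false
	}
}
```

Because the default branch runs whenever the channel send would block, the caller never waits on the buffer, at the cost of losing events under sustained overload.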
Buffering and Flush Behavior
The analytics collector uses a buffered channel configured via environment variables:
- DUBLY_BUFFER_SIZE (default: 50,000): Maximum events to buffer in memory
- DUBLY_FLUSH_INTERVAL (default: 30s): How often to flush buffered events to the database
Metrics Collected
Each click captures the following data:
Request Metadata
- IP Address: Client IP (respects X-Forwarded-For and X-Real-IP headers)
- User Agent: Full user agent string
- Referer: HTTP referer header
- Referer Domain: Extracted hostname from referer URL
- Timestamp: Click time in UTC
Parsed User Agent Data
- Browser: Browser name (e.g., “Chrome”, “Firefox”)
- Browser Version: Full version string
- Operating System: OS name (e.g., “Windows”, “macOS”, “Linux”)
- Device Type: One of desktop, mobile, or bot
Geolocation Data
If GeoIP is configured (DUBLY_GEOIP_PATH environment variable), Dubly enriches each click with:
- Country: ISO country code
- City: City name
- Region: State or region
- Latitude/Longitude: Geographic coordinates
Data Enrichment
Raw click events are enriched before insertion using the enrich function in internal/analytics/collector.go:101.
Bot and Datacenter Filtering
Dubly automatically filters out bot traffic and requests from known datacenters to provide accurate human click analytics.
Bot Detection
Bot filtering uses a two-tier approach in internal/analytics/bot.go:
- User agent library detection: Uses the mssola/useragent library’s built-in bot detection
- Signature matching: Checks the user agent against a comprehensive list of bot signatures
  - Generic bot patterns (“bot”, “spider”, “crawl”)
  - Link preview fetchers (WhatsApp, Slack, Telegram, Facebook, Twitter)
  - Search engine bots (Google, Bing)
  - HTTP client libraries (curl, wget, python-requests, Go HTTP client)
  - Headless browsers (PhantomJS, Puppeteer)
  - Security scanners
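The signature-matching tier can be approximated with a case-insensitive substring search. The sketch below covers only that second tier and uses a small illustrative subset of signatures. The list and the isBotUA name are assumptions, not Dubly's actual list or function.

```go
package main

import "strings"

// botSignatures is a small illustrative subset, stored in lowercase.
// It is not Dubly's actual signature list.
var botSignatures = []string{
	"bot", "spider", "crawl", // generic patterns
	"whatsapp", "facebookexternalhit", // link preview fetchers
	"curl", "wget", "python-requests", "go-http-client", // HTTP client libraries
	"phantomjs", "headlesschrome", // headless browsers
}

// isBotUA reports whether the user agent contains any known signature,
// ignoring case.
func isBotUA(ua string) bool {
	lower := strings.ToLower(ua)
	for _, sig := range botSignatures {
		if strings.Contains(lower, sig) {
			return true
		}
	}
	return false
}
```

Generic patterns such as “bot” already match many named crawlers (for example Googlebot or Slackbot), so the more specific entries mainly cover agents that do not identify themselves with those words.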
Datacenter Filtering
Dubly maintains an in-memory list of datacenter IP ranges and threat blocklists, automatically refreshed every 24 hours from multiple sources:
Cloud Provider Ranges:
- AWS, Google Cloud, Azure (via main datacenter list)
- Oracle Cloud Infrastructure (OCI)
- DigitalOcean
- Vultr
- Akamai CDN
- Scaleway
Threat Blocklists:
- Tor exit nodes
- IPsum threat feed
- GreenSnow blocklist
Datacenter filtering happens before analytics collection. Requests from datacenters and known threat IPs are still redirected, but clicks are not recorded.
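A membership test against a list of CIDR ranges, together with the rule that flagged traffic is redirected but not recorded, might look like the following. RangeList, NewRangeList, Contains, and shouldRecord are hypothetical names. The sketch uses documentation-only address ranges in its usage and does not show the 24-hour refresh.

```go
package main

import "net/netip"

// RangeList holds parsed CIDR prefixes. A linear scan keeps the sketch
// short; a list with many thousands of ranges would likely want a faster
// lookup structure.
type RangeList struct {
	prefixes []netip.Prefix
}

// NewRangeList parses CIDR strings such as "192.0.2.0/24".
func NewRangeList(cidrs []string) (*RangeList, error) {
	rl := &RangeList{}
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, err
		}
		rl.prefixes = append(rl.prefixes, p.Masked())
	}
	return rl, nil
}

// Contains reports whether ip falls inside any listed range.
// Unparseable input is treated as not listed.
func (rl *RangeList) Contains(ip string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false
	}
	// Unmap so IPv4-mapped IPv6 addresses match IPv4 prefixes.
	addr = addr.Unmap()
	for _, p := range rl.prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

// shouldRecord expresses the rule: the redirect always happens, but the
// click is recorded only for traffic that is neither bot nor datacenter.
func shouldRecord(isBot, isDatacenter bool) bool {
	return !isBot && !isDatacenter
}
```

A handler built on this would issue the redirect unconditionally and push the click event only when shouldRecord returns true.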
Analytics Collection Logic
The redirect handler checks both bot and datacenter status before recording analytics in internal/handlers/redirect.go:65.
Database Storage
Enriched click events are batch-inserted into the database using a transaction in internal/models/click.go:28.
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| DUBLY_BUFFER_SIZE | 50000 | Maximum click events to buffer in memory |
| DUBLY_FLUSH_INTERVAL | 30s | How often to flush events to database |
| DUBLY_GEOIP_PATH | "" | Path to MaxMind GeoIP database file |
GeoIP Setup
To enable geolocation enrichment:
- Download a MaxMind GeoLite2 or GeoIP2 City database
- Set DUBLY_GEOIP_PATH to the database file path
- Restart Dubly