1 — Cloud-Native by Design
Streaming workloads have highly irregular traffic patterns driven by viral content spikes that cannot be absorbed by statically provisioned infrastructure.
| Attribute | Detail |
|---|---|
| Tradeoffs | Vendor dependency risk; higher operational abstraction reduces fine-grained control. Mitigated by abstracting cloud provider APIs behind internal SDKs. |
| Enforcement | Infrastructure defined exclusively as code (Terraform/Pulumi). No manually provisioned resources permitted in production. IaC policies enforced via CI/CD pipeline gates. |
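The CI/CD gate above can be sketched as a small check over Terraform's JSON plan representation. This is a minimal sketch, assuming the output of `terraform show -json plan.out`, whose `resource_drift` field lists resources whose real state diverged from code (i.e. manually provisioned or modified resources); the function names are illustrative.

```python
# Illustrative CI gate: fail the pipeline when the Terraform plan reports
# resources changed outside of code (drift). Assumes the JSON plan format
# produced by `terraform show -json plan.out`.

def find_drift(plan: dict) -> list[str]:
    """Return addresses of resources that were modified outside of IaC."""
    return [r["address"] for r in plan.get("resource_drift", [])]

def ci_gate(plan: dict) -> bool:
    """True if the plan is clean; False fails the pipeline stage."""
    drifted = find_drift(plan)
    for address in drifted:
        print(f"DRIFT: {address} was modified outside of IaC")
    return not drifted
```

In practice this runs as a pipeline step after `terraform plan`, before any apply is permitted.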
2 — Zero-Trust Security
Streaming platforms are high-value targets for piracy, credential stuffing, and content scraping. Every service-to-service call, every user request, and every admin action must be authenticated and authorised independently — implicit network trust is never assumed.
| Attribute | Detail |
|---|---|
| Tradeoffs | Token validation overhead adds latency. Mitigated by short-lived JWT caching at the API gateway and mTLS between internal services. |
| Enforcement | All API routes require Bearer token validation at the gateway. Istio/Linkerd service mesh enforces mTLS for east-west traffic. No service exposes an unauthenticated internal endpoint. |
3 — Event-Driven Architecture
Media processing, moderation, ML training, and storage tiering are long-running, non-deterministic workflows that cannot be bound to synchronous HTTP request cycles. Event streaming decouples producers from consumers, enabling independent scaling and failure isolation.
| Attribute | Detail |
|---|---|
| Tradeoffs | Eventual consistency introduces complexity in downstream reads. Compensating transactions required for failure handling. Developer cognitive overhead is higher than simple REST. |
| Enforcement | All state transitions (upload complete, transcoding done, moderation decision) must emit domain events to Kafka. Synchronous calls are permitted only for real-time user-facing queries. |
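A domain event emission can be sketched as below. The envelope field names are assumptions for illustration, not a fixed MCSP schema, and the Kafka producer is abstracted behind a `send(topic, key, value)` interface so the sketch stays self-contained.

```python
import json
import time
import uuid

def domain_event(event_type: str, content_id: str, payload: dict) -> dict:
    """Build an event envelope for a state transition (illustrative schema)."""
    return {
        "eventId": str(uuid.uuid4()),   # unique per emission
        "eventType": event_type,        # e.g. "transcoding.done"
        "contentId": content_id,
        "occurredAt": time.time(),
        "payload": payload,
    }

def publish(producer, topic: str, event: dict) -> None:
    # Key by contentId so per-content ordering is preserved within a partition.
    producer.send(topic, key=event["contentId"].encode(),
                  value=json.dumps(event).encode())
```

Keying by `contentId` keeps all events for one piece of content in order on a single partition, which downstream consumers (moderation, tiering) typically rely on.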
4 — Idempotent Workflows
Distributed systems guarantee at-least-once delivery. Transcoding workers, billing processors, and moderation pipelines will receive duplicate messages during retries and failovers. Non-idempotent operations would cause double charges, duplicate transcodes, or duplicate content takedowns.
| Attribute | Detail |
|---|---|
| Tradeoffs | Idempotency keys consume additional storage. Deduplication logic adds latency to message consumption. |
| Enforcement | Every async worker checks an idempotency key (contentId + operationType + originating event timestamp; never processing time, so retries of the same event deduplicate) before executing. Idempotency store: Redis with 24-hour TTL. |
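The idempotency gate can be sketched with an atomic Redis `SET NX EX`, matching the 24-hour TTL in the enforcement row. This is a sketch assuming a client exposing redis-py's `set(..., nx=True, ex=...)` semantics; the key layout is illustrative.

```python
TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL, per the enforcement rule

def idempotency_key(content_id: str, operation: str, event_ts: int) -> str:
    # Use the originating event's timestamp so retries produce the same key.
    return f"idem:{content_id}:{operation}:{event_ts}"

def run_once(redis_client, key: str, work) -> bool:
    """Execute `work` only if this key has not been seen; True if executed."""
    # SET NX atomically claims the key; a falsy return means a duplicate.
    if not redis_client.set(key, "1", nx=True, ex=TTL_SECONDS):
        return False
    work()
    return True
```

Note the claim happens before the work runs; if the worker crashes mid-operation, the key blocks the retry, so production variants typically pair this with a status value or a shorter in-flight lease.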
5 — Multi-Region Readiness
v1 targets Nigeria and the international diaspora. Retrofitting multi-region support onto a monolithic data model post-launch is prohibitively expensive — the architecture must assume multi-region from day one even if initial deployment is single-region.
| Attribute | Detail |
|---|---|
| Tradeoffs | Cross-region replication adds cost and consistency complexity. Active-active conflicts require CRDT or last-write-wins resolution. |
| Enforcement | Data models must document replication strategy at design time. No service may assume a single-region data store. Nigeria residency data is explicitly excluded from cross-region replication by policy. |
6 — Data Sovereignty as a First-Class Constraint
Nigeria Data Residency is a billable product feature and a potential regulatory obligation. Application-layer-only enforcement is bypassable by misconfiguration — residency must be enforced at the infrastructure layer.
| Attribute | Detail |
|---|---|
| Tradeoffs | Residency-isolated storage cannot leverage global CDN edge caching. Residency content is served from a Nigeria origin only. |
| Enforcement | Object storage bucket policy (deny replication outside Nigeria region) enforced by cloud IAM. The Residency Policy Engine maintains its own audit log. Residency decisions are immutable post-upload. |
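The bucket-level deny can be sketched as an AWS-style policy document. This is an illustrative sketch, not the verbatim MCSP policy: the bucket name is assumed, and a real deployment would combine this statement with IAM controls on who may edit the policy itself.

```python
RESIDENCY_BUCKET = "mcsp-residency-ng"  # assumed bucket name

def deny_replication_policy(bucket: str) -> dict:
    """AWS-style bucket policy denying replication actions outright."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyCrossRegionReplication",
            "Effect": "Deny",
            "Principal": "*",
            "Action": [
                "s3:ReplicateObject",             # block object replication
                "s3:ReplicateDelete",             # block delete replication
                "s3:PutReplicationConfiguration"  # block adding replication rules
            ],
            "Resource": [f"arn:aws:s3:::{bucket}",
                         f"arn:aws:s3:::{bucket}/*"],
        }],
    }
```

Denying `s3:PutReplicationConfiguration` means residency cannot be weakened by later configuration, consistent with residency decisions being immutable post-upload.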
7 — Automated Storage Tiering
At 8+ PB of content, manual storage management is operationally infeasible. Hot storage costs 5–10× cold storage per GB. Automating tier transitions based on view frequency directly reduces operating costs without degrading viewer experience.
| Attribute | Detail |
|---|---|
| Tradeoffs | Cold storage retrieval latency (seconds to minutes for archive tiers). Mitigated by pre-warming content when trending detection signals an upcoming spike. |
| Enforcement | Tiering engine runs on a scheduled cadence against configurable view-frequency thresholds. No human intervention permitted for routine transitions. Tier state is tracked in the metadata service — not inferred from storage location. |
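The tier-selection rule can be sketched as a threshold table. The thresholds below are placeholders; per the enforcement row, the real engine reads them from configuration, and the resulting tier is recorded in the metadata service rather than inferred from storage location.

```python
# (min views in trailing 30 days, tier) — ordered hottest first;
# threshold values are illustrative placeholders, not production config.
THRESHOLDS = [
    (1_000, "hot"),
    (50, "warm"),
    (0, "cold"),
]

def select_tier(views_30d: int) -> str:
    """Pick the storage tier for a content item from its view frequency."""
    for min_views, tier in THRESHOLDS:
        if views_30d >= min_views:
            return tier
    return "cold"
```

The scheduled job would compare `select_tier(...)` against the tier recorded in the metadata service and emit a transition event only when they differ.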
8 — ML-First Personalisation
Engagement and retention are directly correlated with recommendation quality at this scale. A rule-based recommendation system is insufficient — ML models must be the primary mechanism for feed generation, related content, and notification targeting.
| Attribute | Detail |
|---|---|
| Tradeoffs | ML pipelines introduce training lag. Mitigated by real-time feature updates and a hybrid ranking approach (ML + recency signal). |
| Enforcement | The home feed API must not return results from a static rule engine without ML ranking. A/B testing framework gates all model deployments. Model performance metrics are dashboard-visible. |
9 — Separation of Control Plane and Data Plane
Admin operations must not share infrastructure with the viewer streaming data plane. A misconfigured admin operation must never impact playback availability.
| Attribute | Detail |
|---|---|
| Tradeoffs | Increases service count and operational surface area; requires separate deployment pipelines. |
| Enforcement | Admin dashboard, moderation tools, and platform management APIs deploy to a separate cluster (control plane). Cross-plane calls are made via well-defined, rate-limited internal APIs only. |
10 — Observability by Default
Streaming platform failures — buffering spikes, transcoding backlogs, DRM license failures — must be detected within seconds, not discovered via user complaints. Observability must be built into every service from inception.
| Attribute | Detail |
|---|---|
| Tradeoffs | Telemetry data volume at 50,000 RPS is substantial. Sampling and aggregation strategies required to manage cost. |
| Enforcement | Every service must emit structured logs, RED metrics (Rate, Errors, Duration), and distributed traces. Services without telemetry fail deployment pipeline health checks. SLOs defined per service with automated alerting thresholds. |
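The RED metrics requirement can be sketched with a minimal in-process aggregator. This is illustrative only; a production service would export these through a metrics library (e.g. a Prometheus client) rather than hold them in memory.

```python
from collections import defaultdict

class RedMetrics:
    """Minimal RED (Rate, Errors, Duration) aggregator, per route."""

    def __init__(self):
        self.requests = defaultdict(int)    # Rate: request count per route
        self.errors = defaultdict(int)      # Errors per route
        self.durations = defaultdict(list)  # Duration samples per route

    def observe(self, route: str, duration_s: float, ok: bool) -> None:
        self.requests[route] += 1
        if not ok:
            self.errors[route] += 1
        self.durations[route].append(duration_s)

    def error_rate(self, route: str) -> float:
        total = self.requests[route]
        return self.errors[route] / total if total else 0.0
```

An SLO alert would then fire when `error_rate` (or a duration percentile over `durations`) crosses the service's defined threshold; at 50,000 RPS, per the Tradeoffs row, samples would be aggregated or sampled rather than stored individually.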
11 — Graceful Degradation
When dependent services fail (recommendation engine, moderation pipeline, payment service), the core playback experience must remain functional. Users must be able to watch content during partial system failures.
| Attribute | Detail |
|---|---|
| Tradeoffs | Degraded mode requires fallback logic in every client-facing service, increasing code complexity. |
| Enforcement | Circuit breakers on all inter-service calls. Recommendation service falls back to trending content on failure. Payment service failures return a graceful error and do not block streaming. Feature flags enable runtime degradation control. |
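The circuit-breaker-with-fallback pattern can be sketched as below: after `max_failures` consecutive errors the circuit opens, calls go straight to the fallback (e.g. trending content for the recommendation service), and the dependency is retried after `reset_after` seconds. The thresholds are illustrative, not production values.

```python
import time

class CircuitBreaker:
    """Circuit breaker that routes to a fallback while the dependency is down."""

    def __init__(self, call, fallback, max_failures=3, reset_after=30.0):
        self.call, self.fallback = call, fallback
        self.max_failures, self.reset_after = max_failures, reset_after
        self.failures = 0        # consecutive failures while closed
        self.opened_at = None    # timestamp when the circuit opened

    def __call__(self, *args, now=None):
        now = time.time() if now is None else now
        if self.opened_at is not None:
            if now - self.opened_at < self.reset_after:
                return self.fallback(*args)  # open: skip the dependency entirely
            self.opened_at = None            # half-open: allow one trial call
        try:
            result = self.call(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = now         # trip the circuit
                self.failures = 0
            return self.fallback(*args)
        self.failures = 0                    # success closes the circuit
        return result
```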
12 — Audit Completeness
MCSP handles user data, financial transactions, content rights, and regulatory residency requirements. Every privileged action must be auditable for SOC 2, GDPR, NDPR, and internal governance.
| Attribute | Detail |
|---|---|
| Tradeoffs | Immutable audit logs grow indefinitely; partitioned and tiered to cold storage after 90 days. |
| Enforcement | All admin actions, moderation decisions, residency changes, and financial transactions write to an append-only audit log. The audit log store is append-only for all application principals — no DELETE or UPDATE is permitted. |
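An append-only audit writer can be sketched as follows. The hash chain is an illustrative tamper-evidence technique, not something this principle mandates: each entry carries the hash of the previous entry, so any rewrite of history breaks verification.

```python
import hashlib
import json
import time

class AuditLog:
    """Append-only audit log with a hash chain for tamper evidence (sketch)."""

    def __init__(self):
        self._entries = []  # application principals may only append

    def append(self, actor: str, action: str, detail: dict) -> dict:
        prev = self._entries[-1]["hash"] if self._entries else "genesis"
        body = {"actor": actor, "action": action, "detail": detail,
                "ts": time.time(), "prev": prev}
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        self._entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute every hash; False if any entry was altered or removed."""
        prev = "genesis"
        for e in self._entries:
            if e["prev"] != prev:
                return False
            body = {k: v for k, v in e.items() if k != "hash"}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if digest != e["hash"]:
                return False
            prev = e["hash"]
        return True
```

A durable store (e.g. object storage with write-once retention) would back this in production, with `verify` run by the governance tooling rather than the application.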