
Public Safety Video Analytics at City Scale with Elastic GPU Resources
A public safety case study using pooled GPU resources to reduce response latency and improve utilization across city-wide video systems.
Customer Profile
A municipal public security bureau operates a city-wide video analytics system supporting real-time alerts, case review, and cross-district investigations. Data must remain within its jurisdiction, so compute must move to the data.
The Business Problem
- 8,000+ camera feeds created unpredictable inference demand.
- District silos left GPU capacity unevenly distributed: some districts' GPUs sat idle while others were overloaded.
- During major events, P95 alert latency spiked to 5–7s, reducing operational effectiveness.
Baseline metrics:
| Metric | Baseline |
|---|---|
| P95 alert latency | 5–7s |
| GPU utilization | 22–30% |
| Case review queue time | 20–30 min |
| Annual GPU cost | 100% (baseline) |
TensorFusion Solution
- Cross-district GPU pooling without moving sensitive data.
- Pipeline-parallel inference that splits a model across idle nodes, so they act together as one larger virtual GPU.
- Graceful exit of borrowed workloads when local requests arrive, so each district's own jobs keep priority.
- Policy-based scheduling for emergencies and large events.
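
The placement rules in the list above can be illustrated with a small sketch. This is not TensorFusion's API or implementation: the `GpuPool`, `Gpu`, and `Job` names, the three priority levels, and the order of the placement steps are all assumptions made for illustration. The sketch models only where a job runs; it does not model data locality or pipeline-parallel splitting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical priority levels (lower value = more urgent).
EMERGENCY, NORMAL = 0, 1


@dataclass
class Job:
    name: str
    home: str                 # district that submitted the job
    priority: int = NORMAL


@dataclass
class Gpu:
    district: str             # district that owns this GPU
    job: Optional[Job] = None
    borrowed: bool = False    # True while running another district's job


@dataclass
class GpuPool:
    """Toy cross-district pool: lend idle GPUs, reclaim them for local work."""
    gpus: List[Gpu] = field(default_factory=list)
    requeued: List[Job] = field(default_factory=list)  # jobs that exited gracefully

    @classmethod
    def from_counts(cls, counts: Dict[str, int]) -> "GpuPool":
        return cls(gpus=[Gpu(d) for d, n in counts.items() for _ in range(n)])

    def _place(self, gpu: Gpu, job: Job) -> Gpu:
        if gpu.job is not None:
            # Graceful exit: the displaced job is requeued, not dropped.
            self.requeued.append(gpu.job)
        gpu.job = job
        gpu.borrowed = job.home != gpu.district
        return gpu

    def submit(self, job: Job) -> Optional[Gpu]:
        local = [g for g in self.gpus if g.district == job.home]
        # 1. Prefer an idle GPU in the job's own district.
        for g in local:
            if g.job is None:
                return self._place(g, job)
        # 2. Local requests reclaim GPUs lent to other districts.
        for g in local:
            if g.borrowed:
                return self._place(g, job)
        # 3. Otherwise borrow any idle GPU in the pool.
        for g in self.gpus:
            if g.job is None:
                return self._place(g, job)
        # 4. Emergency policy: may displace borrowed work anywhere.
        if job.priority == EMERGENCY:
            for g in self.gpus:
                if g.borrowed:
                    return self._place(g, job)
        return None  # pool is full; caller would queue the job


pool = GpuPool.from_counts({"north": 1, "south": 1})
pool.submit(Job("review-1", home="north"))          # runs locally in north
lent = pool.submit(Job("review-2", home="north"))   # borrows south's idle GPU
local = pool.submit(Job("alert-1", home="south"))   # south reclaims its GPU
```

In this run, `review-2` first lands on the idle `south` GPU; when `alert-1` arrives from `south`, `review-2` is moved to `pool.requeued` and `alert-1` takes the GPU. A production scheduler would also need to checkpoint or drain the exiting job and account for where the video data lives, which this sketch leaves out.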



