Computer Vision Stack
FastAPI + PyTorch + an analytics dashboard for video and image pipelines
The Computer Vision Stack is the foundation under AppLiaison products that ingest video or image streams and turn them into business signals — retail foot-traffic counts, dwell-time heatmaps, defect detection on assembly lines, perimeter monitoring. It pairs Python where the math lives (FastAPI + PyTorch) with a Next.js dashboard where the operators live, and supports both cloud and on-device inference so latency-sensitive workloads do not have to round-trip frames over the internet.
Architecture
When to choose this stack
- The product input is video or images, not events from a database
- Latency requirements drive on-device inference at the edge
- The operator surface is a live monitoring dashboard, not a CRUD app
- Customers integrate with existing camera infrastructure, not buy new
- The output is business analytics (dwell time, throughput) rather than a chat
What's NOT included
- Custom training of customer-private models (available as a paid engagement)
- Camera hardware sourcing or installation
- Long-term video archival beyond 90 days (S3-IA bucket lifecycle, customer-paid; see the retention sketch after this list)
- Real-time human review of every event — the system is autonomous by design
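The 90-day limit is the one number the retention sketch below takes from this list; the bucket name, key prefix, and the 30-day move to Infrequent Access are illustrative assumptions about how such a lifecycle rule might be configured, not the deployed policy.

```python
# A minimal sketch of a 90-day retention policy on the customer-paid archive
# bucket: footage moves to S3 Infrequent Access after 30 days and is deleted
# at 90. Bucket name, prefix, and the 30-day transition are assumptions.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="customer-video-archive",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-raw-footage-90d",
                "Filter": {"Prefix": "raw-video/"},  # hypothetical key prefix
                "Status": "Enabled",
                "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                "Expiration": {"Days": 90},
            }
        ]
    },
)
```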
How the pieces fit
Video frames flow from RTSP-capable cameras into a FastAPI ingest service. Inference runs in one of two places — on a GPU pool in the cloud for cameras whose latency budget can absorb a 200ms round-trip, or on an edge runtime (NVIDIA Jetson, an Intel NUC with OpenVINO, or an Apple Mac mini with CoreML) for cameras that can’t.
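The cloud ingest path looks roughly like the sketch below: a FastAPI endpoint that accepts a decoded frame and runs a PyTorch detector on it. The route, the X-Camera-Id header, the torchvision model, and the 0.6 confidence threshold are illustrative assumptions, as is the push-style gateway that decodes the RTSP stream and POSTs individual frames; none of this is the production API.

```python
# Minimal sketch of the cloud ingest path, assuming a gateway that decodes the
# RTSP stream and POSTs JPEG frames to this endpoint (requires python-multipart).
import io

import torch
import torchvision
from fastapi import FastAPI, File, Header, UploadFile
from PIL import Image

app = FastAPI()

# Load a pretrained detector once at startup; the production service would
# load the customer's own exported weights instead of the torchvision default.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

to_tensor = torchvision.transforms.ToTensor()


@app.post("/ingest/frame")  # hypothetical route
async def ingest_frame(
    frame: UploadFile = File(...),
    camera_id: str = Header(..., alias="X-Camera-Id"),  # assumed header name
):
    # Decode the pushed frame and run one inference pass.
    image = Image.open(io.BytesIO(await frame.read())).convert("RGB")
    tensor = to_tensor(image).to(device)

    with torch.no_grad():
        detections = model([tensor])[0]

    # Keep only confident detections; 0.6 is an illustrative threshold.
    keep = detections["scores"] > 0.6
    events = [
        {"camera_id": camera_id, "label": int(label), "score": float(score)}
        for label, score in zip(detections["labels"][keep], detections["scores"][keep])
    ]
    # The real service writes these events to Postgres and publishes them to
    # Redis here; returning them keeps the sketch self-contained.
    return {"camera_id": camera_id, "events": events}
```

The edge runtime runs the same loop locally, in effect, against the exported model and ships only the resulting events upstream, which is what keeps latency-sensitive cameras off the internet round-trip.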
Detected events post to Postgres and to a Redis channel; the dashboard subscribes via WebSocket, so a count or alert appears within a second of the event happening. Hourly and daily rollups precompute the analytics most customers actually look at.
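A minimal sketch of that fan-out path, assuming a Redis channel named cv:events and a /ws/events WebSocket endpoint (both names are placeholders): the ingest side publishes each event after the Postgres insert, and the dashboard endpoint relays messages from the channel to connected browsers.

```python
# Sketch of the event fan-out: ingest publishes to a Redis channel, the
# dashboard WebSocket endpoint relays it. Channel and route names are assumed.
import json

import redis.asyncio as redis
from fastapi import FastAPI, WebSocket

app = FastAPI()
r = redis.Redis(host="localhost", port=6379)  # assumed local Redis


async def publish_event(event: dict) -> None:
    # Called by the ingest service once the Postgres insert has succeeded.
    await r.publish("cv:events", json.dumps(event))


@app.websocket("/ws/events")  # hypothetical route
async def stream_events(ws: WebSocket) -> None:
    await ws.accept()
    pubsub = r.pubsub()
    await pubsub.subscribe("cv:events")
    try:
        # Relay each published event to the dashboard as it arrives; this is
        # the hop that keeps counts and alerts under a second end to end.
        async for message in pubsub.listen():
            if message["type"] == "message":
                await ws.send_text(message["data"].decode())
    finally:
        await pubsub.unsubscribe("cv:events")
        await ws.close()
```

Pub/sub keeps ingest decoupled from however many dashboard connections happen to be open, and the hourly and daily rollups can then be ordinary scheduled aggregation queries over the same events table rather than anything the live path has to compute.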
Why these choices
FastAPI + PyTorch over a Node-only stack: the model ecosystem is in Python. A Node-only stack would still end up calling out to a Python inference service, which just adds another network hop.
ONNX export for on-device inference: lets the same trained model run in the cloud (PyTorch) and on the edge (CoreML / TFLite / OpenVINO) without retraining; see the export sketch below.
Two inference locations instead of one: most workloads are happy with cloud inference; the ones that aren’t are deal-breakers. Supporting both at the platform level means we don’t lose the deal over latency.
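The export step behind the ONNX point above is roughly the sketch below, using a small torchvision classifier as a stand-in for the trained detector; the input resolution, file name, output names, and opset are illustrative assumptions.

```python
# Sketch of the single export step that feeds every edge target.
import torch
import torchvision

# Stand-in for the trained detector; the real export starts from the
# customer's fine-tuned PyTorch checkpoint.
model = torchvision.models.mobilenet_v3_small(weights="DEFAULT")
model.eval()

dummy = torch.randn(1, 3, 224, 224)  # assumed input resolution
torch.onnx.export(
    model,
    dummy,
    "detector.onnx",                        # illustrative file name
    input_names=["frames"],
    output_names=["logits"],
    opset_version=17,                       # pick per target runtime
    dynamic_axes={"frames": {0: "batch"}},  # allow variable batch size
)
```

From the exported file, each edge runtime is then handled offline by its own conversion tooling (OpenVINO, CoreML, TFLite), while the cloud GPU pool keeps serving the original PyTorch weights.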