SafeRoomAI – Real-Time Anomaly Detection Platform
Hybrid computer vision pipeline (YOLO + Autoencoder) with reproducible MLOps (DVC + MLflow) and fault‑tolerant streaming.
Problem
Security operations required *stream-level* anomaly detection with resilience to network instability. Traditional single‑model approaches struggled with novel events and produced brittle alerting when feeds dropped.
Solution Overview
- Hybrid inference: supervised object/event detection (YOLO) + unsupervised reconstruction error (Autoencoder).
- Streaming orchestrator controlling decode → preprocess → dual-path scoring → aggregation.
- Fallback mechanism: switches to a cached local video segment or a circular frame buffer when the RTSP feed is unavailable.
- Reproducible, versioned pipeline (code, data, model artifacts).
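The fallback behavior above can be sketched as a small wrapper around the primary reader. This is a minimal sketch, not the project's actual implementation: `ResilientFrameSource` and its `read_primary` callback are hypothetical names, and the real pipeline would decode video rather than pass raw bytes.

```python
from collections import deque
from typing import Callable, Optional

class ResilientFrameSource:
    """Yields frames from a primary RTSP reader; falls back to a circular
    buffer of recently seen frames when the primary read fails."""

    def __init__(self, read_primary: Callable[[], Optional[bytes]], buffer_size: int = 300):
        self._read_primary = read_primary          # returns None on feed loss
        self._buffer = deque(maxlen=buffer_size)   # circular buffer of last N frames
        self.degraded = False                      # True while serving buffered frames

    def next_frame(self) -> Optional[bytes]:
        frame = self._read_primary()
        if frame is not None:
            self.degraded = False
            self._buffer.append(frame)
            return frame
        # Primary feed dropped: replay the oldest buffered frame, if any.
        self.degraded = True
        return self._buffer.popleft() if self._buffer else None
```

The `degraded` flag lets downstream aggregation suppress or annotate alerts raised while the pipeline is replaying buffered frames instead of live video.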
Architecture Highlights
- Modular Stages: Capture, Frame Queue, Detection, Anomaly Scoring, Aggregation, Output Bus.
- Asynchronous Queues: micro-batch frames to keep the GPU saturated while bounding end-to-end latency.
- Inference Modes: YOLO event detection as the primary signal; anomaly-score gating on the fallback path reduces false positives.
- Packaging: BentoML service exporting REST + JSON schema for detection & anomaly endpoints.
- Interfaces: FastAPI edge wrapper adds routing, auth stub & health probes.
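The micro-batching stage can be illustrated with a short coroutine: collect up to `batch_size` frames, but never wait longer than `max_wait` after the first frame arrives. The function name and parameter values here are illustrative, not taken from the codebase.

```python
import asyncio
from typing import Any, List

async def micro_batch(queue: asyncio.Queue, batch_size: int = 8, max_wait: float = 0.02) -> List[Any]:
    """Drain up to `batch_size` frames from the queue, waiting at most
    `max_wait` seconds after the first frame arrives. This lets the GPU
    score several frames per forward pass while bounding added latency."""
    batch = [await queue.get()]  # block until at least one frame is available
    deadline = asyncio.get_running_loop().time() + max_wait
    while len(batch) < batch_size:
        remaining = deadline - asyncio.get_running_loop().time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break
    return batch
```

The deadline is anchored to the first frame, so a slow stream pays at most `max_wait` of batching delay while a busy stream fills full batches immediately.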
Reproducibility & Experimentation
- DVC tracks datasets, augmentation configs, and model checkpoints (hash-pinned lineage).
- MLflow logs parameters, metrics, artifacts, env fingerprint; run IDs embedded in service metadata.
- Deterministic builds: lockfiles + Docker layering; build args capture model/runtime versions.
Security & Quality
- Containers follow the principle of least privilege (non-root user, minimal base image).
- Structured logging (JSON) for ingestion anomalies & inference latency.
- Pluggable auth gateway placeholder (JWT / API key) for future RBAC.
- Static analysis / lint gates (Ruff; mypy optional) with CI integration planned.
Performance & Optimization
- Quantization & selective layer fusion to reduce inference time.
- Frame skip heuristics based on motion / event density thresholds.
- Parallel decode (ffmpeg) pipelined with GPU inference to hide I/O latency.
- Candidates: p95 latency target < 70ms (replace with audited number), sustained FPS scaling across N streams.
Deployment & Operations
- BentoML build → Docker multi-stage image → K8s manifest (readiness & liveness).
- Config via env & mounted model store; hot-reload friendly architecture for model swaps.
- Planned lightweight metrics exporter (Prometheus / custom JSON) for live dashboard integration.
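The readiness and liveness semantics above can be expressed as framework-agnostic check functions, easy to wrap in a FastAPI or BentoML route. The function names, queue limit, and response shape are assumptions for illustration, not the project's actual endpoints.

```python
def readiness(model_loaded: bool, queue_depth: int, max_queue: int = 512) -> tuple[int, dict]:
    """Readiness: the pod should only receive traffic when the model is
    loaded and the frame queue is not saturated. Returns (status, body)."""
    ok = model_loaded and queue_depth < max_queue
    return (200 if ok else 503), {"ready": ok, "queue_depth": queue_depth}

def liveness() -> tuple[int, dict]:
    """Liveness is intentionally shallow: the process can still answer HTTP.
    Deep checks belong in readiness, so a busy pod is drained, not restarted."""
    return 200, {"alive": True}
```

Separating the two checks lets Kubernetes stop routing to a saturated pod (readiness 503) without killing it, reserving restarts for genuinely hung processes.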
Timeline (Condensed)
- Problem: Needed real-time anomaly detection for camera streams with graceful fallback when RTSP feeds become unavailable.
- Design: Dual path (YOLO for object/event semantics + Autoencoder reconstruction error for unsupervised anomaly scoring).
- Reproducibility: DVC pins data/model artifacts; MLflow run metadata & metrics ensure repeatable experiments.
- Deployment: BentoML-packaged service containerized; readiness & liveness probes for K8s orchestration.
- Optimization: Quantization, micro-batching & frame-skip heuristics targeting latency reduction (e.g., ~120ms → ~70ms p95 candidate).
Next Focus: finalize empirical latency benchmarks, introduce an auto-scaling policy tuned to queue depth,
integrate an alert fan‑out channel (webhook / email), and add a subscription feed for near real-time anomaly notifications.