I want a single, turnkey application that watches my CCTV feeds, spots shoplifters in real time, recognises grocery products on the shelves, and keeps a live headcount of customers. The core model must be YOLO, and I need the exact same codebase to build and run on both Windows (desktop with NVIDIA GPU) and a Raspberry Pi 4.

Video sources vary: some cameras stream RTSP over IP, while a few older analog units reach the NVR through a capture card, so the program has to accept either type without manual re-configuration.

For product recognition I care only about groceries; no clothing or electronics labelling is necessary. The model should be trained (or fine-tuned) on the most common supermarket items so false positives stay low even when shelves are crowded.

Key expectations:
• Real-time theft detection logic that raises an event or calls a REST webhook the moment a suspicious removal is spotted
• On-screen bounding boxes and confidence scores for detected grocery items and customers
• Continuous customer counter with hourly CSV/JSON export
• Installers or scripts for Windows 10/11 and Raspberry Pi OS, including all required Python, OpenCV, PyTorch/ONNX, and CUDA (where available) dependencies
• A simple dashboard that shows live feed thumbnails, current customer count, and the last N theft alerts
• Clear instructions on adding new grocery SKUs later

Acceptance will be based on:
1. Smooth 25-30 fps inference on 1080p streams under Windows with GPU, and ≥10 fps on Raspberry Pi using CPU or a USB accelerator.
2. ≤5% false-alarm rate on a provided six-hour test set.
3. Clean, well-commented code plus a README that lets me re-train the YOLO model from scratch.

If parts of the pipeline need specialised libraries (TensorRT, Coral TPU, etc.), feel free to propose them as optional accelerators.
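
To make the "either source type without manual re-configuration" requirement more concrete, here is a rough sketch of what I have in mind. It is only an illustration, not a required design. The function name `resolve_source`, the list of accepted URL schemes, and the sample camera addresses are all my own placeholders. The sketch assumes each camera is described by one string in a config file, and that capture-card inputs appear either as a numeric device index or as a `/dev/videoN` path on Raspberry Pi OS.

```python
import re
from typing import Union
from urllib.parse import urlparse

# URL schemes treated as network streams (assumed list; extend as needed).
NETWORK_SCHEMES = {"rtsp", "rtsps", "http", "https"}

# Linux capture devices usually show up as /dev/videoN (e.g. on Raspberry Pi OS).
_V4L2_DEVICE = re.compile(r"^/dev/video(\d+)$")


def resolve_source(spec: str) -> Union[int, str]:
    """Turn one config entry into a value suitable for cv2.VideoCapture().

    "0", "1", ...          -> int device index (capture card input)
    "/dev/video1"          -> int device index 1
    "rtsp://host/stream"   -> the URL string, unchanged
    Anything else raises ValueError so a typo fails loudly at start-up.
    """
    spec = spec.strip()
    if spec.isdecimal():
        return int(spec)
    match = _V4L2_DEVICE.match(spec)
    if match:
        return int(match.group(1))
    if urlparse(spec).scheme.lower() in NETWORK_SCHEMES:
        return spec
    raise ValueError(f"Unrecognised video source: {spec!r}")


if __name__ == "__main__":
    # Hypothetical camera list; the address below is a placeholder.
    for entry in ["rtsp://192.168.1.20:554/stream1", "0", "/dev/video1"]:
        print(entry, "->", repr(resolve_source(entry)))
```

The returned value would be passed straight to OpenCV's `cv2.VideoCapture(...)`, which accepts either an integer device index or a URL string, so the rest of the pipeline would not need to know which kind of camera it is reading from.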