I need a compact Python-based pipeline that can watch hours of raw footage and automatically carve it into clean rushes. The idea is simple: detect every meaningful scene change or pronounced camera movement, mark the in- and out-points, and hand back an ordered timeline I can drop straight into my editing suite.

My tool stack is already defined: PyTorchVideo for the deep-learning backbone, PyDetect for proven vision utilities, and OpenCV for the faster classical operations. If you prefer to wrap parts of the workflow in ffmpeg or similar, that’s fine as long as the core logic stays in Python and can be called through a clear API endpoint.

Scope
• Analyse full-resolution clips and spot scene boundaries and camera pans/tilts/zooms with high recall.
• Output a JSON file (or an EDL if you’d rather) listing start/end frames, plus an H.264 proxy of each shot so I can skim results quickly.
• Keep the code modular: one module for shot detection, one for camera-motion cues, one for assembly/export.
• Optimise inference so a 30-minute clip processes in roughly real time on a modern GPU.

Acceptance
I’ll pass you a sample reel; if the script returns shot lists that match at least 95% of my manual cuts and labels major camera moves, we’re good.

Deliver the repository, a tidy requirements.txt, and a one-page README that shows a curl / Python call to trigger the workflow.

I’m on a tight schedule, so clear, frequent commits and the ability to iterate quickly will be key. Let me know your availability and how soon you can start.
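
To make the deliverable and the 95% check concrete, here is a rough sketch of how I picture the shot list and the comparison against my manual cuts. It is only an illustration: the function names (`cuts_to_shots`, `cut_recall`), the JSON field names, and the two-frame matching tolerance are placeholders I made up for this post, not fixed requirements, and the actual detection step is deliberately left out.

```python
import json
from bisect import bisect_left


def cuts_to_shots(cut_frames, total_frames):
    """Turn cut positions into inclusive start/end frame ranges.

    Assumption: a cut at frame c means c is the first frame of a new shot.
    """
    inner = sorted({c for c in cut_frames if 0 < c < total_frames})
    bounds = [0] + inner + [total_frames]
    return [
        {"shot": i + 1, "start_frame": start, "end_frame": end - 1}
        for i, (start, end) in enumerate(zip(bounds, bounds[1:]))
    ]


def cut_recall(manual_cuts, detected_cuts, tolerance=2):
    """Fraction of manual cuts with a detected cut within +/- tolerance frames.

    Simplification: matching is not one-to-one, so a single detected cut
    may satisfy two manual cuts that sit very close together.
    """
    if not manual_cuts:
        return 1.0
    detected = sorted(detected_cuts)
    hits = 0
    for m in manual_cuts:
        # First detected cut that is not earlier than the tolerance window.
        i = bisect_left(detected, m - tolerance)
        if i < len(detected) and detected[i] <= m + tolerance:
            hits += 1
    return hits / len(manual_cuts)


if __name__ == "__main__":
    shots = cuts_to_shots([120, 480], total_frames=900)
    print(json.dumps({"shots": shots}, indent=2))
    # Two of the three manual cuts are matched in this toy example.
    print(cut_recall([120, 480, 700], [121, 479]))
```

Under this sketch, a run would pass acceptance when `cut_recall(...)` is at least 0.95 on the sample reel. If you think a different tolerance or a strict one-to-one matching is more sensible, say so in your reply.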