I run an object-detection + depth-estimation pipeline that couples YOLO with MiDaS. The current bottleneck is MiDaS: on outdoor video frames it lags behind real-time requirements. Over the next three days I want to refactor, prune, or re-export the model so its depth map arrives significantly faster without noticeably hurting quality.

Scope
• Focus squarely on MiDaS; YOLO changes are welcome only if they improve overall throughput.
• The target footage is exclusively outdoor scenes (streets, parks, drone footage), so any tuning should exploit the lighting and scale typical of those environments.
• Goal: a measurable reduction in per-frame inference time on an NVIDIA RTX-class GPU (baseline figures will be shared at project start).

Technical context
Python 3.10, PyTorch 2.x, and CUDA 12 are already in place. I am open to TorchScript, ONNX, TensorRT, half-precision, layer fusion, or lightweight architectural tweaks: whatever cuts latency the most while keeping depth accuracy acceptable. Illustrative sketches of two of these options, plus a timing harness, appear at the end of this brief.

Deliverables (all items required for approval)
1. Optimized MiDaS weight file(s) and any auxiliary scripts.
2. A benchmark notebook or script that reproduces pre- vs post-optimization timings (see the harness sketch below).
3. A brief README explaining installation, expected FPS on a reference GPU, and any trade-offs introduced.

All work needs to land within 72 hours of kickoff. Clear, commented code and reproducible results will mark the task complete.
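To ground the half-precision option, here is a minimal inference sketch, assuming the public intel-isl/MiDaS torch.hub entry point and an RTX-class GPU; the "DPT_Hybrid" variant and the 384x384 input shape are illustrative placeholders, not a prescription:

```python
# Minimal FP16 sketch, assuming the intel-isl/MiDaS torch.hub entry point.
# "DPT_Hybrid" and the 1x3x384x384 shape are placeholders; swap in the
# actual checkpoint and frame resolution used by the pipeline.
import torch

device = torch.device("cuda")
midas = torch.hub.load("intel-isl/MiDaS", "DPT_Hybrid").to(device).eval()

frame = torch.rand(1, 3, 384, 384, device=device)  # stand-in for a real frame
with torch.no_grad(), torch.autocast("cuda", dtype=torch.float16):
    depth = midas(frame)  # autocast runs matmuls/convs in half precision
```

In practice autocast alone often trims MiDaS latency noticeably on RTX hardware; depth quality should still be checked against the FP32 baseline on the outdoor footage before this counts toward the goal.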
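If the TensorRT route wins, a common path is to export the model to ONNX first. This is only a sketch under stated assumptions: the midas.onnx filename, opset 17, and the fixed input shape are placeholders, and a dynamic-axes export may be needed if frame sizes vary:

```python
# Hedged ONNX export sketch as a stepping stone to TensorRT.
# "midas.onnx", opset 17, and the fixed 1x3x384x384 shape are assumptions.
import torch

midas = torch.hub.load("intel-isl/MiDaS", "DPT_Hybrid").eval()
dummy = torch.rand(1, 3, 384, 384)
torch.onnx.export(
    midas, dummy, "midas.onnx",
    opset_version=17,
    input_names=["frame"],
    output_names=["depth"],
)
# The resulting graph can then be compiled into an engine, for example with:
#   trtexec --onnx=midas.onnx --fp16 --saveEngine=midas.engine
```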
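For deliverable 2, a timing harness along these lines would make the pre- vs post-optimization comparison reproducible. The `benchmark` helper name, its warmup/iteration counts, and the input shape are illustrative defaults, not fixed requirements:

```python
# Sketch of a CUDA-event timing harness for the pre/post comparison.
# `benchmark`, warmup=10, iters=100, and the 384x384 shape are illustrative.
import torch

def benchmark(model, shape=(1, 3, 384, 384), warmup=10, iters=100):
    x = torch.rand(*shape, device="cuda")
    with torch.no_grad():
        for _ in range(warmup):  # let cuDNN autotuning and caches settle
            model(x)
        torch.cuda.synchronize()
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(iters):
            model(x)
        end.record()
        torch.cuda.synchronize()  # wait for all queued kernels to finish
    ms = start.elapsed_time(end) / iters
    print(f"{ms:.2f} ms/frame ({1000.0 / ms:.1f} FPS)")
    return ms
```

Run it on the same GPU before and after optimization and record both numbers in the README so the trade-offs are auditable.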