I am in the final stretch of a computer-vision project that converts a single 2-D photograph into a fully navigable 3-D scene. About 89% of the pipeline (dataset prep, training loop, and post-processing) is already finished. The missing piece is the generative model that actually predicts depth, completes occluded geometry, and exports the result in a standard 3-D file format.

You will be working on an NVIDIA A6000 cloud instance that I have ready to go, so you can assume plenty of VRAM for large-scale diffusion, NeRF, or implicit-surface models. I am framework-agnostic: PyTorch, TensorFlow, or Keras is all fine, as long as the code is clean and reproducible.

Key goals
• Accept a single RGB image as input
• Infer depth and surface normals, hallucinate hidden geometry, and rebuild a watertight mesh
• Texture the mesh using the original image plus any learned in-painting
• Output a common interchange format (OBJ, FBX, or glTF; pick whichever integrates most easily)

Acceptance criteria
1. A Python script or notebook that runs end-to-end on the A6000 instance.
2. Trained weights or clear training instructions so I can reproduce the results.
3. An example image and its generated scene demonstrating free-orbit camera movement with minimal artifacts.

If you have prior work with NeRF, Instant-NGP, or diffusion-based 3-D synthesis, please mention it; speed and quality are both important here. Let me know what additional resources you need and your estimated turnaround time so we can wrap this project up.
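To make the depth-to-mesh-to-OBJ step concrete, here is a minimal sketch of how a predicted depth map could be back-projected into a triangulated vertex grid and written out in Wavefront OBJ format. This is an illustration, not the deliverable: it assumes a pinhole camera, the focal lengths `fx`/`fy` are placeholder values, and the function names `depth_to_mesh` and `write_obj` are hypothetical rather than part of the existing pipeline.

```python
import numpy as np

def depth_to_mesh(depth, fx=500.0, fy=500.0):
    """Back-project a depth map to 3-D vertices and triangulate the grid.

    depth : (H, W) array of per-pixel depths.
    fx/fy : assumed pinhole focal lengths (placeholders, not calibrated).
    Returns (vertices, faces): vertices is (H*W, 3), faces is (M, 3)
    with 0-based vertex indices.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    cx, cy = w / 2.0, h / 2.0
    # Pinhole back-projection: pixel (u, v) at depth z maps to
    # x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    verts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)

    # Split each grid cell into two triangles; vertex index = row * w + col.
    idx = np.arange(h * w).reshape(h, w)
    a, b = idx[:-1, :-1].ravel(), idx[:-1, 1:].ravel()
    c, d = idx[1:, :-1].ravel(), idx[1:, 1:].ravel()
    faces = np.concatenate([np.stack([a, b, c], axis=1),
                            np.stack([b, d, c], axis=1)])
    return verts, faces

def write_obj(path, verts, faces):
    """Write vertices and triangles as a minimal OBJ (1-based indices)."""
    with open(path, "w") as f:
        for vx, vy, vz in verts:
            f.write(f"v {vx} {vy} {vz}\n")
        for i, j, k in faces:
            f.write(f"f {i + 1} {j + 1} {k + 1}\n")
```

In a real submission I would expect the depth map to come from the trained model and the export to carry UVs and a texture (OBJ+MTL or glTF), but the back-projection and triangulation logic stays the same shape as above.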