AI-automated Long-form YouTube City Guide Generation

Client: AI | Published: 15.12.2025

We are looking for an experienced developer to build a fully automated system from scratch that generates 30–45 minute long-form YouTube city guide videos using AI and cloud video rendering APIs. The system must be reusable and scalable: changing only the city name and script outline should produce a ready-to-upload YouTube video with no manual editing. This is a serious engineering project involving long-form media automation, not a basic script or a manual editing task.

Project Goal

Build an end-to-end automation pipeline that:
- Generates documentary-style long-form scripts
- Converts scripts into high-quality AI voiceover
- Creates moving visuals using image-to-video (live-image motion)
- Renders a single 30–45 minute 1080p MP4 video
- Requires no manual intervention
- Can be reused for multiple cities

Final output must be a YouTube-ready MP4 file.

Required Workflow

1. Input
- City name
- Script outline (.docx)
- Intro and outro audio files (MP3)

2. Script Generation
- Use OpenAI API
- Expand outline into approximately 8,000–10,000 words
- Natural spoken documentary narration
- No meta text, explanations, or sample language

3. Voiceover
- Use ElevenLabs API
- Fixed voice ID
- Handle long scripts using chunking
- Output a single narration MP3

4. Visual Generation
- No stock video search APIs (Unsplash, Pexels, Shutterstock, etc.)
- No local video rendering (ffmpeg, moviepy)
- Use Shotstack image-to-video approach with Ken Burns style motion
- Long clip durations (12–15 seconds)
- Slow cinematic motion effects only
- Visuals must feel like moving video, not static slides

5. Video Rendering
- Use Shotstack REST API
- Single render only (intro, main content, and outro in one timeline)
- Correct timing with no overlaps
- 1080p output
- Credit-efficient timeline design

6. Asset Storage
- Use Cloudinary for asset hosting
- All assets must be public, static, and directly downloadable

7. Automation Requirements
- Python-based implementation preferred
- Robust error handling
- Render status polling
- Automatic download of final video
- Easy reuse for 10–20+ cities

Required Skills

Must have:
- Python automation experience
- REST API integration
- Shotstack or similar cloud video rendering API
- OpenAI API
- ElevenLabs API
- Cloudinary
- Long-form media pipeline experience

Nice to have:
- Faceless YouTube automation experience
- Credit and cost optimization
- Async or background job processing

Not Suitable For
- Manual video editing
- Canva or CapCut based workflows
- Short-form video generation
- Beginner-level scripting

Deliverables
- Clean and documented source code
- Config-driven setup for easy city changes
- Setup and deployment instructions
- Working demo video (30+ minutes)
- Scalable architecture
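
To illustrate the kind of logic we have in mind for the chunking in step 3 and the non-overlapping timing in step 5, here is a minimal sketch. It is not a required design. The function names `chunk_script` and `plan_clips`, the `Clip` type, the 2,500-character chunk size, and the 14-second target clip length are all assumptions made for illustration; the actual chunk limit and the timeline field names should be checked against the ElevenLabs and Shotstack documentation. The sketch makes no API calls.

```python
import re
from dataclasses import dataclass


def chunk_script(text: str, max_chars: int = 2500) -> list[str]:
    """Split narration at sentence boundaries into chunks of at most max_chars.

    max_chars is an assumed value, not a documented provider limit.
    A single sentence longer than max_chars is kept whole.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


@dataclass
class Clip:
    start: float   # seconds from the beginning of the timeline
    length: float  # clip duration in seconds


def plan_clips(narration_s: float, intro_s: float,
               target_clip_s: float = 14.0) -> list[Clip]:
    """Lay out back-to-back image clips that cover the narration exactly.

    Clips begin right after the intro and each one starts where the
    previous one ends, so there are no gaps and no overlaps.
    """
    if narration_s <= 0:
        return []
    count = max(1, round(narration_s / target_clip_s))
    length = narration_s / count  # equal lengths close to the target
    return [Clip(start=intro_s + i * length, length=length)
            for i in range(count)]


if __name__ == "__main__":
    clips = plan_clips(narration_s=1800.0, intro_s=10.0)
    print(len(clips), round(clips[0].length, 2))
```

Each chunk would be sent to the voiceover API separately and the resulting audio joined into one narration file. The planned clips, together with the intro and outro, would then be mapped onto a single render timeline; for a 30–45 minute narration the equal-length split keeps every clip inside the requested 12–15 second range.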