AI-driven Self-Service Troubleshooting Assistant (Video + Docs Retrieval)

Client: AI | Published: 24.10.2025
Budget: $750

Summary

We want to build an AI-powered support assistant for our customers. A user should be able to describe an issue in natural language (for example: “my routing API returns 403 in staging”), and the system should immediately do two things:

- Recommend the most relevant internal troubleshooting video clip (recorded by our solution engineers / evangelists).
- Provide the exact KB article / developer doc / configuration steps that solve the problem.

Think of it like: “type a question -> get the right 2-minute explainer video + the matching fix instructions.” We are looking for someone to design the technical approach, select the stack, and deliver an initial working prototype.

Scope / What we need built

Video ingestion + transcription
- Take our existing training / troubleshooting videos (screen recordings, demo walkthroughs, internal enablement material).
- Automatically generate high-quality transcripts for each video.
- Split videos into chapters / segments with timestamps and titles. Example: “Auth setup (00:00–02:15)”, “Common 403 errors (02:15–04:10)”, etc.
- Store: transcript text, timestamp ranges, video file reference, and tags (product, feature, error code, version, etc.).

Knowledge base ingestion
- Ingest our existing tech content: knowledge base articles, developer docs / API docs, and internal troubleshooting playbooks / runbooks.
- Convert them to chunks that can be searched semantically (vector search / RAG style).
- Keep the source link for each chunk so we can show “Read full article”.

Retrieval / AI question answering
- The user enters a free-text question. The system should:
  a) Find the most relevant video segment(s) and return: title, a summary of what’s covered, and a timestamp to jump to.
  b) Surface the most relevant KB / doc section with actual steps.
- The assistant should not hallucinate fixes. It should only answer using content we ingested.
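The knowledge base ingestion described above (chunking docs for semantic search while keeping the source link) could be sketched roughly as follows. This is a minimal illustration, not a proposed implementation: the function name, field names, and the character-based chunk sizes are all assumptions a bidder would replace with their own design.

```python
# Hedged sketch: split a KB article into overlapping text chunks that each
# carry their source URL, so the UI can show "Read full article".
# All names and size defaults here are illustrative assumptions.

def chunk_article(text: str, source_url: str, max_chars: int = 800, overlap: int = 100):
    """Split an article into overlapping chunks, keeping the source link."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append({
            "text": text[start:end],
            "source_url": source_url,   # lets the UI link back to the full article
            "char_range": (start, end),
        })
        if end == len(text):
            break
        start = end - overlap           # overlap preserves context across chunk edges
    return chunks
```

In a real pipeline each chunk’s text would then be embedded and stored in the vector index alongside this metadata.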
Output format to the user:
- “Watch this: [Video Title - starts at 02:15]”
- “Do this next: [Step-by-step fix from KB]”
- Optional: “Related docs: [link]”

Admin / content pipeline basics
- Ability for us (non-engineers) to upload a new video and have it auto-processed: transcribed, segmented, and indexed for search.
- Ability to re-index docs when we update them.

High-level integration path
We eventually want to expose this as:
- A web widget inside our Support Portal / Developer Portal (JS SDK-style embed)
- A “Help me fix this” experience inside our product UI (context-aware suggestions based on error codes)

Deliverables for phase 1 (what you’ll actually ship)

1. Architecture + tech stack proposal
A short spec that answers:
- Which LLM(s) you propose for semantic search / retrieval / summarization (OpenAI, local model, etc.).
- How you’ll do vector storage and retrieval (e.g. Pinecone, Weaviate, Elasticsearch w/ dense vectors, pgvector/Postgres).
- How we process and chunk videos (which transcription service, how we generate chapters).
- How the API surface will look for our frontends (input: free text; output: JSON with “video_reference + timestamp + doc_snippet”).

We are a HERE Technologies-style environment, so it’s a bonus if you can speak to clean API boundaries / SDK packaging so this can ship later as:
- An internal API endpoint (REST/gRPC)
- A lightweight JS SDK for UI teams
- An optional mobile SDK (Android/iOS) wrapper later

2. Working prototype
A simple web UI (can be barebones React) where I can:
- Upload one video file + one KB article.
- Ask a question.
- Get back: the top video segment + a matching doc excerpt.

Show the response with:
- Video title and timestamp
- A short summary of why this clip was chosen
- A snippet of the fix steps

This does not need full production security/hardening yet, but it needs to actually run end-to-end.
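The user-facing output format above (“Watch this… / Do this next…”) is simple enough to pin down with a small rendering sketch. The input dictionary shapes mirror the `/assist/query` response fields named in this posting (`title`, `startTimeSeconds`, `summary`), but the function itself is purely illustrative.

```python
# Hedged sketch: render the "Watch this / Do this next" answer format
# described above. Input shapes follow the posting's /assist/query response
# fields; the function name and exact wording are assumptions.

def format_answer(video: dict, fix: dict, related_docs: list[str]) -> str:
    mins, secs = divmod(int(video["startTimeSeconds"]), 60)
    lines = [
        f'Watch this: [{video["title"]} - starts at {mins:02d}:{secs:02d}]',
        f'Do this next: {fix["summary"]}',
    ]
    if related_docs:  # optional third line, per the spec
        lines.append("Related docs: " + ", ".join(related_docs))
    return "\n".join(lines)
```

For example, a video starting at 135 seconds renders as “starts at 02:15”, matching the sample output in this brief.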
3. API contract
Document the API you build:

POST /assist/query
Body:
{ "question": "user text here" }
Response:
{
  "video": {
    "title": "...",
    "videoUrl": "...",
    "startTimeSeconds": 135,
    "reason": "This segment explains how to solve the 403 auth issue in staging using API key rotation"
  },
  "suggestedFix": {
    "summary": "Rotate the API key in Console, then update env var ROUTING_API_KEY in staging",
    "sourceDocTitle": "...",
    "sourceDocUrl": "..."
  },
  "relatedDocs": [...]
}

Include how we’d embed this in a portal (example usage snippet for JS).

Must-have skills
We are not looking for someone who only “calls an LLM API.” We need someone who has actually built retrieval-style assistants. Concretely, you should have experience with:
- Speech-to-text at scale (Whisper, Deepgram, AWS Transcribe, etc.).
- Automatic chaptering / segmentation of long videos and attaching metadata.
- Retrieval-Augmented Generation (RAG) pipelines: chunking, embeddings, similarity search, and grounding answers in source docs.
- Vector databases or embedding-search infrastructure.
- Building clean backend services (ideally in Node.js / TypeScript, Python, or Java/Kotlin).
- Shipping an API that frontend apps can consume.

Optional but nice: React frontends + a basic player with timestamp deep links. Bonus if you’ve done enterprise support tooling / developer portal integrations before.

Tech stack (open to your proposal, but this is the direction)
Below is our expected baseline. If you want to suggest something else, justify it with tradeoffs.

Transcription
- Whisper large-v3 or a model of equivalent accuracy for English technical speech (runs locally or via API).
- Store the transcript as structured JSON:
[
  { "start": 0.0, "end": 12.5, "text": "In this section we’ll configure the API key for the Routing service…" },
  ...
]

Video segmentation
Either:
- Rule-based (split by silence + slide/scene changes + topic shifts), or
- LLM-generated chapter summaries on top of the transcript.
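The rule-based segmentation option mentioned above can be illustrated with a small sketch that groups transcript segments (in the structured JSON shape shown in this brief: `start`, `end`, `text`) into chapters whenever the silence gap between segments exceeds a threshold. The threshold value and function name are assumptions; a real pipeline would also use scene changes and topic shifts.

```python
# Hedged sketch: rule-based chaptering over Whisper-style transcript segments.
# A new chapter starts when the gap between one segment's end and the next
# segment's start exceeds gap_threshold seconds. Threshold is an assumption.

def chapterize(segments: list[dict], gap_threshold: float = 2.0) -> list[dict]:
    chapters: list[dict] = []
    current = None
    for seg in segments:
        if current is None or seg["start"] - current["end"] > gap_threshold:
            # Long pause: begin a new chapter at this segment.
            current = {"start": seg["start"], "end": seg["end"], "text": seg["text"]}
            chapters.append(current)
        else:
            # Short gap: extend the current chapter.
            current["end"] = seg["end"]
            current["text"] += " " + seg["text"]
    return chapters
```

Chapter titles (“Setup”, “Authentication error handling”, etc.) would then be generated per chapter, e.g. by an LLM summarizing each chapter’s text.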
Output should include labels like "Setup", "Authentication error handling", "Rate limit troubleshooting", etc.

Embedding / search
- Generate embeddings for each transcript segment and each doc chunk.
- Store embeddings in Pinecone / Weaviate / pgvector.
- Include metadata: { product: "Routing API", errorCode: "403", env: "staging", version: "v8", ... }

LLM layer
- Rerank the top N candidates and build the final answer.
- Summarize the chosen video segment into 2-3 sentences of “why you should watch this”.
- Pull the exact troubleshooting steps from the KB snippet (don’t invent fixes).

Service / API
- A backend service that exposes: /ingest/video, /ingest/doc, /assist/query
- It should be containerizable and deployable (Docker). Auth can be a simple token for now.

Frontend (prototype level)
A basic React page with:
- An upload form for admins
- A search box for users
- A result card that shows: videoTitle + “Play from 02:15”, a summary of what they’ll see, copy/paste steps from the doc, and a link: “Read full article”
- The player should jump directly to the suggested timestamp.

What success looks like
After phase 1:
- I can upload a new troubleshooting video from my evangelist team.
- The system converts it, indexes it, and understands what it’s about.
- A customer can type “Why am I getting 403 from Geocoding API?” and get:
  - “Watch this 1m30s clip from ‘Fixing 403 Authentication Errors’, starting at 02:15”
  - “Then follow these 3 steps…”
  - “Full doc: [link]”

Phase 2 (not in scope now, but good if you’ve done it before)
- Multi-language support.
- Access control / entitlement (only show content for products the customer is licensed for).
- Analytics: what people are asking, and what content we don’t have yet.

How to apply
When you reply, include:
- A short description of the most similar system you’ve built (RAG + docs + media search).
- Your preferred stack for: transcription; embeddings / vector DB; backend service language.
- A rough timeline for delivering the phase 1 prototype.
- A ballpark total cost for phase 1 (a fixed bid is fine).
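The embedding/search step described above (similarity search plus the metadata fields like errorCode and product) can be sketched with a tiny in-memory index. This stands in for Pinecone/Weaviate/pgvector purely for illustration; all names, the flat-list index structure, and the exact-match metadata filter are assumptions.

```python
# Hedged sketch: cosine-similarity search over an in-memory list of items,
# with an exact-match metadata filter (e.g. errorCode="403"). A stand-in
# for a real vector DB; every name here is illustrative.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec: list[float], index: list[dict], top_k: int = 3, **filters):
    """index items: {"vector": [...], "metadata": {...}, "payload": ...}."""
    candidates = [
        item for item in index
        if all(item["metadata"].get(k) == v for k, v in filters.items())
    ]
    candidates.sort(key=lambda it: cosine(query_vec, it["vector"]), reverse=True)
    return candidates[:top_k]
```

In the full system the top results would go to the LLM layer for reranking and grounded answer assembly, as outlined above.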
If you cannot provide these four things, please don’t apply.