We need a developer to build a fully offline AI companion pipeline that integrates directly with Unity. The system must include a Python function that uses Qwen2.5-VL, through the correct Hugging Face multimodal workflow, to generate a clean, one-sentence caption from a base64-encoded screenshot. A local RAG component (FAISS or Chroma) should preload our text documents, embed them locally, and retrieve the most relevant chunks using both the scene caption and the player’s question. A final response generator must then combine the caption, the retrieved RAG context, and the player’s query to produce a concise, grounded, one-sentence answer from the AI companion. On the Unity side, we need InputActions for the Q key, screenshot capture and base64 encoding, a UnityWebRequest POST to the local server, and UI logic to display the returned answer. The Python server (Flask or FastAPI) must run entirely offline and handle the full pipeline end-to-end. Full payment requires that all components be fully integrated and functional inside Unity.
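To make the retrieval and response-assembly steps concrete, here is a minimal sketch of how the scene caption and the player's question could be combined into one retrieval query and then into a prompt for the response generator. It rests on several assumptions: the bag-of-words `embed` function is only a dependency-free placeholder for a real local embedding model, the linear scan stands in for a FAISS or Chroma index, and `DOCS`, `retrieve`, `build_prompt`, and the prompt wording are hypothetical names and text chosen for illustration. The Qwen2.5-VL captioning call and the final generation call are left out.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in documents; the real pipeline would preload our text files.
DOCS = [
    "The blacksmith in the village forge repairs swords and armor.",
    "Red mushrooms in the forest are poisonous and should not be eaten.",
    "The old bridge collapses if more than one cart crosses at a time.",
]


def embed(text):
    """Toy bag-of-words vector (assumption: placeholder for a local embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(caption, question, docs, k=2):
    """Rank documents against a query built from both the caption and the question."""
    query = embed(f"{caption} {question}")
    ranked = sorted(docs, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(caption, question, chunks):
    """Assemble the text handed to the response generator (wording is illustrative)."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "You are the player's AI companion. Answer in exactly one sentence, "
        "using only the scene and context below.\n"
        f"Scene: {caption}\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


caption = "A player stands near red mushrooms in a dark forest."
question = "Can I eat these mushrooms?"
chunks = retrieve(caption, question, DOCS, k=1)
prompt = build_prompt(caption, question, chunks)
```

In a full implementation, `embed` would be replaced by a locally stored embedding model and the sorted scan by an index lookup, while `build_prompt` would feed whichever local model generates the companion's one-sentence answer. The Flask or FastAPI endpoint would call these steps in order after captioning the decoded screenshot.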