I’d like a straightforward, offline Retrieval-Augmented Generation (RAG) prototype that lets me ask natural-language questions about two monitoring data sources (license-usage records and standard system-metric snapshots) and receive concise text reports in response.

Core goals
• Ingest small sample sets of license-usage tables and basic CPU/RAM/disk metrics.
• Use an on-device Llama 3.1 (or a comparable offline LLM) with a lightweight vector store to retrieve relevant passages.
• Support ad-hoc querying, brief analytical insights, and auto-generated summaries, all returned as plain text.

Scope for this first phase is intentionally lean: a runnable script or notebook, a minimal UI (a CLI is fine), and clear setup instructions so I can point the code at new CSV/JSON logs and regenerate the embeddings. The job is complete if the prototype answers typical questions such as “Which feature keys were most consumed yesterday?” or “Summarize today’s CPU spikes” and outputs a short, readable report.

Please keep dependencies open-source, comment the code, and note any hardware assumptions. Future extensions (dashboards, visualizations, additional data types) are possible, but for now I only need this basic blueprint.
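To make the expected shape of the pipeline concrete, here is a minimal, standard-library-only sketch of the retrieval step. It is illustrative only: the toy bag-of-words similarity stands in for real embeddings (a production build would use a sentence-transformer model with a lightweight vector store, and pass the retrieved passages into the prompt of the local Llama 3.1, e.g. via llama-cpp-python). The sample records, field names, and function names below are all hypothetical.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'. A real build would call a
    sentence-transformer model here and cache vectors in the store."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical sample records, standing in for parsed CSV/JSON log rows
# (license-usage checkouts and a CPU-spike metric snapshot).
records = [
    "2024-05-01 feature_key=SOLVER_PRO user=alice checkouts=14",
    "2024-05-01 feature_key=MESHER user=bob checkouts=3",
    "2024-05-01 host=node7 cpu_pct=91 spike_duration=120s",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k records most similar to the query."""
    q = embed(query)
    return sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)[:k]

# The retrieved passages would then be packed into the local LLM's prompt
# to generate the short plain-text report.
context = retrieve("Which feature keys were most consumed?")
```

Swapping `embed` for a real embedding model and `records` for chunks loaded from the CSV/JSON logs is the only structural change the full prototype would need at this stage.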