Bare-Metal Server Management Platform

Customer: AI | Published: 02.04.2026

I’m ready to kick off a greenfield project that will let us control, provision, and monitor fleets of bare-metal machines from a single pane of glass. The scope covers both architecture guidance and hands-on development, so I need a senior-level engineer who can take full ownership from whiteboard sketches to a running production service.

Core capabilities we must deliver
• Remote access & power control: KVM-over-IP, power cycling, console log retrieval, and a REST/CLI interface so other tools can drive it.
- Server discovery: manual discovery of devices, with support for Redfish, SNMP, SSH, IPMI, and HTTPS.
- Remote BIOS/firmware updates.
- Hardware configuration: view and configure CPU settings, memory, storage, networking, PCIe cards, sensors, PSUs, BIOS/UEFI, BMC-level hardware, RAID creation, virtual disks, firmware and lifecycle management, external JBODs, etc.
- Remote OS deployment: Windows and Linux.
• Monitoring & alerts: real-time health metrics (temperature, fan speed, disk SMART, etc.) with alert rules that feed Slack, e-mail, or Prometheus Alertmanager.
• Provisioning & lifecycle: zero-touch PXE install, image management, re-imaging, firmware updates, and an audit trail for each server.
• Role-based access: granular policies defining what admins, operators, and read-only users may do.

The platform has to manage mixed Linux and Windows hosts, so decisions around driver compatibility, WinRM integration, and Redfish/IPMI abstractions matter.

What I expect from you
1. An architecture document outlining service breakdown, databases, message queues, and suggested tech (Go, Rust, Python, Node, React, Vue; your call, but justify it).
2. An initial MVP: API + simple web UI running in Docker/Podman with CI/CD scripts.
3. A walk-through session and hand-off of source code, deployment manifests, and unit/integration tests.

Acceptance criteria
• A server can be discovered, powered on/off, provisioned with a chosen OS, and its health appears in a dashboard within 10 minutes of being racked.
• All actions are protected by RBAC and logged.
• All required features are tested successfully and demonstrated to the customer.

If you thrive on low-level systems work and product thinking in equal measure, let’s talk timelines and milestones so we can move fast and iterate.
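
Purely as a conversation starter, the "protected by RBAC and logged" criterion could be sketched roughly as below. The three roles come from this brief; the action names, the `ROLE_PERMISSIONS` map, `AuditLog`, and `authorize` are hypothetical placeholders, not a prescribed design, and a real service would persist the audit trail rather than keep it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical role-to-permission map. The roles are from the brief;
# the action names are illustrative assumptions.
ROLE_PERMISSIONS = {
    "admin": {"power", "provision", "firmware", "configure", "view"},
    "operator": {"power", "provision", "view"},
    "read-only": {"view"},
}


@dataclass
class AuditLog:
    """In-memory audit trail; stands in for a durable store."""
    entries: list = field(default_factory=list)

    def record(self, user, role, action, server, allowed):
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "server": server,
            "allowed": allowed,
        })


def authorize(log, user, role, action, server):
    """Return True if the role may perform the action; log every attempt."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    log.record(user, role, action, server, allowed)
    return allowed


# Example usage with made-up users and a made-up server name.
log = AuditLog()
print(authorize(log, "alice", "operator", "power", "srv-01"))
print(authorize(log, "bob", "read-only", "power", "srv-01"))
print(len(log.entries))
```

Denied attempts are logged alongside allowed ones, and an unknown role falls back to an empty permission set, so it is denied by default.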