Urgent NLP & Automation Engineer for PoC

Client: AI | Published: 22.11.2025
Budget: $250

I am seeking an experienced NLP and automation engineer to deliver a working PoC and short-term solution to map noisy, non-normalized vendor/product strings to canonical NIST CPE identifiers within the next 9 hours. This is a time-sensitive task and I need someone who can move fast, produce reproducible code, and demonstrate results immediately.

Scope: ingest a list of raw vendor/product names, preprocess and normalize strings (lowercasing, locale suffix removal, token cleanup), apply alias resolution and rule-based normalization, generate embeddings for both inputs and NIST CPE titles using a lightweight transformer (DistilBERT or similar) exported to ONNX, compute similarity (cosine) between inputs and CPE candidates, combine embedding similarity with classical fuzzy scores (Levenshtein / Jaro-Winkler) via a ranking layer, and output top N CPE matches with confidence scores and simple heuristics for version handling and multi-candidate resolution.

Deliverables for PoC (due within 9 hours):
1) a runnable Python prototype that loads a small sample of NVD/CPE metadata and your provided raw list,
2) an ONNX-based embedding inference example and a short PowerShell call snippet demonstrating how to invoke ONNX scoring from CPEMatcher.ps1,
3) a README explaining preprocessing rules, alias dictionary examples, scoring formula, and how to refresh NVD data, and
4) a short evaluation CSV with example inputs, predicted CPE(s) and confidence.

Mandatory qualifications: strong experience with BERT-style embeddings, ONNX export and inference, fuzzy string matching, familiarity with NIST NVD/CPE formats, and PowerShell integration experience.

Please include a one-paragraph summary of a similar project you delivered, the core technical challenge you faced and how you solved it, estimated time to complete this 9-hour PoC, and any assumptions you will make.
Attachments and reference materials are provided here for your use: /mnt/data/A_mind_map_in_digital_format_titled_"Synthetic,_Ne.png

I will prioritize applicants who can start immediately and who provide a short plan plus a commitment to deliver the PoC within the deadline.
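
For applicants who want a concrete picture of the pipeline described in the scope, here is a minimal Python sketch of normalization plus a blended ranking score. It rests on several assumptions that are not part of this posting: the alias table, legal-suffix set, locale list, the 0.6 weight, and the three sample CPE entries are hypothetical illustrations. To stay runnable with the standard library only, character-trigram count vectors stand in for the DistilBERT/ONNX embeddings, and difflib.SequenceMatcher stands in for Levenshtein / Jaro-Winkler; a real prototype would swap both out.

```python
import math
import re
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical rule data for illustration only (not a real alias dictionary).
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "ltd", "llc", "gmbh", "co"}
ALIASES = {"ms": "microsoft", "msft": "microsoft"}
LOCALE_RE = re.compile(r"\b(?:en-us|en-gb|de-de|fr-fr)\b")

# Illustrative CPE 2.3 names mapped to titles; a real run would load NVD data.
CANDIDATES = {
    "cpe:2.3:a:microsoft:office:2016:*:*:*:*:*:*:*": "Microsoft Office 2016",
    "cpe:2.3:a:apache:http_server:2.4.1:*:*:*:*:*:*:*": "Apache HTTP Server 2.4.1",
    "cpe:2.3:a:mozilla:firefox:100.0:*:*:*:*:*:*:*": "Mozilla Firefox 100.0",
}


def normalize(raw: str) -> str:
    """Lowercase, drop locale tags and legal suffixes, resolve aliases."""
    text = LOCALE_RE.sub(" ", raw.lower())
    # Keep dots so version strings such as 2.4.1 survive token cleanup.
    tokens = re.sub(r"[^a-z0-9.]+", " ", text).split()
    tokens = [t.strip(".") for t in tokens]
    tokens = [ALIASES.get(t, t) for t in tokens if t and t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def trigram_vector(text: str) -> Counter:
    """Stand-in for an embedding: counts of padded character trigrams."""
    padded = f"  {text} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fuzzy(a: str, b: str) -> float:
    """Stand-in fuzzy score in [0, 1] (not Levenshtein or Jaro-Winkler)."""
    return SequenceMatcher(None, a, b).ratio()


def rank(raw: str, candidates: dict, top_n: int = 3, w_embed: float = 0.6):
    """Return the top_n (cpe, score) pairs by a weighted blend of both scores."""
    query = normalize(raw)
    query_vec = trigram_vector(query)
    scored = []
    for cpe, title in candidates.items():
        target = normalize(title)
        score = (w_embed * cosine(query_vec, trigram_vector(target))
                 + (1 - w_embed) * fuzzy(query, target))
        scored.append((cpe, round(score, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for cpe, score in rank("Microsoft Corp. Office 2016 (en-US)", CANDIDATES):
        print(f"{score:.3f}  {cpe}")
```

The linear blend is only one possible ranking layer; the weight would need tuning against the evaluation CSV, and version handling here is limited to keeping version tokens intact during cleanup.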