This project is ideal for Machine Learning Engineers (MLEs) with 5+ years of experience or Machine Learning PhDs interested in reproducible ML research and benchmarking AI model development.

Key Responsibilities
Compile external ML competitions into challenging tasks that reflect real-world responsibilities (training models, prepping datasets, and running experiments).
Draft detailed, executable natural language plans for completing MLE tasks.
Implement those plans in Python code within a provided Docker environment.
Validate implementations against the original plans and flag discrepancies.

Ideal Qualifications
5+ years of experience in applied machine learning OR a PhD in machine learning or an adjacent field.
Strong Python engineering skills, especially for model training and data handling.
Familiarity with Docker-based development environments.
Detail-oriented approach to technical planning and code validation.
Experience with reproducibility and benchmarking in ML research (preferred).
Comfortable working independently under strict compliance constraints.