Multimodal Safety Forecast ML Model

Client: AI | Published: 08.02.2026

I have a safety-sector time-series dataset that combines three synchronized streams: sensor imagery, textual maintenance logs, and high-frequency numeric readings. The objective is to forecast future values, not merely detect anomalies, so grid operators can anticipate demand, equipment stress, and renewable supply fluctuations. Because this is a research-level effort, I'm not looking for an off-the-shelf CNN, RNN, or simple transformer stack. I need a genuinely novel architecture (or a rigorously justified adaptation of cutting-edge multimodal papers) that fuses image, text, and numeric signals into a single forecasting pipeline and demonstrably outperforms strong baselines.

Key expectations
• End-to-end experimentation code (Python, PyTorch or TensorFlow) with clear data loaders for each modality
• Custom model implementation with commented rationale for design decisions
• Reproducible training scripts, hyper-parameter configs, and a validation notebook that plots forecast accuracy against standard baselines
• A final technical report summarizing methodology, results, and potential publication avenues

Acceptance criteria
• Forecast MAE or MAPE improvement of at least X% over a baseline multimodal fusion on my held-out test set (exact target set during kickoff)
• An ablation study proving the contribution of each modality
• A clean, runnable repository with a README and an environment file

If you thrive on research challenges and can back ideas with solid code and metrics, let's push multimodal forecasting forward together.
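To make the fusion requirement concrete, here is a minimal late-fusion sketch in plain Python: each modality head produces its own next-step forecast, and a softmax-gated weighted sum fuses them into one prediction. All names here are illustrative assumptions, not the requested novel architecture; the actual deliverable would learn the heads and the gate end-to-end in PyTorch or TensorFlow.

```python
import math

def softmax(scores):
    """Convert raw gate scores into weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_forecasts(modality_forecasts, gate_scores):
    """Softmax-weighted sum of per-modality forecasts (image, text, numeric)."""
    weights = softmax(gate_scores)
    return sum(w * f for w, f in zip(weights, modality_forecasts))

# Hypothetical next-step predictions from the image, log-text, and numeric heads
forecasts = [12.0, 11.0, 13.0]
gates = [0.0, 0.0, 0.0]  # equal gate scores -> equal weights -> simple average
fused = fuse_forecasts(forecasts, gates)
print(fused)  # 12.0
```

In a learned version, the gate scores would themselves be predicted from the inputs, letting the model down-weight a modality (e.g. noisy imagery) per time step, which is exactly what the ablation study should quantify.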
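The acceptance criterion above can be checked mechanically. The sketch below computes MAE and MAPE and the relative error reduction of a candidate model over the baseline; the sample numbers are made up for illustration, and the exact X% threshold would be fixed at kickoff as stated.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error (assumes no zero targets)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def improvement_pct(baseline_err, candidate_err):
    """Relative error reduction over the baseline, in percent."""
    return 100.0 * (baseline_err - candidate_err) / baseline_err

# Toy held-out targets and forecasts (purely illustrative values)
y_true = [10.0, 12.0, 11.0, 13.0]
y_baseline = [11.0, 10.0, 12.0, 12.0]   # baseline fusion forecasts
y_candidate = [10.5, 11.5, 11.2, 12.8]  # proposed model forecasts

base_err = mae(y_true, y_baseline)   # 1.25
cand_err = mae(y_true, y_candidate)  # 0.35
print(improvement_pct(base_err, cand_err))  # 72.0 -> would clear e.g. X = 10
```

The same `improvement_pct` helper works unchanged with `mape` in place of `mae`, so both target metrics can share one acceptance script.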