I will provide a corpus of raw call recordings in MP3 format, and I need a machine-learning model that automatically flags fraudulent activity. The model must recognise three problem categories (Phishing, Robocalls and Telemarketing scams) without human intervention.

What I expect you to handle:

• Pre-processing: clean the audio and extract features (e.g., MFCCs or spectrograms) that capture speaker and content cues.
• Modelling: design, train and fine-tune a classifier; a CNN, RNN, Transformer or hybrid approach is acceptable if it improves accuracy.
• Evaluation: deliver precision, recall and F1 for each fraud type, along with a full confusion matrix, so I can judge real-world performance.
• Deployment assets: an inference script or small REST service that accepts an MP3 file and returns the predicted class with a confidence score, plus all model weights and code (Python with TensorFlow or PyTorch preferred).

Please outline any similar speech analytics projects you have completed and the toolkit you would like to use. Once we agree on architecture and milestones, I can release the audio so you can get started right away.
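
To make the evaluation deliverable concrete, here is a rough sketch of the kind of per-class report I have in mind. It is only an illustration under stated assumptions: the label strings, the function names `confusion_matrix` and `per_class_report`, and the sample predictions are all hypothetical, and you are free to produce the same figures with whatever toolkit you prefer.

```python
from collections import Counter

# Class names taken from the brief; the exact label strings are an assumption.
LABELS = ["Phishing", "Robocalls", "Telemarketing scams"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Build a matrix whose rows are true classes and columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_report(matrix, labels=LABELS):
    """Derive precision, recall and F1 for each class from the matrix."""
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                 # true class i, predicted as another class
        fp = sum(row[i] for row in matrix) - tp  # predicted class i, actually another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


if __name__ == "__main__":
    # Made-up labels purely for illustration; these are not real results.
    y_true = ["Phishing", "Phishing", "Robocalls", "Robocalls",
              "Telemarketing scams", "Telemarketing scams"]
    y_pred = ["Phishing", "Robocalls", "Robocalls", "Robocalls",
              "Telemarketing scams", "Phishing"]

    matrix = confusion_matrix(y_true, y_pred)
    for label, row in zip(LABELS, matrix):
        print(f"{label:<20} {row}")
    for label, scores in per_class_report(matrix).items():
        print(label, {name: round(value, 3) for name, value in scores.items()})
```

Reading the matrix row by row shows which fraud types get confused with each other, which a single accuracy number would hide.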