AI Detector for Turkish Text

Budget: $1500

Data: I will provide two curated Turkish corpora in plain text: one human-written, one LLM-generated.

Goal: Build a transformer-based binary classifier that predicts, with high accuracy, whether a given Turkish passage is AI-generated or human-written. The model needs to be evaluated sentence by sentence.

Scope
• Use Transformers (e.g., BERT, RoBERTa, DeBERTa, or a custom Turkish-specific variant) or an alternative approach.
• Clean, split, and balance the provided datasets, then fine-tune or train the model end-to-end.
• Optimise for high F1 and accuracy; include validation metrics and a brief error analysis.
• Provide moderate explainability: probability scores plus one or two key attention/feature insights per prediction, enough for me to understand why the model leans human or AI without a full research paper.

Note: The method is not important; success is.

Deliverables
1. Trained model weights ready for inference.
2. Complete, well-commented source code (Python preferred) covering preprocessing, training, evaluation, and inference.
3. A short README explaining environment setup, how to reproduce results, and how to call the explainability function.

I will run your code on my own machine to verify the metrics, so please keep all dependencies explicit and reproducible (requirements.txt or environment.yml).

If you enjoy exploring Turkish NLP and can squeeze the best out of transformers, I'd love to review your proposal.
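As a rough illustration of the sentence-level evaluation, balancing, splitting, and F1/accuracy steps mentioned above, here is a minimal standard-library sketch. All function names (split_sentences, balance, stratified_split, accuracy_and_f1) are hypothetical and not part of the posting; the regex splitter is a naive assumption that would mis-split Turkish abbreviations such as "Dr.", and the label convention (1 = AI, 0 = human) is also assumed. The classifier itself is out of scope here.

```python
import random
import re

# Naive assumption: a sentence ends at ., !, ? or … followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_sentences(passage):
    """Split a passage into sentences so each one can be scored separately."""
    return [s.strip() for s in _SENTENCE_END.split(passage) if s.strip()]


def balance(human, ai, seed=0):
    """Downsample the larger class so both classes have the same size."""
    rng = random.Random(seed)
    n = min(len(human), len(ai))
    return rng.sample(human, n), rng.sample(ai, n)


def stratified_split(human, ai, val_fraction=0.2, seed=0):
    """Return (train, val) lists of (sentence, label); 1 = AI, 0 = human."""
    rng = random.Random(seed)
    train, val = [], []
    for label, items in ((0, human), (1, ai)):
        items = list(items)
        rng.shuffle(items)
        cut = int(len(items) * val_fraction)
        val += [(s, label) for s in items[:cut]]
        train += [(s, label) for s in items[cut:]]
    rng.shuffle(train)
    rng.shuffle(val)
    return train, val


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1 for the positive (AI = 1) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return acc, f1


if __name__ == "__main__":
    print(split_sentences("Merhaba dünya. Bugün hava güzel! Nasılsın?"))
    print(accuracy_and_f1([1, 0, 1, 0], [1, 0, 0, 0]))
```

Fixing the random seed keeps the split reproducible, which matters if the metrics are to be verified on another machine.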

Python
