Tarannum Voice Training & Automated Mimic Scoring System (Web-Based)

1. Project Overview
We require a web-based training system designed for Quranic Tarannum learning. The platform allows learners to:
1. Listen to a pre-recorded reference recitation
2. Record their own recitation using a microphone
3. View waveform visualizations of both recordings
4. Re-attempt multiple times to mimic the reference recitation
5. Receive an automated similarity score comparing the reference recitation and the student’s voice
6. Track personal improvement over multiple attempts
7. Allow multiple training centres to use the same platform with individual login accounts

The system will run in a classroom environment with approximately 20 computers, each equipped with headphones and a microphone.

2. Core Functional Requirements

2.1 Voice Recording Module
• Record audio using the browser/device microphone
• Save audio in a standard format (WAV or MP3)
• Allow multiple recordings per user
• Basic trimming or normalization (optional but useful)
• Save audio files to local storage or cloud storage (preferred: AWS S3)

2.2 Reference Recitation Playback
• Play reference audio provided by the teacher/system
• Waveform visualization during playback
• Playback controls:
  o Play / Pause / Stop / Repeat
  o Scrubbing / timeline navigation
• Suggested waveform technology:
  o Wavesurfer.js or the Web Audio API

2.3 Student Mimic Recording
• The user records their own recitation to imitate the reference
• Waveform visualization for the student’s audio
• Multiple recording attempts allowed
• The user can compare the two recordings visually and audibly

2.4 Automated Voice Similarity Scoring
The system will calculate how closely the student’s recitation matches the reference recitation.

Minimum Scoring Logic
• Extract MFCC features from both recordings
• Apply time alignment (Dynamic Time Warping / DTW)
• Generate a similarity index or score (0–100%)
(An illustrative sketch of this pipeline is included at the end of this document.)

Scoring Output
• Display a percentage match or similarity value
• Optional:
  o Per-segment accuracy
  o Highlighting of mismatched sections

Recommended Technology
• Python + Librosa (audio analysis)
• NumPy / SciPy (mathematical processing)
• Scoring engine exposed via an API (FastAPI / Flask / Node bridge)

3. Advanced Features (Highly Valuable)

A. Dual Waveform Overlay
• Show both waveforms simultaneously:
  o Reference audio
  o Student audio
• Used for:
  o Pitch alignment
  o Tempo comparison
  o Visual correction

B. Error Highlighting
• The system highlights waveform regions with low similarity
• Colour-coded segments: green (accurate), yellow (moderate), red (poor)
• Helps students identify which parts need correction

C. Progress Tracking
• Store scores for each attempt
• Track improvement over time
• Generate a learning history per student

D. Adaptive Learning
• If the score on a specific segment is low:
  o The system should isolate/repeat only that segment
  o Not the entire recitation
• Enables faster mastery

4. User & Access Management (IMPORTANT)

4.1 User Roles
• Super Admin (System Owner)
• Centre Admin
• Trainer/Teacher
• Student

4.2 Multi-Centre Support
The system must support multiple training centres, each operating independently:
• Each centre can:
  o Create its own teachers and students
  o Manage its own data and recordings
• The Super Admin can:
  o Create/delete training centres
  o Monitor overall usage

4.3 User Login
• Each user has:
  o Username/email
  o Password (stored securely, hashed)
  o Role + Centre ID

4.4 Licensing & User Quota
• The Super Admin can set a maximum user quota per training centre (e.g., 20, 50, or 100 users)
• When the quota is reached (see the sketch below):
  o The Centre Admin cannot add more users
  o The system shows a “Quota exceeded” message
• This allows future monetisation: charging per user, per centre, or per seat
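To make the quota rule above concrete, below is a minimal Python sketch of the check a “create user” operation could run. All names here (Centre, max_users, ensure_quota_available) are illustrative assumptions, not part of the requirements; in the real system the quota and the current user count would come from the database.

```python
from dataclasses import dataclass


@dataclass
class Centre:
    """Hypothetical centre record; max_users is the quota set by the Super Admin."""
    id: int
    name: str
    max_users: int


class QuotaExceededError(Exception):
    """Raised when a Centre Admin tries to add a user beyond the centre's quota."""


def ensure_quota_available(centre: Centre, current_user_count: int) -> None:
    """Reject new accounts once the centre has reached its licensed seat count."""
    if current_user_count >= centre.max_users:
        raise QuotaExceededError(
            f"Quota exceeded: {centre.name} is licensed for {centre.max_users} users."
        )


if __name__ == "__main__":
    centre = Centre(id=1, name="Demo Centre", max_users=20)
    try:
        # In the real system, current_user_count would come from a COUNT query
        # scoped to the centre's ID.
        ensure_quota_available(centre, current_user_count=20)
    except QuotaExceededError as exc:
        print(exc)  # Surfaced in the UI as the "Quota exceeded" message.
```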
5. System Architecture

Frontend
• React / Angular / Vue (preferred)
• Waveform rendering: Wavesurfer.js or equivalent
• Recording: WebRTC / MediaRecorder API
• Fully browser-based (no installation)

Backend
• Python or Node.js
• API routing, user management, scoring engine coordination
• A Python scoring module is recommended

Database
• MySQL / PostgreSQL / MongoDB
• Store:
  o User accounts
  o Centre accounts
  o Audio metadata
  o Score history

Audio Storage
• AWS S3 or a local server
• Metadata in the database, audio in the storage bucket

Deployment
• Browser-based access
• Recommended:
  o AWS EC2 / Lightsail / Elastic Beanstalk
  o HTTPS security
  o Scalable architecture for multiple centres

6. Performance Requirements
• Recording latency < 300 ms
• Waveform rendering must be smooth
• Scoring time per analysis:
  o Target < 5 seconds for a typical recitation length
• Must support ~20 concurrent users (minimum MVP)

7. Security
• HTTPS
• Password hashing (no plaintext)
• Audio storage protected (not publicly accessible)
• Role-based access control

8. Deliverables
The programmer must deliver:
1. A fully functional MVP or production system
2. A frontend UI with recording, playback, waveforms, and scoring
3. Multi-user login
4. Multi-centre architecture
5. User quota control per centre
6. Basic dashboards for:
  o Student
  o Trainer
  o Super Admin
7. Documentation:
  o Setup
  o Deployment
  o API specs
8. Source code + IP ownership

9. Additional Future Enhancements (Optional, Not Included Unless Agreed)
• Tajwid or makhraj accuracy engine
• Mobile app versions
• Online billing/payment
• Advanced machine learning scoring
• Teacher analytics dashboard
• Real-time analysis
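As referenced in Section 2.4, below is a minimal Python sketch of the MFCC + DTW scoring pipeline using Librosa. It is an illustrative sketch under assumptions, not a prescribed implementation: the function name, the 13-coefficient MFCC setting, the file names, and the cost-to-score mapping constant are all placeholders that would need calibration against real recitations.

```python
import numpy as np
import librosa


def similarity_score(reference_path: str, attempt_path: str, sr: int = 22050) -> float:
    """Return a rough 0-100 similarity score between two recitations."""
    # Load both recordings as mono audio at a common sample rate.
    reference, _ = librosa.load(reference_path, sr=sr, mono=True)
    attempt, _ = librosa.load(attempt_path, sr=sr, mono=True)

    # Extract MFCC features (shape: n_mfcc x frames) for each recording.
    ref_mfcc = librosa.feature.mfcc(y=reference, sr=sr, n_mfcc=13)
    att_mfcc = librosa.feature.mfcc(y=attempt, sr=sr, n_mfcc=13)

    # Time-align the two feature sequences with Dynamic Time Warping.
    # D is the accumulated cost matrix; wp is the optimal warping path.
    D, wp = librosa.sequence.dtw(X=ref_mfcc, Y=att_mfcc, metric="euclidean")

    # Normalise the total alignment cost by the path length so longer
    # recitations are not penalised, then map the cost onto a 0-100 scale.
    cost_per_step = D[-1, -1] / len(wp)
    score = 100.0 * float(np.exp(-cost_per_step / 100.0))  # divisor is an arbitrary calibration constant
    return round(score, 1)


if __name__ == "__main__":
    # File names are placeholders for the stored reference and student audio.
    print(similarity_score("reference.wav", "student_attempt.wav"))
```

The warping path returned by the DTW step maps each frame of the student’s attempt onto the corresponding frame of the reference, so the same output could later drive the per-segment accuracy and error-highlighting features described in Section 3.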