Human Validation at Scale

Our Words

Amina is the community-driven recording platform that transforms native speakers into data scientists. Every recording is validated against the BTS engine's predictions, creating the world's first speaker-calibrated Bantu language corpus.

What is Amina?

Amina (amina.ai) is a Progressive Web App designed for native speakers to record structured language data from their phones — anywhere, anytime.

Unlike crowd-sourcing platforms that collect raw, unstructured audio, Amina guides speakers through a precise recording pipeline where every utterance maps to a specific linguistic prediction from the BTS engine.

The result is not just audio — it is acoustically validated, speaker-attributed, linguistically decomposed training data.

Platform Specs

Audio format: 48 kHz mono WAV
Platform: PWA (mobile-first)
Countries: 25+
Languages onboarded: 21
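
As an illustration of the audio format above, here is a minimal sketch of a header check for 48 kHz mono WAV files using Python's standard `wave` module. The function names are hypothetical, and the 16-bit sample width used to build the demo file is an assumption; the spec above does not state a bit depth.

```python
import io
import wave

EXPECTED_RATE_HZ = 48_000  # from the spec: 48 kHz
EXPECTED_CHANNELS = 1      # from the spec: mono


def matches_platform_spec(wav_bytes: bytes) -> bool:
    """Return True if the WAV header reports 48 kHz mono audio."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def make_silent_wav(rate_hz: int, channels: int, n_frames: int = 480) -> bytes:
    """Build a short silent WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # assumption: 16-bit samples
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * channels * n_frames)
    return buf.getvalue()


ok = matches_platform_spec(make_silent_wav(48_000, 1))
wrong_rate = matches_platform_spec(make_silent_wav(44_100, 1))
```

A check like this only inspects the header; it says nothing about recording quality.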

How It Works

1. Record Syllables

The speaker records all 140+ isolated syllables for their language. These recordings form the speaker's unique acoustic fingerprint, which serves as the calibration baseline.

2. Build Speaker Profile

The system computes F₀ baselines, onset perturbation factors, and duration constants for this specific speaker's voice.

3. Record Sentences

The speaker records verb forms and sentences generated by the BTS engine, each carrying embedded tonal predictions.

4. Validate

An automated pipeline compares acoustic measurements against the engine's predictions. A recording that passes is certified; one that fails is re-prompted with corrective guidance.

5. Calibrate

Validated recordings feed back into the system, refining speaker profiles and strengthening the acoustic ground truth.
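
The Validate step can be pictured with a small sketch that compares a measurement against a prediction and returns corrective hints. Everything here is an assumption made for illustration: the field names, the relative tolerances, and the wording of the hints are hypothetical, not Amina's actual rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    f0_hz: float        # pitch the engine expects for this speaker
    duration_ms: float  # expected syllable duration


@dataclass
class Measurement:
    f0_hz: float
    duration_ms: float


def validate(pred: Prediction, meas: Measurement,
             f0_tol: float = 0.10, dur_tol: float = 0.20) -> List[str]:
    """Return corrective hints; an empty list means the recording passes.

    Tolerances are relative (10% pitch, 20% duration) and purely illustrative.
    """
    hints = []
    if abs(meas.f0_hz - pred.f0_hz) > f0_tol * pred.f0_hz:
        direction = "lower" if meas.f0_hz > pred.f0_hz else "higher"
        hints.append(f"pitch: try a {direction} tone")
    if abs(meas.duration_ms - pred.duration_ms) > dur_tol * pred.duration_ms:
        direction = "shorter" if meas.duration_ms > pred.duration_ms else "longer"
        hints.append(f"duration: hold the syllable {direction}")
    return hints


passed = validate(Prediction(200.0, 150.0), Measurement(205.0, 160.0))
failed = validate(Prediction(200.0, 150.0), Measurement(240.0, 150.0))
```

Returning hints rather than a bare pass/fail flag mirrors the idea that a failed recording comes back with targeted guidance.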

The Speaker Community

25+ countries represented
21 languages onboarded
664 target languages

Gamified Contribution

🏆 Trophies & badges
Points & streaks
📊 Progress tracking
💰 Milestone payments

Speaker Profiles

Syllable recordings calibrate speaker-specific acoustic constants — transforming textbook estimates into physical measurements.

F₀ Baselines: Speaker-specific fundamental frequency for each vowel in isolation
Onset Factors: How much each consonant type perturbs the starting pitch for this speaker
Duration Constants: Speaker-calibrated syllable lengths for short, long, and prenasalized vowels
Formant Anchors: F₁/F₂ values that verify vowel identity per speaker
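
To make the profile constants concrete, here is a minimal sketch of how two of them might be derived. It assumes the baseline is the median F₀ per vowel and that an onset factor is the ratio of onset pitch to that baseline. Both definitions and both function names are assumptions for illustration, not the platform's documented method.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, Tuple


def f0_baselines(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Median F0 in Hz per vowel, from (vowel, f0_hz) measurements."""
    by_vowel = defaultdict(list)
    for vowel, f0_hz in samples:
        by_vowel[vowel].append(f0_hz)
    return {vowel: median(values) for vowel, values in by_vowel.items()}


def onset_factor(onset_f0_hz: float, baseline_hz: float) -> float:
    """Ratio of pitch at syllable onset to the vowel baseline.

    1.0 means the consonant does not perturb the starting pitch.
    """
    return onset_f0_hz / baseline_hz


baselines = f0_baselines([("a", 118.0), ("a", 122.0), ("a", 120.0), ("i", 130.0)])
factor = onset_factor(126.0, baselines["a"])
```

The median is used here because it is less sensitive than the mean to an occasional pitch-tracking error.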

Quality Pipeline

Automated

Acoustic Validation

SNR check, pitch tracking, duration validation, nasal bridge auditing — all automated against engine predictions.
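
As one example of these checks, a signal-to-noise ratio can be estimated by comparing the RMS level of the speech segment with that of a noise-only segment. The sketch below uses the standard decibel formula; the 20 dB threshold and the function names are hypothetical, since no threshold is stated here.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech: Sequence[float], noise: Sequence[float]) -> float:
    """SNR in dB: 20 * log10(rms(speech) / rms(noise))."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


def passes_snr(speech: Sequence[float], noise: Sequence[float],
               min_db: float = 20.0) -> bool:
    """min_db is an illustrative threshold, not a documented value."""
    return snr_db(speech, noise) >= min_db


speech = [0.5, -0.5] * 100
quiet_room = [0.005, -0.005] * 100  # amplitude ratio of 100
noisy_room = [0.25, -0.25] * 100    # amplitude ratio of 2

clean_snr = snr_db(speech, quiet_room)
```

In practice the noise segment would come from a silent stretch before or after the utterance.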

Human

Peer Review

Native speaker reviewers verify naturalness and correctness. Critical data is checked by multiple reviewers per recording.

Corrective

Re-Prompt Queue

Failed recordings generate specific corrective instructions. Speakers re-record with targeted guidance until the recording passes validation.

For Speakers

Join as a contributor. Record your language, earn rewards, and help build AI that understands your mother tongue.

Join Amina →

For Enterprise

Access the validated, speaker-attributed audio corpus. Every recording is linked to acoustic profiles and linguistic certification.

See Data Access Plans →