Human Validation at Scale

Our Words

Amina is the community-driven recording platform that transforms native speakers into data scientists. Every recording is validated against the BTS engine's predictions, creating the world's first speaker-calibrated Bantu language corpus.

What is Amina?

Amina (amina.ai) is a Progressive Web App designed for native speakers to record structured language data from their phones — anywhere, anytime.

Unlike crowd-sourcing platforms that collect raw, unstructured audio, Amina guides speakers through a precise recording pipeline where every utterance maps to a specific linguistic prediction from the BTS engine.

The result is not just audio — it is acoustically validated, speaker-attributed, linguistically decomposed training data.

Platform Specs

Audio format: 48 kHz mono WAV

Platform: PWA (mobile-first)

Countries: 25+

Languages onboarded: 21
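
As an illustration of the audio spec above, here is a minimal Python sketch that checks whether a WAV file is 48 kHz mono. It uses only the standard-library `wave` module. The function names and the 16-bit sample width in the demo helper are assumptions made for this example; the page does not state a bit depth or describe Amina's actual upload checks.

```python
import io
import wave

EXPECTED_RATE_HZ = 48_000  # from the platform specs above
EXPECTED_CHANNELS = 1      # mono, from the platform specs above


def matches_platform_spec(wav_file) -> bool:
    """Return True if the WAV has the stated sample rate and channel count.

    `wav_file` may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def make_silent_wav(rate_hz: int, channels: int, seconds: float = 0.1) -> io.BytesIO:
    """Build a short silent WAV in memory (demo helper only).

    The 16-bit sample width is an assumption; the source gives no bit depth.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit (assumed)
        wav.setframerate(rate_hz)
        n_frames = int(rate_hz * seconds)
        wav.writeframes(b"\x00\x00" * n_frames * channels)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(matches_platform_spec(make_silent_wav(48_000, 1)))
    print(matches_platform_spec(make_silent_wav(44_100, 1)))
```

A check like this would typically run before any acoustic analysis, so that recordings in the wrong format are rejected early.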

How It Works

Record Syllables

Speaker records all 140+ isolated syllables for their language. This builds their unique acoustic fingerprint — the calibration baseline.

Build Speaker Profile

System computes F₀ baselines, onset perturbation factors, and duration constants for this specific speaker's voice.

Record Sentences

Speaker records verb forms and sentences generated by the BTS engine, each carrying embedded tonal predictions.

Validate

Automated pipeline compares acoustic measurements against engine predictions. Pass → certified. Fail → re-prompted with corrective guidance.

Calibrate

Validated recordings feed back into the system, refining speaker profiles and strengthening the acoustic ground truth.
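
The Validate step above can be sketched as a comparison between measured values and engine predictions. The following Python example is a hypothetical illustration only: the `Prediction` and `Measurement` types, the tolerance values, and the guidance messages are all assumptions, not the BTS engine's actual interface or thresholds.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """What the engine expects for one syllable (hypothetical shape)."""
    f0_hz: float
    duration_ms: float


@dataclass
class Measurement:
    """What was measured from the speaker's recording (hypothetical shape)."""
    f0_hz: float
    duration_ms: float


# Assumed relative tolerances; the real pipeline's thresholds are not published here.
F0_TOLERANCE = 0.10
DURATION_TOLERANCE = 0.20


def validate(pred: Prediction, meas: Measurement) -> tuple[str, list[str]]:
    """Return ("certified", []) on pass, or ("re-prompt", guidance) on fail."""
    guidance = []

    if abs(meas.f0_hz - pred.f0_hz) > F0_TOLERANCE * pred.f0_hz:
        direction = "lower" if meas.f0_hz > pred.f0_hz else "higher"
        guidance.append(f"Pitch was off target: try a {direction} pitch.")

    if abs(meas.duration_ms - pred.duration_ms) > DURATION_TOLERANCE * pred.duration_ms:
        direction = "shorter" if meas.duration_ms > pred.duration_ms else "longer"
        guidance.append(f"Length was off target: hold the syllable {direction}.")

    if guidance:
        return "re-prompt", guidance
    return "certified", []


if __name__ == "__main__":
    print(validate(Prediction(200.0, 150.0), Measurement(205.0, 160.0)))
    print(validate(Prediction(200.0, 150.0), Measurement(240.0, 150.0)))
```

Returning the guidance alongside the status keeps the pass/fail decision and the corrective message in one place, which mirrors the "Fail → re-prompted with corrective guidance" flow described above.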

The Speaker Community

25+

Countries represented

21

Languages onboarded

664

Target languages

Gamified Contribution

🏆

Trophies & badges

⭐

Points & streaks

📊

Progress tracking

💰

Milestone payments

Speaker Profiles

Syllable recordings calibrate speaker-specific acoustic constants — transforming textbook estimates into physical measurements.

F₀ Baselines

Speaker-specific fundamental frequency for each vowel in isolation

Onset Factors

How much each consonant type perturbs the starting pitch for this speaker

Duration Constants

Speaker-calibrated syllable lengths for short, long, and prenasalized vowels

Formant Anchors

F₁/F₂ values that verify vowel identity per speaker
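
To make the idea of speaker-specific constants concrete, here is a minimal Python sketch that derives per-vowel F₀ baselines and per-consonant onset factors from isolated-syllable measurements. The use of the median and the definition of an onset factor as a ratio to the vowel baseline are assumptions chosen for this example; the page does not specify how Amina computes these values.

```python
from collections import defaultdict
from statistics import median


def f0_baselines(syllables):
    """Per-vowel F0 baseline, taken here as the median (an assumption).

    `syllables` is an iterable of (vowel, f0_hz) pairs measured
    from one speaker's isolated-syllable recordings.
    """
    by_vowel = defaultdict(list)
    for vowel, f0_hz in syllables:
        by_vowel[vowel].append(f0_hz)
    return {vowel: median(values) for vowel, values in by_vowel.items()}


def onset_factors(onsets, baselines):
    """Per-consonant-class onset factor, as onset F0 / vowel baseline (assumed).

    `onsets` is an iterable of (consonant_class, vowel, onset_f0_hz) triples.
    A factor above 1.0 means the consonant raises the starting pitch.
    """
    ratios = defaultdict(list)
    for consonant_class, vowel, onset_f0_hz in onsets:
        ratios[consonant_class].append(onset_f0_hz / baselines[vowel])
    return {cls: median(values) for cls, values in ratios.items()}


if __name__ == "__main__":
    # Made-up illustrative measurements, not real speaker data.
    baselines = f0_baselines([("a", 118.0), ("a", 122.0), ("a", 120.0), ("i", 130.0)])
    factors = onset_factors(
        [("voiceless", "a", 126.0), ("voiced", "a", 114.0)], baselines
    )
    print(baselines)
    print(factors)
```

The median is used here because it is less sensitive to a single mistracked pitch value than the mean; a production system might well use a different estimator.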

Quality Pipeline

Automated

Acoustic Validation

SNR check, pitch tracking, duration validation, nasal bridge auditing — all automated against engine predictions.

Human

Peer Review

Native speaker reviewers verify naturalness and correctness. Multiple reviewers per recording for critical data.

Corrective

Re-Prompt Queue

Failed recordings generate specific corrective instructions. Speakers re-record with targeted guidance until the recording passes validation.
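
As one example of the automated checks listed above, a signal-to-noise ratio (SNR) gate can be sketched in a few lines of Python. The 20 dB threshold and the assumption that a noise-only segment is available (for instance, leading silence) are illustrative choices, not Amina's documented settings.

```python
import math

MIN_SNR_DB = 20.0  # assumed threshold for illustration only


def rms(samples):
    """Root-mean-square amplitude of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech, noise):
    """SNR in decibels from a speech segment and a noise-only segment."""
    noise_rms = rms(noise)
    if noise_rms == 0:
        return math.inf  # perfectly silent noise floor
    speech_rms = rms(speech)
    if speech_rms == 0:
        return -math.inf  # no signal at all
    return 20 * math.log10(speech_rms / noise_rms)


def passes_snr(speech, noise, min_db=MIN_SNR_DB):
    """True if the recording clears the (assumed) minimum SNR."""
    return snr_db(speech, noise) >= min_db


if __name__ == "__main__":
    speech = [0.5, -0.5] * 100      # synthetic samples, not real audio
    noise = [0.005, -0.005] * 100
    print(round(snr_db(speech, noise), 1), passes_snr(speech, noise))
```

A recording that fails this gate would be a natural candidate for the re-prompt queue, with guidance such as moving to a quieter location.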

For Speakers

Join as a contributor. Record your language, earn rewards, and help build AI that understands your mother tongue.

Join Amina →

For Enterprise

Access the validated, speaker-attributed audio corpus. Every recording linked to acoustic profiles and linguistic certification.

See Data Access Plans →