Finite-State Technology

HFST Morphological Transducers

We are building the first all-Bantu-language HFST system: BantuNomics cartridges are compiled into HFST (Helsinki Finite-State Technology) transducers for offline, O(n) morphological analysis and generation across 664 languages. No API required.

What is HFST?

HFST (Helsinki Finite-State Technology) is an open-source framework for building finite-state transducers — mathematical models that encode morphological rules as state machines.

BantuNomics is building the first comprehensive all-Bantu-language HFST system by compiling our 664 language cartridges into standalone binary transducers. Each transducer can analyze or generate any valid verb form in O(n) time, that is, in time linear in the length of the input.

These transducers are bidirectional: given a surface form, they decompose it into morphemes; given a morpheme specification, they produce the correct surface form. No other HFST project covers the Bantu family at this scale.

Analysis Mode

input: balebomba
output: ba[SM.3PL]+le[PROG]+bomb[work]+a[IND]

Generation Mode

input: ba[SM]+le[PROG]+bomb[Root]+a[FV]
output: balebomba
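
The two directions can be illustrated with a toy sketch in Python. This is not how an HFST transducer is built or stored; it only shows one table of surface/tag pairs being read in both directions. The `SLOTS` table and the `analyze` and `generate` helpers are hypothetical names, and the tag strings are copied from the analysis example above.

```python
# Toy illustration only: a real HFST transducer is compiled from lexicon and
# rule files. Each slot lists the (surface, tag) pairs allowed in that
# position; the entries come from the balebomba analysis example.
SLOTS = [
    [("ba", "SM.3PL")],   # subject marker
    [("le", "PROG")],     # tense/aspect marker
    [("bomb", "work")],   # verb root
    [("a", "IND")],       # final vowel
]


def analyze(surface):
    """Return every analysis string whose surface side spells `surface`."""
    results = []

    def walk(slot, pos, parts):
        if slot == len(SLOTS):
            # Accept only if the whole word has been consumed.
            if pos == len(surface):
                results.append("+".join(parts))
            return
        for form, tag in SLOTS[slot]:
            if surface.startswith(form, pos):
                walk(slot + 1, pos + len(form), parts + [f"{form}[{tag}]"])

    walk(0, 0, [])
    return results


def generate(analysis):
    """Return the surface form for an analysis string, or None if invalid."""
    parts = analysis.split("+")
    if len(parts) != len(SLOTS):
        return None
    surface = []
    for part, slot in zip(parts, SLOTS):
        form, _, tag = part.partition("[")
        if (form, tag.rstrip("]")) not in slot:
            return None
        surface.append(form)
    return "".join(surface)
```

With this table, `analyze("balebomba")` returns `["ba[SM.3PL]+le[PROG]+bomb[work]+a[IND]"]`, and passing that string to `generate` returns `"balebomba"`.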

Why HFST Matters for Production

Offline Processing

No API calls, no network latency. The transducer runs locally on any machine — mobile devices, edge servers, or air-gapped environments.

O(n) Performance

Analysis time is linear in the length of the input word. Process millions of forms per second on commodity hardware.
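
To make the linear bound concrete, here is a minimal sketch of a deterministic transducer walk in Python. The transition table is hand-written for the single word balebomba and is purely hypothetical; it is not the compiled Bemba transducer. The point is only that the loop takes one table lookup per input character. The bound shown applies to this deterministic toy; a transducer that returns several analyses for one word does extra work per analysis.

```python
# Hypothetical hand-built table: (state, input char) -> (next state, output).
# Output is emitted once a whole morpheme has been read.
TRANSITIONS = {
    (0, "b"): (1, ""),
    (1, "a"): (2, "ba[SM.3PL]+"),
    (2, "l"): (3, ""),
    (3, "e"): (4, "le[PROG]+"),
    (4, "b"): (5, ""),
    (5, "o"): (6, ""),
    (6, "m"): (7, ""),
    (7, "b"): (8, "bomb[work]+"),
    (8, "a"): (9, "a[IND]"),
}
FINAL_STATES = {9}


def analyze_linear(word):
    """Return (analysis or None, number of characters consumed)."""
    state, output, steps = 0, [], 0
    for ch in word:
        steps += 1
        move = TRANSITIONS.get((state, ch))
        if move is None:
            # No transition for this character: the word is rejected.
            return None, steps
        state, emitted = move
        output.append(emitted)
    if state not in FINAL_STATES:
        return None, steps
    return "".join(output), steps
```

For the nine-character input `"balebomba"`, `analyze_linear` returns the analysis together with a step count of 9, one step per character.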

Deterministic Accuracy

The transducer encodes the exact same rules as the BTS engine. If the cartridge is correct, the transducer output is guaranteed correct.

Available Languages

HFST transducers are compiled for languages at Tier 2 (Generation Active) and above.

Bemba: T5 · Production
Nyanja: T2 · Active
Tonga: T2 · Active
Lozi: T2 · Active
Zulu: T2 · Active
Swahili: T2 · Active
Shona: T2 · Active
Kinyarwanda: T2 · Active

Additional languages are compiled as their cartridges reach T2. Enterprise clients can request priority compilation for specific languages.

Access Model

API Access

Use HFST analysis and generation via the BantuNomics REST API or MCP tools. Available on Startup and Enterprise tiers.

GET /api/v1/hfst/analyze?lang=bem&form=balebomba
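
As a rough sketch, the request above could be assembled in Python as follows. Only the path and the `lang` and `form` parameters come from the example; the host is a placeholder, `build_analyze_url` is a hypothetical helper, and authentication and the response format are not covered here.

```python
from urllib.parse import urlencode

# Placeholder host: only the request path is given in the example above.
BASE_URL = "https://api.example.invalid"


def build_analyze_url(lang, form, base_url=BASE_URL):
    """Build the analyze URL, percent-encoding the query parameters."""
    query = urlencode({"lang": lang, "form": form})
    return f"{base_url}/api/v1/hfst/analyze?{query}"
```

Calling `build_analyze_url("bem", "balebomba")` yields a URL ending in `/api/v1/hfst/analyze?lang=bem&form=balebomba`, which can then be passed to any HTTP client.
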

Binary Download

Enterprise clients receive compiled transducer binaries for local deployment. No API dependency. Full offline capability.
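
A local lookup might look like the sketch below, assuming the delivered binary is a standard HFST transducer file and that the open-source `hfst` Python bindings are installed. Both are assumptions. The `analyze_offline` helper and any file path passed to it are hypothetical.

```python
def analyze_offline(transducer_path, form):
    """Look up `form` in a compiled transducer file and return its analyses."""
    # Imported lazily so this module loads even without the bindings.
    import hfst

    stream = hfst.HfstInputStream(transducer_path)
    try:
        transducer = stream.read()  # first transducer stored in the file
    finally:
        stream.close()
    # lookup() returns (output string, weight) pairs by default.
    return [output for output, _weight in transducer.lookup(form)]
```

No network access is involved: the only inputs are the transducer file on disk and the word to analyze.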
