HFST Morphological Transducers
We are building the first all-Bantu-language HFST system, compiling BantuNomics cartridges into HFST (Helsinki Finite-State Technology) transducers for offline, O(n) morphological analysis and generation across 664 languages. No API required.
What is HFST?
HFST (Helsinki Finite-State Technology) is an open-source framework for building finite-state transducers — mathematical models that encode morphological rules as state machines.
BantuNomics is building the first comprehensive all-Bantu-language HFST system by compiling our 664 language cartridges into standalone binary transducers. Each transducer can analyze or generate any valid verb form in O(n) time, where n is the length of the input.
These transducers are bidirectional: given a surface form, they decompose it into morphemes; given a morpheme specification, they produce the correct surface form. No other HFST project covers the Bantu family at this scale.
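To make the bidirectional idea concrete, here is a minimal sketch in plain Python. It is a toy, not the BantuNomics cartridge format or the HFST API: the arc table, the tag names (1SG+, PRS+, +FV) and the functions analyze and generate are all illustrative assumptions, and the grammar covers only a few Swahili forms of the verb -soma ("read"). Each arc pairs an analysis-side label with a surface-side string, so the same table can be read in either direction.

```python
# Toy transducer (illustrative only): state -> list of (analysis label, surface string, next state).
ARCS = {
    0: [("1SG+", "ni", 1), ("2SG+", "u", 1), ("3SG+", "a", 1)],  # subject markers
    1: [("PRS+", "na", 2), ("PST+", "li", 2), ("FUT+", "ta", 2)],  # tense markers
    2: [("som", "som", 3)],                                        # verb root
    3: [("+FV", "a", 4)],                                          # final vowel
}
FINAL = 4


def _walk(text, match_surface):
    """Follow arcs from state 0, matching `text` on one side and emitting the other."""
    state, pos, out = 0, 0, []
    while state != FINAL:
        for label, surface, nxt in ARCS[state]:
            match, emit = (surface, label) if match_surface else (label, surface)
            if text.startswith(match, pos):
                out.append(emit)
                pos += len(match)
                state = nxt
                break
        else:
            return None  # no arc matches: not a valid form in this toy grammar
    # Accept only if the whole input was consumed.
    return "".join(out) if pos == len(text) else None


def analyze(surface_form):
    """Surface form -> morpheme specification."""
    return _walk(surface_form, match_surface=True)


def generate(specification):
    """Morpheme specification -> surface form."""
    return _walk(specification, match_surface=False)


print(analyze("ninasoma"))          # 1SG+PRS+som+FV
print(generate("3SG+PST+som+FV"))   # alisoma
```

This sketch picks the first matching arc at each state, which works only because the toy grammar is unambiguous. A real transducer can be nondeterministic and may return several analyses for one surface form.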
Why HFST Matters for Production
No API calls, no network latency. The transducer runs locally on any machine — mobile devices, edge servers, or air-gapped environments.
Analysis time is linear in the length of the input word. Process millions of forms per second on commodity hardware.
The transducer is compiled from the same cartridge rules that the BTS engine uses. If the cartridge is correct, the transducer output is correct too.
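The linear-time property can be illustrated with a small character-level transition table. The sketch below is an assumption-laden toy rather than the HFST lookup algorithm: build_table and accepts are hypothetical helpers, and the machine only accepts or rejects words instead of producing analyses. It shows that a deterministic lookup takes at most one table step per input character, so the work grows linearly with word length. Ambiguous forms in a real transducer can require exploring more than one path.

```python
def build_table(words):
    """Build a deterministic transition table (a trie) that accepts exactly `words`."""
    table, final = [{}], set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in table[state]:
                table.append({})                    # allocate a new state
                table[state][ch] = len(table) - 1   # and point this arc at it
            state = table[state][ch]
        final.add(state)
    return table, final


def accepts(table, final, word):
    """Return (accepted, steps); steps never exceeds len(word)."""
    state, steps = 0, 0
    for ch in word:
        steps += 1
        state = table[state].get(ch)
        if state is None:
            return False, steps  # dead end: reject early
    return state in final, steps


table, final = build_table(["ninasoma", "alisoma", "utasoma"])
print(accepts(table, final, "ninasoma"))  # (True, 8)
print(accepts(table, final, "ninasom"))   # (False, 7)
```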
Available Languages
HFST transducers are compiled for languages at Tier 2 (Generation Active) and above.
Additional languages are compiled as their cartridges reach Tier 2. Enterprise clients can request priority compilation for specific languages.
Access Model
Use HFST analysis and generation via the BantuNomics REST API or MCP tools. Available on Startup and Enterprise tiers.
Enterprise clients receive compiled transducer binaries for local deployment. No API dependency. Full offline capability.
See Enterprise tier →