Apertium is a rule-based machine translation engine. It was created to translate between closely related languages and to serve low-resource language pairs.
Its code and language data are open for anyone to inspect, modify, and deploy. It’s not a neural giant, but it fills gaps that neural systems leave open.
It works through modular steps: analysis, transfer, and generation. Language developers write rules and dictionaries, and the system chains the modules together to convert source text into target text.
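For instance, once a language-pair package is installed, the whole chain can be driven from the command line. Here is a minimal sketch in Python, assuming the apertium CLI and the Spanish–Catalan pair (es-ca) are installed:

```python
import subprocess

def translate(text: str, pair: str = "es-ca") -> str:
    """Pipe text through the apertium CLI (equivalent to: echo ... | apertium es-ca)."""
    result = subprocess.run(
        ["apertium", pair],
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if the pair is missing or the command fails
    )
    return result.stdout.strip()

print(translate("Esto es una prueba."))  # expected output along the lines of "Això és una prova."
```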
Over time, parts of Apertium have been hybridised, with statistical modules added to improve certain steps.
It supports more than 45 released language pairs, with more in development.
Text passes through morphological analysis, disambiguation, structural transfer, lexical selection, and generation; each stage is primarily rule-driven.
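Under the hood, each stage is a separate process connected to the next by a plain-text stream, so the pipeline can be reproduced step by step. Below is a sketch of the classic single-stage transfer pipeline (newer pairs add lexical selection and multi-stage chunk transfer, omitted here); the data-file names are placeholders, since the real ones ship with each pair package:

```python
import subprocess

# Classic single-stage Apertium pipeline. File names below are assumptions;
# the authoritative list of stages lives in the pair's modes file.
STAGES = [
    ["apertium-destxt"],                           # deformat: protect markup in superblanks
    ["lt-proc", "es-ca.automorf.bin"],             # morphological analysis
    ["apertium-tagger", "-g", "es-ca.prob"],       # part-of-speech disambiguation
    ["apertium-pretransfer"],                      # prepare multiwords for transfer
    ["apertium-transfer", "es-ca.t1x",
     "es-ca.t1x.bin", "es-ca.autobil.bin"],        # structural transfer + bilingual lookup
    ["lt-proc", "-g", "es-ca.autogen.bin"],        # morphological generation
    ["apertium-retxt"],                            # reformat: restore protected markup
]

def run_pipeline(text: str) -> str:
    data = text
    for cmd in STAGES:
        data = subprocess.run(cmd, input=data, capture_output=True,
                              text=True, check=True).stdout
    return data

# After the analysis stage, each token looks like
#   ^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$
# (surface form plus candidate analyses); later stages progressively
# narrow and rewrite that stream until generation produces plain text.
```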
Some modules incorporate statistical techniques, e.g. learning lexical preferences or handling multi-word expressions.
You can add or edit rules, dictionaries, or modules for new languages or domains.
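Dictionary entries are XML. As a sketch, the snippet below builds one monolingual-dictionary entry programmatically; the lemma and the paradigm name house__n are illustrative assumptions, since real paradigm names are defined inside each language's own dictionary file:

```python
import xml.etree.ElementTree as ET

# Entry meaning: "translator" inflects like the paradigm named "house__n"
# (a regular noun). Both the lemma and the paradigm name are assumptions.
entry = ET.Element("e", lm="translator")      # lm = lemma
invariant = ET.SubElement(entry, "i")
invariant.text = "translator"                 # part shared by all inflected forms
ET.SubElement(entry, "par", n="house__n")     # reference to an inflection paradigm

print(ET.tostring(entry, encoding="unicode"))
# -> <e lm="translator"><i>translator</i><par n="house__n" /></e>
```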
You can run your own instance or embed it, with no need to rely on external APIs.
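A common self-hosting route is APy, Apertium's HTTP wrapper. Here is a sketch of querying a local instance; the port (APy's documented default, 2737), the three-letter language codes, and the response shape are assumptions to adjust to your deployment:

```python
import json
import urllib.parse
import urllib.request

APY_URL = "http://localhost:2737"  # assumption: local APy instance on its default port

def translate(text: str, pair: str = "spa|cat") -> str:
    """Query a self-hosted APy /translate endpoint and unwrap the JSON reply."""
    params = urllib.parse.urlencode({"langpair": pair, "q": text})
    with urllib.request.urlopen(f"{APY_URL}/translate?{params}") as resp:
        payload = json.load(resp)
    return payload["responseData"]["translatedText"]

print(translate("Esto es una prueba."))
```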
Many contributors and language communities maintain and improve the language data and rules.
Compared with heavyweight neural networks, its output is more predictable, especially for smaller languages.
Apertium is free and open source, released under the GPL, so there are no license fees.
If you deploy it yourself, your costs are hosting, computing power, and maintenance.
If an organization offers Apertium as a service (hosting, API, support), it may charge for that, but the core software itself remains free.
Some institutions bundle support, training, or custom rule development as paid services.
I couldn’t find a reliable source for the figure of 80,798 unique monthly visitors to the official Apertium site.
However, usage via the Softcatalà-hosted Apertium translators gives some sense of scale: in 2020 there were ~4.6 million translation requests per month for Spanish → Catalan and ~1.1 million for Catalan → Spanish.
So usage is substantial within niche communities.
Because it is modular, you can replace or tweak submodules as needed.
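Because every stage reads and writes the same plain-text stream, swapping a submodule can be as simple as splicing your own filter between two processes. As a toy illustration, the stand-in "disambiguator" below just keeps the first analysis of each ambiguous token; a real replacement would be a trained tagger, and the exact output format differs slightly:

```python
import re

def first_analysis(stream: str) -> str:
    """Toy disambiguator: keep only the first analysis of each ^...$ token."""
    def pick(match: re.Match) -> str:
        _surface, *analyses = match.group(1).split("/")
        return f"^{analyses[0]}$" if analyses else match.group(0)
    return re.sub(r"\^([^$]+)\$", pick, stream)

print(first_analysis("^vino/vino<n><m><sg>/venir<vblex><ifi><p3><sg>$"))
# -> ^vino<n><m><sg>$
```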