Translation Commons has published "Indigenous Languages: Zero to Digital", a guide for creating the digital infrastructure to bring a language online.
A post on the PanLex blog describes the steps involved in creating digital language tools for under-supported languages and introduces the Zero to Digital guide, which breaks those steps down in detail. Excerpt:
Many of us who use a well-supported language online may not realize how many layers of technology underpin the implementation of language on our devices. When we buy a new phone or laptop, we barely notice these layers, because our language usually functions and displays seamlessly on the device without a hiccup. But that’s not the experience for speakers of under-resourced languages, including some with millions of speakers. For example, Lahnda (Pakistan) with 93M speakers and Wu Chinese with 81M are not well supported.
Here are some of the foundational layers that must be implemented behind the scenes for a language to be digitally supported.
- A writing system (such as Latin, Devanagari, or a newly created system) and orthography (spelling rules) must be identified, chosen, or developed for the language.
- Each letter or character of the writing system (like capital B, lowercase c, or !) must exist in Unicode, giving each one a standard numeric code point.
- A font that displays all letters and characters in the writing system must be developed.
  - Using typography design software, font designers identify and create all necessary glyphs, or graphic shapes that represent the letters and characters of the writing system.
  - The font designers write rules to handle any required ligatures and other complex cases in order to combine, stack, or connect the glyphs properly, in accordance with the writing system’s conventions. (See graphic below.) This step can be labor intensive for complex writing systems.
- The font must be available on a device at the moment the user needs it.
- Keyboards must be created so that text can be input in a language. These can be either on-screen keyboards or mappings from common physical keyboard layouts such as QWERTY.
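To make the Unicode layer concrete, here is a minimal Python sketch that looks up the code point and official Unicode name of a few characters. It is not taken from the post or the guide. The helper name `describe` and the sample strings are illustrative assumptions; only the standard-library `unicodedata` module is used.

```python
import unicodedata


def describe(text):
    """Return (character, code point label, Unicode name) for each character in text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The three examples mentioned in the list: capital B, lowercase c, and "!".
for ch, code_point, name in describe("Bc!"):
    print(ch, code_point, name)
# B U+0042 LATIN CAPITAL LETTER B
# c U+0063 LATIN SMALL LETTER C
# ! U+0021 EXCLAMATION MARK

# A Devanagari conjunct (illustrative choice): stored as three code points,
# which a font's shaping rules typically render as one combined glyph.
conjunct = "\u0915\u094D\u0937"
for ch, code_point, name in describe(conjunct):
    print(code_point, name)
# U+0915 DEVANAGARI LETTER KA
# U+094D DEVANAGARI SIGN VIRAMA
# U+0937 DEVANAGARI LETTER SSA
```

The second example hints at why the font step can be labor intensive: the text itself is only a sequence of code points, and combining them into the right shape is left to the font's rules.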
Read the whole post or view the Indigenous Languages: Zero to Digital guide directly.








