Natural language processing in African and Asian languages for sustainable development: The FAIR Forward – Artificial Intelligence for All initiative
“FAIR Forward – Artificial Intelligence for All” aims to improve the conditions for local artificial intelligence (AI) innovation that solves local problems in Rwanda, Uganda, Ghana, South Africa and India. The project is an initiative of the German Development Cooperation, implemented by the Deutsche Gesellschaft für Internationale Zusammenarbeit GmbH (GIZ). Besides engaging in local capacity development and supporting the development of AI policies and ethical AI guidelines, FAIR Forward works with its partners to improve access to training data for natural language processing and to AI and machine learning models.
In this area, the FAIR Forward initiative focuses on natural language processing in underrepresented languages. For languages in Africa and Asia, as well as local variants of English, this means first creating voice and text datasets and then building language processing models from them. These models could be integrated into information hotlines and improve access to digital services such as health applications, where users could describe symptoms by voice instead of typing them. The technology has the potential to give millions of currently underserved people access to services and information.
In Rwanda, the FAIR Forward initiative supports the local start-up Digital Umuganda in coordinating the development of voice interaction technologies in the country’s national language, Kinyarwanda. As a first step, Digital Umuganda collects speech data in Kinyarwanda from volunteers during Umuganda, the nationwide monthly community work day. Our partner Mozilla hosts the data openly on its Common Voice platform.
FAIR Forward will scale and adapt this approach to other partner countries, including Uganda and India, and will pursue other means of collecting language data. For example, FAIR Forward will extend an existing online competition on the Pan-African data science platform Zindi to collect more language data and to create NLP models based on this data. For further information on GIZ and its services and projects, please have a look online.