Date: 2023-11-23

Start-up teaches AI African languages

In a co-working space in Johannesburg’s Rosebank district, computer scientist Jade Abbott opened a new window on her computer and asked ChatGPT to count from one to ten in isiZulu, a language spoken by more than ten million people in her native South Africa. The results were “mixed and fun,” Abbott reports.

She then typed a few sentences in isiZulu and asked the chatbot to translate them into English. Again, the answers weren’t even close to correct. While there are efforts to include specific languages in AI models even when there is little data available for training, for Abbott, these results show that the technology “still doesn’t truly capture our languages.”

Her experience reflects the situation of Africans who do not speak English. Many language models such as ChatGPT are not well suited to languages with smaller numbers of speakers, particularly African languages. That’s why Abbott and biomedical engineer Pelonomi Moiloa are trying to use machine learning to develop tools specifically for Africans at their new start-up, Lelapa AI.

AI tool for Afrikaans, Sesotho and isiZulu

The new AI tool Vulavula, which Lelapa AI introduced in mid-November, converts speech into text and recognizes names of people and places in written texts. The latter feature could be useful for summarizing a document or searching for a person online. It can currently recognize four languages spoken in South Africa: isiZulu, Afrikaans, Sesotho and English. The team is also working to include other languages from across Africa.

The tool can be used on its own or integrated into existing AI tools such as ChatGPT and online chatbots. The team hopes that Vulavula, which means “speak” in Xitsonga, will make tools that do not currently support African languages accessible to everyone.

The lack of AI tools that work for African languages and recognize African names and places excludes African people from economic opportunities, says Moiloa, CEO and co-founder of Lelapa AI. For her, working on Africa-centric AI solutions is a way to help others in Africa realize the immense potential benefits of AI technologies. “We’re trying to solve real problems and put power back in the hands of the people,” she says.

“We can’t wait for big tech”

There are thousands of languages in the world, 1,000 to 2,000 of which are spoken in Africa alone: it is estimated that a third of all the world’s languages are spoken on the continent. But although only five percent of the world’s population has English as their native language, this language dominates the Internet – and now also AI tools.

There are already some efforts to correct this imbalance. OpenAI’s GPT-4 has included smaller languages such as Icelandic. In February 2020, Google Translate added support for five new languages spoken by approximately 75 million people. But the translations are superficial, the tool often misunderstands African languages, and it is still a long way from an accurate digital representation of African languages, African AI researchers say.

Earlier this year, at a leading African AI conference in Kigali, Rwanda, Ethiopian computer scientist Asmelash Teka Hadgu ran the same experiments with ChatGPT that Abbott did. When he asked the chatbot questions in his native language, Tigrinya, the answers he received were gibberish. “It generated words that didn’t make sense,” says Hadgu, who co-founded the Berlin-based AI startup Lesan, which develops translation tools for Ethiopian languages.

Lelapa AI and Lesan are just two of the startups developing speech recognition tools for African languages. In February, Lelapa AI received $2.5 million in seed funding. The next round of financing is planned for 2025. However, many African entrepreneurs say they face major hurdles: lack of funding, limited access to investors and difficulties in training AI to learn different African languages. “AI is the least funded among African startups,” says Abake Adenle, the founder of London-based startup Ajala, which offers voice automation for African languages.

AI startups working to develop products for African languages are often ignored by investors because the potential market is small, political support is lacking and internet infrastructure is poor, according to Hadgu. However, Hadgu says small African startups like Lesan, GhanaNLP and Lelapa AI play an important role: “Big tech companies don’t pay attention to our languages,” he says, “but we can’t wait for them.”

Offline training for African AI

Lelapa AI is trying to create a new paradigm for AI models in Africa, says Vukosi Marivate, a data scientist on the company’s AI team. Instead of relying solely on the internet to collect training data, as Western companies do, Lelapa AI works both online and offline with linguists and local communities to collect data, annotate it and identify use cases where the tool could be problematic.

Bonaventure Dossou, a researcher specializing in natural language processing (NLP) at Lelapa AI, says that working with linguists allows for the development of a context-specific and culturally relevant model. “Incorporating cultural sensitivity and linguistic perspectives makes the technical system better,” says Dossou. For example, the company’s AI team has developed algorithms for analyzing mood and tone that are tailored to specific languages.

Marivate and his colleagues at Lelapa AI envision a future where AI technologies work for and represent Africans. In 2019, Marivate and Abbott founded Masakhane, a grassroots initiative designed to promote natural language processing (NLP) research in African languages. Thousands of volunteers, programmers and researchers are now working together to develop NLP models for Africa.

It’s important that Vulavula and other AI tools are developed by Africans, for Africans, says Moiloa: “We are the guardians of our languages. We should be the developers of technologies that work for our languages.”

