
Africa’s first multilingual small language model (SLM), InkubaLM, has achieved a major breakthrough—slashing its size by 75% while improving performance. The milestone, driven by a wave of African-led innovation, marks a significant step toward building efficient, accessible AI tools for low-resource settings across the continent.

The advancement was the result of the Buzuzu-Mavi Challenge, a global machine learning competition hosted by South Africa-based Lelapa AI in collaboration with Zindi. The challenge invited AI experts worldwide to compress InkubaLM without sacrificing its ability to operate across multiple African languages.

“It is a joy and a privilege for us at Zindi to partner with Lelapa AI on the Buzuzu-Mavi Challenge. Seeing the impact that our incredible community of AI builders can have on a truly African problem is inspiring and rewarding in its own right, but even better, these solutions showcase what African innovators can do in the language model space,” said Celina Lee, co-founder and CEO, Zindi. “In a world where the state of the art requires ever larger language models, we’re proud to show the world that more can be done with less.”

InkubaLM-0.4B is a compact language model trained specifically on five widely spoken African languages – IsiZulu, Yoruba, Hausa, Swahili, and IsiXhosa – which together have approximately 364 million speakers. Two datasets were used: Inkuba-Mono and Inkuba-Instruct.

Built from scratch, InkubaLM-0.4B was trained on a total of 2.4 billion tokens—1.9 billion of which come from the African languages. With 0.4 billion parameters and a vocabulary size of 61,788, the model adopts an architecture similar to MobileLLM. 

Despite its modest size, the researchers say InkubaLM-0.4B is among the smallest publicly available models of its kind, trained on significantly less data than comparable alternatives.

“This challenge isn’t simply about technical progress, it reflects our deeper mission at Lelapa AI: to build AI that is inclusive, accessible, and grounded in African realities. The Buzuzu-Mavi Challenge affirms what we’ve always believed – when AI is designed with Africa in mind, it becomes both technically excellent and deeply transformative,” said Pelonomi Moiloa, CEO, Lelapa AI. “And when African talent is trusted with meaningful challenges, the results are not just outstanding, they’re a glimpse into the future we’re building for and from the continent.”

Why Smaller Models Matter

On a continent where internet access averages just 33% and 70% of people use entry-level smartphones, lightweight AI isn’t a luxury; it’s a lifeline.

Smaller models, like InkubaLM, can run on affordable devices, function without constant connectivity, and power real-world solutions in translation, education, agriculture, and customer service.

The model can be applied to a range of tasks, including text generation and downstream applications, using zero-shot or few-shot learning. It can also be fine-tuned with instruction datasets for more complex tasks, and is compatible with CPU, GPU, and multi-GPU setups, making it accessible even on standard laptops.
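Few-shot learning here means prepending a handful of worked examples to the input so the model can infer the task pattern at inference time, without any fine-tuning. A minimal sketch of how such a prompt is assembled (the function name, prompt format, and English–Swahili example pairs are illustrative choices, not taken from the InkubaLM documentation):

```python
def build_few_shot_prompt(examples, query, instruction):
    """Assemble a few-shot prompt: instruction, worked examples, then the query."""
    lines = [instruction, ""]
    for source, target in examples:
        lines.append(f"English: {source}")
        lines.append(f"Swahili: {target}")
        lines.append("")
    lines.append(f"English: {query}")
    lines.append("Swahili:")  # the model continues generating from here
    return "\n".join(lines)

examples = [
    ("Good morning", "Habari za asubuhi"),
    ("Thank you", "Asante"),
]
prompt = build_few_shot_prompt(examples, "Welcome", "Translate English to Swahili.")
print(prompt)
```

The resulting string is passed to the model as a single input; in the zero-shot case the examples list is simply empty and only the instruction and query remain.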

In addition to the model, the Inkuba-Mono and Inkuba-Instruct datasets are now publicly available. These resources are designed to train or fine-tune models for a range of NLP tasks in the five languages, including machine translation, sentiment analysis, news classification, and part-of-speech tagging. 

Developers say the Inkuba project offers practical tools to address the underperformance of conventional large language models in African contexts.

The Winners

The competition drew more than 490 participants from 61 countries, with all of the top winners hailing from Africa – a strong endorsement of the continent’s AI innovation potential.

🥇 Yvan Carré (Cameroon): Compressed InkubaLM using adapter heads (add-ons that specialise in specific tasks), quantisation (shrinking the model’s memory needs), and knowledge distillation (training a smaller model to mimic a larger one), making it leaner without sacrificing capability.

🥈 Stefan Strydom (South Africa): Cut down the model to just 40M parameters by trimming vocabulary (removing infrequent words), reducing layers (streamlining the structure), and sharing embeddings (reusing components to save space).

🥉 Team AI_Buzz – Abdourahamane Ide Salifou, Mubarak Muhammad, and Victor Olufemi (Nigeria & Niger): Built a 177M-parameter student model by blending datasets (combining different sources for broader learning) and applying distillation, achieving both size reduction and solid performance.
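The knowledge distillation used by the first- and third-placed entries can be sketched in a few lines. This is a generic illustration of the technique (a temperature-softened soft-target loss), not the winners’ actual code; the logits and temperature below are invented for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature: a higher T flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.

    In practice this soft-target term is mixed with the ordinary
    cross-entropy loss on ground-truth labels; only the distillation
    part is shown here.
    """
    p = softmax(teacher_logits, temperature)  # teacher "soft labels"
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# A student whose logits track the teacher's incurs a lower loss
# than one whose logits diverge:
teacher = [4.0, 1.0, -2.0]
assert distillation_loss([3.9, 1.1, -2.1], teacher) < distillation_loss([-2.0, 1.0, 4.0], teacher)
```

Minimising this loss over many training examples is what "training a smaller model to mimic a larger one" means in practice: the student learns to reproduce the teacher's full output distribution, not just its top answer.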

What’s Next?

The release of InkubaLM and its accompanying datasets is expected to pave the way for more advanced and inclusive language technologies across Africa. Designed as a foundation for future development, InkubaLM offers a lightweight, adaptable model that researchers and developers can further train to expand its capabilities across a range of tasks in the five African languages.

“The most promising submissions will inform future InkubaLM releases. But the journey doesn’t end here. InkubaLM remains open-source and available for innovators everywhere,” Lelapa AI and Zindi said in a joint press statement. “Anyone can explore, improve, and make InkubaLM even smaller, leaner, and more powerful for African contexts.”

Alongside the model, the Inkuba datasets are expected to improve the performance of other existing language models. As large language models continue to fall short in handling African languages, the Inkuba project hopes to equip NLP practitioners with the tools needed to build more accurate, culturally relevant applications—laying the groundwork for broader AI adoption across the continent.
