Artificial intelligence (AI) has advanced rapidly worldwide, with large language models progressing from GPT-3.5 to more sophisticated successors like GPT-4. Africa’s AI landscape has progressed more slowly, though countries like Nigeria have initiated policies to foster AI development. Against this backdrop, Ijemma Onwuzulike has emerged as a pioneering figure with the creation of IgboSpeech, the first Igbo voice-to-text AI model.
Ijemma Onwuzulike, an American-born engineer with Nigerian heritage, set out to build IgboSpeech with a mission to normalize the casual use of the Igbo language, much as Spanish or French are used casually. Her goal was to move the language out of academic confines and into everyday communication. This passion for language led her to explore how language learning technologies work, where she discovered the extensive data and linguistic research behind platforms like Duolingo.
The Genesis of the Igbo API
In 2020, Onwuzulike developed the Igbo API, a foundational digital infrastructure aimed at supporting various Igbo language applications. The API functions as a comprehensive digital dictionary, offering data on Igbo words, including definitions, synonyms, antonyms, and usage examples. The creation of this API was fraught with challenges due to the inconsistencies in existing Igbo dictionaries. Through meticulous data curation, Onwuzulike and her team built an API now encompassing over 5,000 words, nearly 30,000 sentences, and hundreds of audio recordings and proverbs.
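To make the "digital dictionary" idea concrete, here is a minimal sketch of how one dictionary record and a word lookup might be modeled. It is an illustration only: the `WordEntry` class, its field names, and the `build_index` and `lookup` helpers are hypothetical and are not taken from the Igbo API's actual schema. The sketch assumes that headwords are Unicode-normalized before comparison, since Igbo spelling uses diacritics that can be encoded in more than one way.

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass
class WordEntry:
    """One dictionary record. Field names are hypothetical, not the Igbo API schema."""
    word: str
    definitions: list[str]
    synonyms: list[str] = field(default_factory=list)
    antonyms: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)


def normalize(word):
    """Use one Unicode form (NFC) and fold case so equivalent spellings share a key."""
    return unicodedata.normalize("NFC", word).casefold()


def build_index(entries):
    """Map each normalized headword to its entry for direct lookup."""
    return {normalize(entry.word): entry for entry in entries}


def lookup(index, word):
    """Return the entry for a word, or None if the word is not in the index."""
    return index.get(normalize(word))


# Placeholder record for illustration; a real entry would hold curated content.
sample = WordEntry(
    word="mmiri",
    definitions=["water"],
    examples=["<usage example sentence>"],
)
index = build_index([sample])
```

Calling `lookup(index, "Mmiri")` returns the sample entry regardless of capitalization, while a word that is not in the index returns `None`.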
The Igbo API serves as the backbone for language applications like Nkowa Okwu, an Igbo language learning platform founded by Onwuzulike. Her deep-seated interest in technology and languages was nurtured at Dartmouth College, where she studied Computer Science, Japanese, and literature, setting the stage for her innovative work in language technology.
Launching IgboSpeech
The demo website for IgboSpeech launched on July 1, 2024. This automatic speech recognition (ASR) model transcribes Igbo speech into text, which makes it useful for generating subtitles for Igbo movies and YouTube videos and for powering note-taking applications. Onwuzulike explains that focusing on voice-to-text, rather than text-to-voice or translation, offers immediate benefits for Igbo speakers and translators.
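The subtitle use case can be sketched in a few lines. The example below assumes, purely for illustration, that a transcription arrives as a list of `(start_seconds, end_seconds, text)` segments; IgboSpeech's real output format is not described here, and the function names are hypothetical. The code renders such segments in the widely used SubRip (SRT) subtitle layout.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render (start, end, text) tuples as numbered SRT cues separated by blank lines."""
    cues = []
    for number, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{number}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
    return "\n".join(cues)


# Hypothetical transcript segments; the text values are placeholders.
segments = [
    (0.0, 1.5, "<first transcribed line>"),
    (1.5, 3.0, "<second transcribed line>"),
]
srt_text = segments_to_srt(segments)
```

Writing `srt_text` to a `.srt` file would give a subtitle track that most video players can load alongside the video.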
IgboSpeech’s development depends on grant funding, and Onwuzulike’s team is seeking support from the Lacuna Fund, which supports projects in antimicrobial resistance and natural language processing. Securing such grants would enable the team to engage top-tier software engineers, audio recorders, and Igbo linguists, helping to sustain the project.
Positive Reception and Future Aspirations
The reception of IgboSpeech and the Igbo API has been overwhelmingly positive. Nigerian engineers and developers have shown great enthusiasm for contributing to open-source projects aimed at preserving and promoting African languages. Onwuzulike’s work not only demonstrates technological prowess but also inspires a broader movement to develop AI solutions tailored for Africa.
Looking ahead, Onwuzulike and her team plan to expand IgboSpeech’s capabilities to include text-to-voice and translation features. Their ultimate aim is to create a comprehensive suite of tools that support the Igbo language in various digital formats, making it accessible and functional for a wider audience.
Sources
- Techpoint Africa: “How this engineer built the first Igbo voice-to-text AI model”
- Dartmouth College: Academic Institution
- Lacuna Fund: Grant Funding