Publications

Peer-reviewed research and technical contributions advancing language AI for low-resource languages.

2023

How good are Large Language Models on African Languages?

Jessica Ojo, Kelechi Ogueji, Pontus Stenetorp, David Ifeoluwa Adelani

arXiv

FonMTL: Towards Multitask Learning for the Fon Language

Bonaventure F. P. Dossou, Iffanice B. Houndayi, Pamely Zantou, Gilles Hacheme

EMNLP 2023

Pretrained Vision Models for Predicting High-Risk Breast Cancer Stage

Bonaventure F. P. Dossou, Yeno K. S. Gbenou, Miglanche Ghomsi Nono

ICLR 2023

AfriNames: Most ASR models "butcher" African Names

Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Chris C. Emezue, Amina Mardiyyah Rufai, Sahib Singh

Interspeech 2023

GFlowOut: Dropout with Generative Flow Networks

Dianbo Liu, Moksh Jain, Bonaventure F. P. Dossou, Qianli Shen, Salem Lahlou, Anirudh Goyal, Nikolay Malkin, Chris Chinenye Emezue, Dinghuai Zhang, Nadhir Hassen, Xu Ji, Kenji Kawaguchi, Yoshua Bengio

ICML 2023

AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages

Odunayo Ogundepo, T. Gwadabe, Clara Rivera, J. Clark, Sebastian Ruder, David Ifeoluwa Adelani, Bonaventure F. P. Dossou et al.

EMNLP 2023

AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR

Tobi Olatunji, Tejumade Afonja, Aditya Yadavalli, C. Emezue, Sahib Singh, Bonaventure F. P. Dossou et al.

EMNLP 2023

Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages

Colin Leong, Herumb Shandilya, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja et al.

ICLR 2023

MasakhaNEWS: News Topic Classification for African languages

David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, Jesujoba Oluwadara Alabi, Atnafu Lambebo Tonja, Bonaventure F. P. Dossou et al.

AACL 2023 (Best Paper Award)

MasakhaPOS: Part-of-Speech Tagging for Typologically Diverse African Languages

Cheikh M. Bamba Dione, David Ifeoluwa Adelani, Peter Nabende, Bonaventure F. P. Dossou et al.

ACL 2023

The Less the Merrier? Investigating Language Representation in Multilingual Models

Hellina Hailu Nigatu, Atnafu Lambebo Tonja, Jugal Kalita

EMNLP 2023

PuoBERTa: Training and evaluation of a curated language model for Setswana

Vukosi Marivate, Moseli Mots'Oehli, Valencia Wagner, Richard Lastrucci, Isheanesu Dzingirai

SACAIR 2023

MphayaNER: Named Entity Recognition for Tshivenda

Rendani Mbuvha, David I. Adelani, Tendani Mutavhatsindi, Tshimangadzo Rakhuhu, Aluwani Mauda, Tshifhiwa Joshua Maumela, Andisani Masindi, Seani Rananga, Vukosi Marivate, Tshilidzi Marwala

arXiv

Fine-Tuning Multilingual Pretrained African Language Models

Rozina Myoya, Fiskani Banda, Vukosi Marivate, Abiodun Modupe

ICLR 2023

Unsupervised Cross-lingual Word Embedding Representation for English-isiZulu

Derwin Ngomane, Vukosi Marivate, Jade Abbott, Rooweither Mabuya

RAIL 2023

Consultative engagement of stakeholders toward a roadmap for African language technologies

Kathleen Siminyu, Jade Abbott, Kola Tubosun, Angela Thandizwe Mthembu, Arshath Ramkilowan, Babatunde Oladimeji

Patterns (Cell Press)

Preparing the Vuk'uzenzele and ZA-gov-multilingual South African multilingual corpora

Richard Lastrucci, Isheanesu Dzingirai, Jenalea Rajab, Andani Madodonga, Matimba Shingange, Daniel Njini, Vukosi Marivate

arXiv