
In 2025, KurdAI is setting its sights on more transformative goals

Karim Rahimi

Updated: Jan 12


Over the past two years, despite significant challenges, KurdAI has successfully trained and fine-tuned state-of-the-art AI models tailored for the Kurdish language. These include:


  • Text embedding, generation, and translation models across Kurdish Sorani, Kurdish Kurmanji, and English.

  • Voice and speech models for Sorani and Kurmanji, with partial support for Kalhori, Hawrami, and other dialects, and more models planned.


These models are customized and continuously improved based on invaluable feedback from linguists and users. Yet it’s clear that AI tools are not evolving equitably across languages, and Kurdish is among those left behind. This imbalance in development, together with the lack of significant support from the Kurdish community itself, remains a challenge we must overcome.

We deeply appreciate and honor the tireless efforts of individuals and teams, past and present, who have worked to preserve and promote the Kurdish language. Whether through for-profit or non-profit initiatives, your contributions, and your decision to make databases and datasets publicly available, are the foundation of our progress. You are our true heroes!


Language is the essence of identity for every nation. At KurdAI, our passion for protecting our language, history, and culture is the driving force behind our efforts. This passion, coupled with the unwavering support of those who provide feedback, advice, or financial backing, keeps us moving forward. However, to prevent new forms of discrimination and assimilation, KurdAI and similar initiatives must push the boundaries of AI development in every possible aspect, and do so more forcefully. While we respect and value all languages and dialects, we stand firmly against the domination of one over another.


In 2024, KurdAI provided its services to hundreds of users, linguists, and several Kurdish institutions. These services include the Jire Kurdish chatbot; Wergêr, our AI-based translator across Kurdish Sorani, Kurdish Kurmanji, and English; and transliteration of Kurdish Sorani into the Latin alphabet. We apologize if we were unable to respond on time to all emails and messages, or if you were placed on a waiting list or ran into server error messages. We hope to provide better services in 2025, with more focus on research and on supporting a broader range of Kurdish linguists, schools, universities, and institutions.


In 2025, after long-term safety checks, KurdAI is proud to start releasing and open-sourcing the AI models it has developed. With a strong plan and unwavering motivation, we aim to train new models and fine-tune the current ones to expand their capabilities, covering more Kurdish dialects and deeper aspects of Kurdish culture and history.


By building on this solid foundation, we envision creating tools that empower research, development, and production while fostering a richer digital representation of the Kurdish language and heritage. However, the success of this ambitious vision relies on continued collaboration, feedback, and support from an active community and beyond. Together, we can push the boundaries of AI for the benefit of Kurdish culture and identity.

Additionally, we aim to showcase what is possible, provide models and data, and eliminate any excuses from major software and operating system companies that claim a lack of data or resources as a barrier to supporting the Kurdish language and its dialects. Even if those challenges hold some truth, we are determined to pave the way for greater inclusivity.


We hereby release our highly efficient Automatic Speech Recognition (ASR) model, Sorani version, and give initial download access to developers and anyone who sends a request on our Hugging Face page. More detailed instructions for regular users and linguists on how to use the model will be published soon. This Kurdish-Sorani-ckb-ASR model was built by retraining the state-of-the-art mms-1b base model (developed by Meta, with one billion parameters) on the Mozilla Common Voice dataset for Kurdish Sorani (thanks to all collaborators), using NVIDIA A100 GPUs.
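Until the detailed instructions are published, here is a minimal sketch of how a model like this could be loaded once access has been granted. It is not an official KurdAI recipe. It assumes the model is published in a format that works with the automatic speech recognition pipeline of the Hugging Face transformers library (mms-1b is a Wav2Vec2-based model, so this is plausible but unconfirmed), that a recent version of transformers is installed, and that ffmpeg is available for decoding audio files. The model name below is a placeholder, and the helper names are made up for illustration.

```python
from typing import Any, Dict, Optional

# Placeholder only: replace with the actual repository name shown on the model page.
MODEL_ID = "your-org/your-sorani-asr-model"


def build_pipeline_kwargs(model_id: str, token: Optional[str] = None) -> Dict[str, Any]:
    """Collect the arguments that will be passed to transformers.pipeline()."""
    kwargs: Dict[str, Any] = {
        "task": "automatic-speech-recognition",
        "model": model_id,
    }
    if token:
        # Repositories with access on request need a Hugging Face access token
        # (assumption: the installed transformers version accepts `token`).
        kwargs["token"] = token
    return kwargs


def transcribe(audio_path: str, model_id: str = MODEL_ID, token: Optional[str] = None) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(**build_pipeline_kwargs(model_id, token))
    # Given a file path, the pipeline decodes the audio with ffmpeg and
    # resamples it to the sampling rate the model expects.
    result = asr(audio_path)
    return result["text"]
```

A call would look like `transcribe("recording.wav", token="hf_...")`, where both the file name and the token are stand-ins for your own values.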


This is the link to the model on the Hugging Face account of KurdAI-Academy:


We are working on the Kurmanji version of this model and will release it as soon as it is ready and has passed safety testing. We have also designed, and aim to train, a new ASR model covering almost all dialects and sub-dialects of Kurdish Sorani and Kurmanji. If you are able to help, please reach out to us.


Let’s work together to ensure that Kurdish finds its rightful place in the digital age, and even to support other languages in a similar situation. Together, we can build tools that not only preserve our language but also empower future generations. Thank you for being part of this journey!


💡 Your feedback, support, and collaboration are invaluable. Let’s make 2025 a breakthrough year for Kurdish AI and make sure we do not fall behind!

Stay in touch via our email and social media:

And follow us on X.com / @KurdAI_Official




