Tajikistan has taken a major step forward in digital innovation with the launch of SoroLLM, its first national artificial intelligence language model. The model is designed to understand and process Tajik in all its diversity, recognizing not only the standard language but also its regional dialects.

Developed by the research team at zehnlab.ai, SoroLLM stands out as the first neural network created specifically for the Tajik language.  Unlike global AI models such as GPT or LLaMA, which offer limited or no support for Tajik, SoroLLM has been built from the ground up to accommodate the language’s distinctive syntax, rare vocabulary, and diverse pronunciation styles.

The project was officially presented to President Emomali Rahmon on June 25 during the opening ceremony of Tajikistan’s first AI Computing Resource Center. The event marked a milestone in the country’s digital transformation and underscored the importance of local technology solutions.

“Our goal was not only to enable the model to recognize Tajik but also to capture the full spectrum of its dialects, from northern accents to the languages of the Pamirs,” the developers said.

Looking ahead, the team plans to add multimodal capabilities, allowing SoroLLM to process not just text but also audio and video inputs. As part of the ongoing development, the creators are inviting citizens to contribute information about their regional dialects through a short Google Form.

With SoroLLM, Tajikistan is setting a new precedent for linguistic inclusion in AI, putting its national language and cultural identity at the forefront of technological progress.