
UAE launches largest Arabic AI model; here's what Jais 2 can do

While previous Arabic models scored around 62 per cent on evaluation benchmarks, Jais 2 delivers what researchers describe as 'state-of-the-art performance'

Published: Tue 9 Dec 2025, 6:26 PM

Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and its partners on Tuesday released Jais 2, a 70-billion-parameter language model trained on the largest Arabic-first dataset ever assembled — 600 billion Arabic tokens, a scale no other institution has attempted.

The release strengthens the UAE’s position in Arabic AI. While previous Arabic models scored around 62 per cent on evaluation benchmarks, Jais 2 delivers what researchers describe as "state-of-the-art performance across both Arabic and bilingual tasks."

“Arabic has long been underserved in AI development due to the lack of high-quality data,” Professor Preslav Nakov, Department Chair of Natural Language Processing at MBZUAI, told Khaleej Times. “Today marks a defining advancement — a model built with scale, cultural depth, and linguistic fidelity at its core.”

Arabic-first, not Arabic-adjusted

What makes Jais 2 distinct is its development philosophy. Many global AI models treat Arabic as a secondary language, often translating English datasets or adding thin Arabic layers on top of English-centric systems. Jais 2 was built from scratch around Arabic structure, dialects, and real usage.

“Models developed elsewhere tend to treat Arabic as a peripheral addition,” Nakov said. “Most remain heavily biased toward English, leaving dialects and culturally nuanced contexts poorly modeled.”

The dataset spans Modern Standard Arabic, 17 regional dialects (including Gulf, Emirati, Moroccan, Egyptian, and Iraqi), and Arabizi, the Latin-script Arabic widely used online. Jais 2 also incorporates 1.6 trillion English and code tokens, giving it strong bilingual capabilities in a region where code-switching is part of everyday conversation.

“Code-switching is natural across the Arab world,” Nakov said. “Jais 2 treats it not as an anomaly, but as a normal linguistic pattern.”

Linguistic depth

Jais 2 was trained on more than 427,000 Arabic poems with detailed metadata and semantic annotations — giving it an understanding of classical and contemporary verse that global models lack.

“Arabic poetry is a clear domain where Jais 2 excels,” Nakov said. “Western models simply do not have enough exposure to interpret symbolism or cultural references that Jais handles naturally.”

This cultural grounding is reinforced through a custom-built Arabic vocabulary and safety frameworks designed around regional communication norms rather than Western assumptions.

Built in partnership, built for the region

Developed by Inception (a G42 company), Cerebras Systems, and MBZUAI’s Institute of Foundation Models, Jais 2 was trained and is served entirely on Cerebras hardware — a setup the partners say required a fraction of the computing power used by similar global models.

Beyond its technical achievement, the model is a step toward sovereign Arabic AI. For the UAE and the wider Arab world, that means the language, its dialects, and its cultural context are represented accurately in a rapidly digitising world.

“For the region, building sovereign Arabic models ensures representation, cultural fit, and reliability,” Nakov said. “It allows the Arab world to lead rather than follow.”

Open-weight model

Jais 2 is being released as a fully open-weight 70B model — a decision Nakov describes as essential for accelerating local innovation.

“Releasing an open-weight model allows researchers, startups, and governments to build Arabic solutions on top of a state-of-the-art foundation,” he told Khaleej Times.

The release enables fine-tuning for applications across finance, healthcare, education, customer service, media, and government services. Jais 2 is available now through Inception's Hugging Face page and at jaischat.ai.