Dubai bids to answer the data privacy riddle

Digital Dubai’s team is on a mission to pave the way for synthetic data, showcasing itself as an innovator among the Arabian Gulf states


Joydeep Sengupta



Photo by Shihab

Published: Thu 13 Oct 2022, 8:18 PM

Last updated: Thu 13 Oct 2022, 8:31 PM

Digital Dubai has released its pioneering work on synthetic data, which will be used to increase Dubai's access to cutting-edge technologies that promise to further accelerate the emirate’s digitisation drive.

“We want to use it to augment our existing data publishing and value creation model, so that data flows to where it is needed whilst data privacy is maintained,” said Younus Al Nasser, Chief Executive Officer (CEO), Dubai Data Establishment, Digital Dubai, whose team has been driving this pioneering project since the research was undertaken two years ago.

He explained that data privacy protection was often a leap of faith in the past, forcing a compromise between anonymity and security.

“An emerging alternative to existing anonymisation techniques, synthetic data replaces rather than modifies data, whilst maintaining its utility and relational integrity,” Al Nasser said.


He mentioned how this artificial intelligence (AI)-enabled exercise can create innovation-friendly data environments by:

  • eliminating the wait for real data, adding speed and flexibility to agile development
  • increasing collaborative analytics between organisations by facilitating data sharing
  • enabling fourth industrial revolution (4IR) technologies like machine learning, by increasing data volumes and reducing data labelling burdens

Al Nasser touched on adoption at scale, the biggest challenge facing an ambitious project like this, which is the first of its kind undertaken by any government entity in the Arabian Gulf.

“To kickstart our discovery phase, we want to keep it practical, purposeful, and focussed on use cases that fail to get over familiar data or governance hurdles,” he said.

The feasibility of the project is of paramount importance.

So, the dedicated team has produced a research report, built around synthetic data, with the United Kingdom (UK)-based organisation Faculty. The report makes a strong case for both Dubai Government’s leadership and data science practitioners. It offers an implementation framework, a set of canvasses covering areas such as data governance and infrastructure, to guide data practitioners on when and how to use synthetic data responsibly. Significantly, the work also includes a proof-of-concept synthetic data sandbox with Microsoft UAE and Avanade: an environment in which synthetic data use cases can be safely generated and tested.

What’s synthetic data?

Synthetic data is artificially manufactured data that realistically represents real-world data while remaining distinct from it. Algorithms generate it for use in datasets for testing or training purposes. It can mimic operational or production data and help train machine learning (ML) models or test mathematical models.

Synthetic data offers several important benefits: it minimises the constraints associated with the use of regulated or sensitive data, it can be used to customise data to match conditions that real data does not allow, and it can be used to generate large training datasets without requiring manual labelling of data.
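To make the idea concrete, here is a minimal sketch in Python of how an algorithm might replace real records with statistically similar artificial ones. It is an illustration only, not the method used by Digital Dubai or Faculty: the sample records, the column meanings (age and monthly spend) and the function names `fit_gaussian` and `sample_synthetic` are all assumptions made for this example, and it assumes the two columns are roughly normally distributed.

```python
import random
import statistics


def fit_gaussian(rows):
    """Estimate means, standard deviations and correlation of two numeric columns."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    rho = cov / (sx * sy)
    # Clamp to [-1, 1] to guard against floating-point drift.
    rho = max(-1.0, min(1.0, rho))
    return mx, my, sx, sy, rho


def sample_synthetic(params, n, seed=0):
    """Draw n artificial rows from a bivariate normal with the fitted statistics."""
    mx, my, sx, sy, rho = params
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        # Mix z1 into the second column so the correlation is preserved.
        y = my + sy * (rho * z1 + (1 - rho ** 2) ** 0.5 * z2)
        rows.append((x, y))
    return rows


# Hypothetical "real" records: (age, monthly_spend). Invented for illustration.
real = [(23, 450.0), (35, 600.0), (41, 720.0), (29, 480.0),
        (52, 850.0), (47, 810.0), (31, 640.0), (38, 590.0)]

params = fit_gaussian(real)
synthetic = sample_synthetic(params, 1000)
```

None of the generated rows is copied from the original records, yet averages and the relationship between the two columns stay close to those of the real data. A simple statistical fit like this does not by itself guarantee privacy, which is why practitioners typically test generated data in a controlled environment before sharing it.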
