Detecting hate speech in memes, creating an AI-human interactive dubbing platform, and doing research on deep learning-based text-to-speech systems in 13 Indian languages are some of the breakthrough projects of a graduate of the Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI).
Ahead of commencement exercises on Sunday, Gokul Karthik Kumar, a master’s degree student of computer vision, spoke to Khaleej Times about his journey at the pioneering institution.
The university, he said, gave him the freedom to explore several areas of artificial intelligence (AI), including his passion for natural language processing.
“While my two-year major is in computer vision, my supervisor has supported me in pursuing projects in other domains like natural language processing and speech processing, which has been immensely fulfilling and helped me identify the areas that I’m currently passionate about.”
A native of the Indian state of Tamil Nadu, Kumar underlined that his experience at MBZUAI has been “transformative” and has prepared him well for a future career in AI research and development.
“I have learned from some of the most knowledgeable professors in artificial intelligence. I have also enhanced my research skills from problem identification to research proposal to presentation.”
Kumar has an extensive background in machine learning across text, image, speech, and time-series data, having worked with top technology organisations such as Microsoft Research India, TCS Research, MBZUAI, and IIT Madras. He has won numerous hackathons, including the IEEE SLT 2022 international hackathon in Qatar and eight national-level hackathons in the UAE and India, and he holds a US patent.
He has co-authored papers published at major conferences, including the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), the Association for Computational Linguistics (ACL 2022) Workshop, the Empirical Methods in Natural Language Processing (EMNLP 2022) Workshop, and the International Joint Conference on Neural Networks (IJCNN).
Following the commencement ceremony, he will travel to Greece to attend ICASSP 2023, where he will present his research paper titled “Towards Building Text-To-Speech Systems for the Next Billion Users”.
“This project was initiated during my summer internship at Microsoft Research India, where I collaborated with my co-author, Praveen from IIT Madras. Our work involved a systematic evaluation of design choices for text-to-speech systems, leading to the release of state-of-the-art models for 13 Indian languages. Most open-source text-to-speech is available in English but extending it to local languages could reach masses, especially people who don’t know how to read.”
Kumar’s thesis research explores efficient representation methods for multilingual and multimodal data. His work addresses crucial tasks such as question answering, hateful meme classification, text-to-speech, and text-image retrieval. In the current era of social media, where online bullying has become increasingly prevalent, Kumar’s research is particularly relevant.
Hateful memes, which encompass hate speech targeted at individuals on social media, pose a concerning challenge. While various techniques exist for classifying such memes, Kumar has devised a straightforward approach that effectively combines image and textual features to predict the probability of hatefulness. This could empower social media platforms to make informed decisions about what should and should not be published.
He has also been involved in the joint development of the award-winning Autodub, a human-in-the-loop AI dubbing platform that aims to eliminate language barriers in educational video content and extend remote, online learning to all corners of the globe. Autodub integrates transcription, translation, voiceover, and background audio separation to create accurate translations and promote accessibility for all. Because many educational videos are primarily in English, non-native English speakers can find them hard to follow. Autodub offers a viable solution to this challenge.
“What truly excites me about my future career is the opportunity to make a tangible impact. If I can develop something that enhances processes and, consequently, positively influences a significant number of individuals, it would be truly remarkable. Only a few fields or technologies have the power to create something that instantly captures widespread attention and sparks conversations across various communities.”
An ardent follower of IPL cricket team Chennai Super Kings, he recalled his favourite UAE memory: seeing his team clinch the season title in Dubai in 2021, which coincided with the beginning of his master’s journey. Adding to the excitement, his team won again just days before his commencement ceremony.
His next challenge is to help develop large language models for the UAE – a country that has identified leveraging AI for good as a key priority.
“I will be joining G42’s Inception Institute of Artificial Intelligence (IIAI) as an applied scientist, where my focus will be working collaboratively in a team to develop large language models tailored to UAE-focused applications.”
Kumar is the first in his family to earn a master’s degree. He also holds a bachelor’s degree in information technology from Anna University, Chennai. He is one of 59 computer vision, machine learning, and natural language processing students graduating as part of the Class of 2023.