I am a Principal Applied Researcher in the Azure AI Platform group in Hyderabad, India, with my research focussing on multilingual and multimodal language technologies. I work with the Azure Speech team, and I previously worked with the Azure Machine Translation team for a long time. I am a founding member and co-lead of AI4Bharat, a research center based in IIT Madras that works to drive advances and build resources for Indian language NLP. I am honored to be currently serving as an area chair for ACL Rolling Review (ARR) in the multilinguality and low-resource/efficient NLP areas. I have also served as an adjunct faculty member in the Department of Computer Science, IIT Madras in the past.
My research areas are Natural Language Processing, Machine Learning, Information Extraction, and Retrieval.
My research interests include multilingual learning and LLMs, post-training of LLMs, reasoning and evaluation in LLMs, representation learning, NLP for related languages, machine translation, and transliteration. I am interested in building tools and resources for Indian language NLP.
Over the last decade, I have built/contributed to large-scale, broad-coverage resources like the Indic NLP Library, IndicTrans/Sata-Anuvaadak Translation systems, IndicLLMSuite, Airavata LLM, IIT Bombay Parallel Corpus, Samanantar Corpus, Indic NLP/NLG Suite, and Aksharantar/BrahmiNet transliteration corpora.
I completed my Ph.D. in 2018 at the Department of Computer Science and Engineering, IIT Bombay. I did my research under the guidance of Prof. Pushpak Bhattacharyya at the Center for Indian Language Technology. My doctoral research work explored various facets of machine translation and transliteration between related languages.
News
14 Dec 2025: Honoured to present the Prof. Pushpak Memorial Lecture at IndoML 2025 on 20th Dec 2025.
21 Nov 2025: Azure Model Router is now GA with lots of new features and support for many new model families. It can route between many popular models out of the box with significant cost savings and high quality - please give it a try. Happy to have been part of the journey. [More info...]
19 Nov 2025: Invited Talk on Language Technology in AI and NLP: Indian Languages in the Digital Age at Bharatiya Bhasha Sangam, English and Foreign Languages University, Hyderabad. [slides]
17 Nov 2025: Honoured to be invited to be part of the Textbook Development Team (TDT) for Grade 11 and 12 Textbooks in Foundations and Methods of Artificial Intelligence for NCERT. Looking forward to contributing to a basic AI syllabus that sparks the curiosity of young students in this exciting area of technology.
8 Nov 2025: Presented our tutorial on Data and Model Centric Approaches for Expansion of Large Language Models to New languages at EMNLP, Suzhou. [website] [slides] [video]
28 Oct 2025: 2 new pre-prints on reasoning - one on multilingual reasoning (The Reasoning Lingua Franca: A Double-Edged Sword for Multilingual AI) and another on a new challenging puzzle benchmark to evaluate reasoning models (RiddleBench: A New Generative Reasoning Benchmark for LLMs).
25 Oct 2025: Paper accepted to AACL 2025. Pralekha: Cross-lingual Document Alignment for Indic Languages [pre-print]
5 Oct 2025: Shocked to hear of the untimely demise of my Ph.D. advisor and a titan of Indian language NLP - Prof. Pushpak Bhattacharyya. My little tribute to him [HERE]
8 Sep 2025: Course Lectures on (a) Language Modeling and Seq-to-Seq Modeling, (b) The Transformer Model for Prof. Pushpak Bhattacharyya's course on Deep Learning for NLP (CS772) - 2025 at IIT Bombay.
3 Jul 2025: Talk on introduction to reasoning models at OdiaGen 2025. [slides]
24 Jun 2025: Tutorial on Building Multilingual NLP datasets at scale at IASNLP Summer School at IIIT Hyderabad. [slides]
22 May 2025: Hands-on tutorial from our team at Microsoft Build 2025 on training reasoning models at scale with open-source software, leveraging the power of Azure ML to make the process easy. [Demo at 27 min] [Jupyter Notebook]
16 May 2025: 3 papers accepted to ACL 2025. Congratulations to all my collaborators and students! Continuing with our efforts to improve Indian language NLP and understand multilingual models! [BhasaAnuvaad] [CIA: Cross-lingual LLM Evaluation] [RomanLens]
12 Apr 2025: Happy to be part of panel discussions at AI Days 2025, Hyderabad on Indian language NLP and LLMs.
Apr 2025: Honoured to be part of the Academic Council at IIIT-Hyderabad.
Apr 2025: Honoured to be part of the CLD Program Curriculum and Review Committee at IIIT-Hyderabad.
8 Feb 2025: Invited talk on An Introduction to Reasoning Models with DeepSeek R1 as part of CSE Department Day, IIT Hyderabad. The talk introduces reasoning models, particularly DeepSeek R1 and the open-source reasoning efforts initiated since R1's release. [slides]
12 Jan 2025: Lecture on Multilingual Language Modeling as part of the winter school on "Deep Learning for Vision and Language Modelling" at IIT Guwahati. The talk covers various aspects of multilingual learning from the beginning of the deep learning era to current multilingual LLMs. [slides]