Conference | Adobe Media and Data Science Research (MDSR) Laboratory

HyHTM: Hyperbolic Geometry-based Hierarchical Topic Model

Hierarchical Topic Models (HTMs) are useful for discovering topic hierarchies in a collection of documents. However, traditional HTMs …

Simra Shahid, Tanay Anand, Nikitha SR, Sumit Bhatia, Balaji Krishnamurthy, Nikaash Puri

LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training

Learning object segmentation in image and video datasets without human supervision is a challenging problem. Humans easily identify …

Silky Singh, Shripad Deshmukh, Mausoom Sarkar, Balaji Krishnamurthy

MobiVSR - Mobile Application for Visual Speech Recognition

Visual speech recognition (VSR) is the task of recognizing spoken language from video input only, without any audio. VSR has many …

Nilay Srivastava, Astitwa Saxena, Yaman Kumar Singla, Debanjan Mahata, Rajiv Ratn Shah, Amanda Stent, Roger Zimmermann

One-Shot Doc Snippet Detection: Powering Search in Document Beyond Text

Active consumption of digital documents has yielded scope for research in various applications, including search. Traditionally, …

Abhinav Java, Shripad Deshmukh, Milan Aggarwal, Surgan Jandial, Mausoom Sarkar, Balaji Krishnamurthy

Persuasion Strategies in Advertisements

Modeling what makes an advertisement persuasive, i.e., eliciting the desired response from the consumer, is critical to the study of …

Yaman Kumar Singla, Rajat Jha, Arunim Gupta, Milan Aggarwal, Aditya Garg, Ayush Bhardwaj, Tushar, Balaji Krishnamurthy, Rajiv Ratn Shah, Changyou Chen

VGFlow: Visibility guided flow network for human reposing

The task of human reposing involves generating a realistic image of a person standing in an arbitrary conceivable pose. There are …

Rishabh Jain, Krishna Kumar Singh, Mayur Hemani, Jingwan Lu, Mausoom Sarkar, Duygu Ceylan, Balaji Krishnamurthy

What do audio transformer models hear? Probing Acoustic Representations for Language Delivery and its Structure

Transformer models across multiple domains such as natural language processing and speech form an unavoidable part of the tech stack of …

Yaman Kumar Singla, Jui Shah, Rajiv Ratn Shah, Changyou Chen

Approximate Information State for Approximate Planning and Reinforcement Learning in Partially Observed Systems

We propose a theoretical framework for approximate planning and learning in partially observed systems. Our framework is based on the …

Jayakumar Subramanian, Amit Sinha, Raihan Seraj, Aditya Mahajan

Automated Speech Scoring System Under The Lens: Evaluating and interpreting the linguistic cues for language proficiency

English proficiency assessments have become a necessary metric for filtering and selecting prospective candidates for both academia and …

Manraj Grover, Pakhi Bamdev, Yaman Kumar Singla, Payman Vafaee, Mika Hama, Rajiv Ratn Shah

CoSe-Co: Text Conditioned Generative CommonSense Contextualizer

Pre-trained Language Models (PTLMs) have been shown to perform well on natural language tasks. Many prior works have leveraged …

Rachit Bansal, Jivat Meet Kaur, Milan Aggarwal, Sumit Bhatia, Balaji Krishnamurthy