Adobe Media and Data Science Research (MDSR) Laboratory
On the Effect of Instruction Tuning Loss on Generalization
Instruction Tuning has emerged as a pivotal post-training paradigm that enables pre-trained language models to better follow user …
Anwoy Chatterjee, Kowndinya Renduchintala, Sumit Bhatia, Tanmoy Chakraborty
POSIX: Prompt Sensitivity Index for Large Language Models
Despite their remarkable capabilities, Large Language Models (LLMs) are found to be surprisingly sensitive to minor variations in …
Anwoy Chatterjee, Kowndinya Renduchintala, Sumit Bhatia, Tanmoy Chakraborty
SMART: Submodular Data Mixture Strategy for Instruction Tuning
Instruction Tuning involves finetuning a language model on a collection of instruction-formatted datasets in order to enhance the …
Kowndinya Renduchintala, Sumit Bhatia, Ganesh Ramakrishnan
Align via Actions: Learning Behavior Aligns LLMs with Human Opinions in Zero-Shot
Large language models (LLMs) have become ubiquitous in various applications, but aligning them with societal expectations remains …
Aanisha Bhattacharyya, Susmit Aggarwal, Yaman Kumar Singla, Tarun Menta, Nikitha SR, Balaji Krishnamurthy
All should be equal in the eyes of LMs: Counterfactually Aware Fair Text Generation
Fairness in Language Models (LMs) remains a longstanding challenge, given the inherent biases in training data that can be perpetuated …
Pragyan Banerjee, Abhinav Java, Surgan Jandial, Simra Shahid, Shaz Furniturewala, Balaji Krishnamurthy, Sumit Bhatia
INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Language Models
A salient characteristic of pre-trained language models (PTLMs) is a remarkable improvement in their generalization capability and …
Kowndinya Renduchintala, Krishnateja Killamsetty, Sumit Bhatia, Milan Aggarwal, Ganesh Ramakrishnan, Rishabh Iyer, Balaji Krishnamurthy
CoSe-Co: Text Conditioned Generative CommonSense Contextualizer
Pre-trained Language Models (PTLMs) have been shown to perform well on natural language tasks. Many prior works have leveraged …
Rachit Bansal, Jivat Meet Kaur, Milan Aggarwal, Sumit Bhatia, Balaji Krishnamurthy
LM-CORE: Language Models with Contextually Relevant External Knowledge
Large transformer-based pre-trained language models have achieved impressive performance on a variety of knowledge-intensive tasks and …
Rachit Bansal, Jivat Meet Kaur, Milan Aggarwal, Sumit Bhatia, Balaji Krishnamurthy