Recently, the emergence of pre-trained models (PTMs) has yielded immense success in natural language processing, speech recognition, and computer vision, significantly advancing the state of the art in these areas. In the information retrieval (IR) community, pre-trained models have also attracted much attention, and researchers have applied existing pre-training methods or even developed novel ones for different IR applications. Despite the significant progress PTMs have made in these areas, many challenges remain when applying these models to real-world IR scenarios. The PLM4IR workshop aims to provide a venue that brings together practitioners and researchers from academia and industry (i) to discuss the principles, limitations, and applications of pre-trained language models in IR, and (ii) to foster research on innovative algorithms, novel techniques, and new applications of PTMs in information retrieval.
Jie Tang is a Professor and the Associate Chair of the Department of Computer Science at Tsinghua University. He is a Fellow of the IEEE. His interests include artificial intelligence, data mining, social networks, and machine learning. He served as General Co-Chair of WWW'23; PC Co-Chair of WWW'21, CIKM'16, and WSDM'15; and Editor-in-Chief of IEEE Transactions on Big Data and of AI Open. He leads AMiner.org, an AI-enabled research network analysis system that has attracted more than 20 million users from 220 countries and regions around the world. He has been honored with the SIGKDD Test-of-Time Award, the UK Royal Society-Newton Advanced Fellowship Award, the NSFC Award for Distinguished Young Scholars, and the KDD'18 Service Award.
WuDao: Pretrain the World
Large-scale models pretrained on web texts have substantially advanced the state of the art in various AI tasks, such as natural language understanding, text generation, image processing, and multimodal modeling. Downstream task performance has also increased steadily over the past few years. In this talk, I will first go through three families of pretrained models: autoregressive models (e.g., GPT), autoencoding models (e.g., BERT), and encoder-decoder models. Then, I will introduce China's first homegrown super-scale intelligent model system, whose goal is to build an ultra-large-scale, cognition-oriented pretraining model that addresses essential problems in general artificial intelligence from a cognitive perspective. In particular, as an example, I will elaborate on a novel pretraining framework, GLM (General Language Model), designed to address this challenge. GLM has three major benefits: (1) it performs well on classification, unconditional generation, and conditional generation tasks with a single pretrained model; (2) it outperforms BERT-like models on classification due to improved pretrain-finetune consistency; and (3) it naturally handles variable-length blank filling, which is crucial for many downstream tasks. Empirically, GLM substantially outperforms BERT on the SuperGLUE natural language understanding benchmark with the same amount of pre-training data.
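The blank-filling idea described above can be illustrated with a minimal sketch (not the official GLM implementation): selected spans are replaced by a mask token in "Part A" of the input, and the model then autoregressively reconstructs each removed span in "Part B". The function name and the [MASK]/[START]/[END] sentinel-token names below are illustrative assumptions, not GLM's actual identifiers.

```python
# Minimal sketch of GLM-style blank infilling (illustrative only).
# Spans are non-overlapping, sorted (start, end) index pairs over the tokens.

def glm_blank_infill(tokens, spans):
    """Return (part_a, part_b): the corrupted context and generation targets."""
    part_a, part_b, prev = [], [], 0
    for start, end in spans:
        # Context up to the span, then a single mask for the whole span,
        # so the blank length is variable and unknown to the model.
        part_a.extend(tokens[prev:start])
        part_a.append("[MASK]")
        # Each removed span becomes an autoregressive generation target,
        # delimited by sentinel tokens (names here are hypothetical).
        part_b.extend(["[START]"] + tokens[start:end] + ["[END]"])
        prev = end
    part_a.extend(tokens[prev:])
    return part_a, part_b

tokens = "the quick brown fox jumps over the lazy dog".split()
part_a, part_b = glm_blank_infill(tokens, [(1, 3), (6, 8)])
print(part_a)  # ['the', '[MASK]', 'fox', 'jumps', 'over', '[MASK]', 'dog']
print(part_b)
```

Because the mask stands for an entire span rather than a single token, one pretrained model can recover short blanks (classification-style cloze prompts) as well as long ones (conditional or unconditional generation), which is the source of GLM's flexibility across task types.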
Hamed Zamani is an Assistant Professor in the Manning College of Information and Computer Sciences at the University of Massachusetts Amherst (UMass), where he also serves as the Associate Director of the Center for Intelligent Information Retrieval (CIIR), one of the top academic research labs in Information Retrieval worldwide. Prior to UMass, he was a Researcher at Microsoft. His research focuses on designing and evaluating statistical and machine learning models with applications to (interactive) information access systems, including search engines, recommender systems, and question answering. He is mostly known for his recent work in the areas of neural information retrieval and conversational information seeking. His work has led to over 70 refereed publications in the field, including a few Best Paper and Honorable Mention awards, in addition to a number of open-source research tools.
PLM4IR will be a forum for discussing the challenges of applying pre-trained language models (PLMs) in the information retrieval (IR) field, as well as the theory behind these models and their applications. The aims of this workshop are multi-fold: 1) establishing a bridge for communication between academic and industrial researchers, 2) providing an opportunity for researchers to present new work and early results, and 3) discussing the main challenges in designing and applying PLMs in practice.
Specifically, although many existing PLMs (e.g., BERT and ERNIE) have achieved great success in IR tasks, these models do not consider the IR cues that might benefit downstream IR tasks. A core belief behind task-dependent pre-training is that a pre-training objective that more closely resembles the downstream task can lead to better fine-tuning performance with higher efficiency. Therefore, this workshop seeks to pursue two main themes:
Specific issues that emerge here include, but are not limited to:
These are only a few of the many research questions involved in applying PLMs in practical applications and understanding their theoretical advantages. WSDM is uniquely positioned to host a workshop that would motivate interesting discussion and future work on PLMs for IR, in both practical use and theoretical research. All papers will be peer reviewed in a single-blind fashion. We welcome many kinds of papers, including, but not limited to:
Authors should clearly indicate in their abstracts which kind of submission their paper belongs to, to help reviewers better understand their contributions.
Submissions must be in PDF, up to 10 pages long (plus unlimited pages for references) — shorter papers are welcome — and formatted according to the standard double-column ACM Proceedings Style.
The accepted papers will be published on the workshop’s website and will not be considered archival for resubmission purposes.
Authors whose papers are accepted to the workshop will have the opportunity to participate in a spotlight and poster session, and a subset may also be chosen for oral presentation. To submit a paper, please proceed to the submission website.
Please send enquiries to email@example.com
Head of Data Science
New York City, USA
Institute of Computing Technology, CAS
Senior Algorithm Engineer
Senior Algorithm Engineer
Institute of Computing Technology, CAS
Chen Qu (University of Massachusetts Amherst)
Marc Najork (Google)
Daniel Hill (Amazon)
Daniel Cohen (University of Massachusetts Amherst)
Xuanhui Wang (Google)
Liu Yang (University of Massachusetts Amherst)
Shangsong Liang (Sun Yat-sen University)
Choo Hui (Amazon)
Jun Xu (Renmin University of China)
Keping Bi (University of Massachusetts Amherst)
Wayne Xin Zhao (Renmin University of China)