"GLM:通用语言模型的预训练和自回归填充技术"

The paper "GLM: General Language Model Pretraining with Autoregressive Blank Infilling" introduces a pretraining method that unifies autoencoding and autoregressive objectives. Its authors are Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, and Jie Tang, affiliated with Tsinghua University, the Beijing Academy of Artificial Intelligence (BAAI), MIT CSAIL, and the Shanghai Qi Zhi Institute.

The paper first reviews the main families of pretraining architectures, contrasting autoencoding models such as BERT with autoregressive models such as GPT. GLM combines the strengths of both through autoregressive blank infilling: contiguous spans of tokens are blanked out of the input, and the model is trained to reconstruct the missing spans autoregressively while attending bidirectionally to the uncorrupted context. The authors give a detailed description of the GLM architecture and of the components and procedures involved in the pretraining phase.

Experimental results show GLM outperforming existing pretraining models of comparable size on natural language understanding and generation benchmarks. The paper closes with future research directions and potential applications of GLM across natural language processing tasks. By unifying the autoencoding and autoregressive paradigms, GLM offers a single pretraining method that can serve a wide range of NLP applications.
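The core idea of autoregressive blank infilling can be illustrated with a short sketch. The function `build_glm_example`, the toy sentence, and the span choices below are hypothetical and not taken from the released GLM code; the sketch only shows how a corrupted "Part A" sequence, the autoregressively predicted "Part B" spans, and GLM-style 2D position ids could be assembled for one training example.

```python
import random

MASK, START, END = "[MASK]", "[S]", "[E]"

def build_glm_example(tokens, spans, seed=0):
    """Assemble one blank-infilling example in the style described by the GLM paper.

    Part A is the corrupted text (each span replaced by a single [MASK]);
    Part B is the shuffled set of masked spans, each predicted left-to-right.
    tokens : list of token strings
    spans  : list of (start, end) index pairs to mask (end exclusive)
    Returns the model input, per-position targets, and 2D position ids.
    """
    rng = random.Random(seed)

    # Part A: replace every sampled span with a single [MASK] token.
    part_a, span_texts, cursor = [], [], 0
    for start, end in sorted(spans):
        part_a.extend(tokens[cursor:start])
        part_a.append(MASK)
        span_texts.append((len(part_a) - 1, tokens[start:end]))  # remember mask slot
        cursor = end
    part_a.extend(tokens[cursor:])

    # Part B: shuffle the spans so the model learns span interdependencies,
    # then predict each span autoregressively: [S] w1 w2 ...  ->  w1 w2 ... [E]
    rng.shuffle(span_texts)
    input_ids, targets = list(part_a), [None] * len(part_a)
    pos_1 = list(range(len(part_a)))   # position in the corrupted text
    pos_2 = [0] * len(part_a)          # intra-span position (0 for Part A)
    for mask_slot, span in span_texts:
        shifted = span + [END]
        for i, tok in enumerate([START] + span):
            input_ids.append(tok)
            targets.append(shifted[i])  # next-token target inside the span
            pos_1.append(mask_slot)     # span tokens share the [MASK] position
            pos_2.append(i + 1)         # 1-based position within the span
    return input_ids, targets, (pos_1, pos_2)

# Example: mask "sat on" and "mat" in a toy sentence.
toks = "the cat sat on the mat".split()
inp, tgt, (p1, p2) = build_glm_example(toks, [(2, 4), (5, 6)])
print(inp)  # Part A with [MASK] slots, followed by [S]-prefixed spans
print(tgt)  # None for Part A positions; next span tokens plus [E] for Part B
```

In the model itself, Part A tokens attend bidirectionally to one another, while Part B tokens attend to Part A and to earlier Part B tokens only; this mixed attention pattern is what lets a single architecture cover both understanding-style and generation-style tasks.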