Universal Sentence Encoder
Daniel Cer^a, Yinfei Yang^a, Sheng-yi Kong^a, Nan Hua^a, Nicole Limtiaco^b, Rhomni St. John^a, Noah Constant^a, Mario Guajardo-Céspedes^a, Steve Yuan^c, Chris Tar^a, Yun-Hsuan Sung^a, Brian Strope^a, Ray Kurzweil^a

^a Google Research, Mountain View, CA
^b Google Research, New York, NY
^c Google, Cambridge, MA
Abstract
We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the encoding models allow for trade-offs between accuracy and compute resources. For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. Comparisons are made with baselines that use word level transfer learning via pretrained word embeddings as well as baselines that do not use any transfer learning. We find that transfer learning using sentence embeddings tends to outperform word level transfer. With transfer learning via sentence embeddings, we observe surprisingly good performance with minimal amounts of supervised training data for a transfer task. We obtain encouraging results on Word Embedding Association Tests (WEAT) targeted at detecting model bias. Our pre-trained sentence encoding models are made freely available for download and on TF Hub.
1 Introduction
Limited amounts of training data are available for many NLP tasks. This presents a challenge for data hungry deep learning methods. Given the high cost of annotating supervised training data, very large training sets are usually not available for most research or industry NLP tasks. Many models address the problem by implicitly performing limited transfer learning through the use of pre-trained word embeddings such as those produced by word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014). However, recent work has demonstrated strong transfer task performance using pre-trained sentence level embeddings (Conneau et al., 2017).

Figure 1: Sentence similarity scores using embeddings from the universal sentence encoder.
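The kind of similarity scores shown in Figure 1 can be reproduced, in outline, by embedding each sentence and comparing the resulting vectors pairwise. A minimal sketch follows; it assumes `embeddings` is a matrix of sentence vectors produced by the encoder, and it uses plain cosine similarity as the comparison, since the figure does not specify the exact score:

    import numpy as np

    def pairwise_similarity(embeddings: np.ndarray) -> np.ndarray:
        """Cosine similarity between every pair of row vectors."""
        # Normalize each sentence vector to unit length, then take dot products.
        unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        return unit @ unit.T

    # Stand-in for encoder output; the released models emit 512-dim vectors.
    embeddings = np.random.rand(3, 512)
    scores = pairwise_similarity(embeddings)  # 3x3 matrix of similarity scores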
In this paper, we present two models for producing sentence embeddings that demonstrate good transfer to a number of other NLP tasks. We include experiments with varying amounts of transfer task training data to illustrate the relationship between transfer task performance and training set size. We find that our sentence embeddings can be used to obtain surprisingly good task performance with remarkably little task specific training data. The sentence encoding models are made publicly available on TF Hub.
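As a minimal sketch of how the released encoder can be loaded and applied, using the TensorFlow 1.x style tensorflow_hub API of the paper's era (newer tensorflow_hub versions load modules with hub.load instead):

    import tensorflow as tf
    import tensorflow_hub as hub

    # Load the pre-trained encoder from TF Hub.
    embed = hub.Module(
        "https://tfhub.dev/google/universal-sentence-encoder/1")
    embeddings = embed(["The quick brown fox jumps over the lazy dog."])

    with tf.Session() as session:
        session.run([tf.global_variables_initializer(),
                     tf.tables_initializer()])
        vectors = session.run(embeddings)  # one 512-dim vector per sentence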
Engineering characteristics of models used for transfer learning are an important consideration. We discuss modeling trade-offs regarding memory requirements as well as compute time on CPU and GPU. Resource consumption comparisons are made for sentences of varying lengths.