Dependency-Based Word Embeddings
Omer Levy∗ and Yoav Goldberg
Computer Science Department
Bar-Ilan University
Ramat-Gan, Israel
{omerlevy,yoav.goldberg}@gmail.com

∗ Supported by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 287923 (EXCITEMENT).
Abstract
While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts. In particular, we perform experiments with dependency-based contexts, and show that they produce markedly different embeddings. The dependency-based embeddings are less topical and exhibit more functional similarity than the original skip-gram embeddings.
1 Introduction
Word representation is central to natural language processing. The default approach of representing words as discrete and distinct symbols is insufficient for many tasks, and suffers from poor generalization. For example, the symbolic representations of the words “pizza” and “hamburger” are completely unrelated: even if we know that the word “pizza” is a good argument for the verb “eat”, we cannot infer that “hamburger” is also a good argument. We thus seek a representation that captures semantic and syntactic similarities between words. A very common paradigm for acquiring such representations is based on the distributional hypothesis of Harris (1954), which states that words in similar contexts have similar meanings.
Based on the distributional hypothesis, many methods of deriving word representations have been explored in the NLP community. On one end of the spectrum, words are grouped into clusters based on their contexts (Brown et al., 1992; Uszkoreit and Brants, 2008). On the other end, words are represented as very high-dimensional but sparse vectors in which each entry is a measure of the association between the word and a particular context (see (Turney and Pantel, 2010; Baroni and Lenci, 2010) for a comprehensive survey). In some works, the dimensionality of the sparse word-context vectors is reduced, using techniques such as SVD (Bullinaria and Levy, 2007) or LDA (Ritter et al., 2010; Séaghdha, 2010; Cohen et al., 2012). Most recently, it has been proposed to represent words as dense vectors that are derived by various training methods inspired by neural-network language modeling (Bengio et al., 2003; Collobert and Weston, 2008; Mnih and Hinton, 2008; Mikolov et al., 2011; Mikolov et al., 2013b). These representations, referred to as “neural embeddings” or “word embeddings”, have been shown to perform well across a variety of tasks (Turian et al., 2010; Collobert et al., 2011; Socher et al., 2011; Al-Rfou et al., 2013).
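To make the sparse vector-space end of this spectrum concrete, the following is a minimal sketch (not from the paper) of collecting word-context co-occurrence counts and re-weighting them with positive PMI, one common association measure in the surveys cited above; the toy corpus, window size, and variable names are illustrative assumptions only.

    from collections import Counter
    from math import log

    # Toy corpus; in practice this would be a large tokenized corpus.
    corpus = [["we", "eat", "pizza"], ["we", "eat", "hamburger"]]

    window = 2  # linear context window size (illustrative choice)
    pair_counts, word_counts, ctx_counts = Counter(), Counter(), Counter()

    for sent in corpus:
        for i, w in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i == j:
                    continue
                c = sent[j]
                pair_counts[(w, c)] += 1
                word_counts[w] += 1
                ctx_counts[c] += 1

    total = sum(pair_counts.values())

    def ppmi(w, c):
        """Positive pointwise mutual information of an observed word-context pair."""
        p_wc = pair_counts[(w, c)] / total
        p_w = word_counts[w] / total
        p_c = ctx_counts[c] / total
        return max(0.0, log(p_wc / (p_w * p_c)))

    # Each word is represented by a sparse vector over the contexts it occurs with.
    pizza_vec = {c: ppmi("pizza", c) for (w, c) in pair_counts if w == "pizza"}
    print(pizza_vec)

Dimensionality-reduction methods such as the SVD mentioned above would then be applied to the resulting sparse word-context matrix.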
Word embeddings are easy to work with because they enable efficient computation of word similarities through low-dimensional matrix operations. Among the state-of-the-art word-embedding methods is the skip-gram with negative sampling model (SKIPGRAM), introduced by Mikolov et al. (2013b) and implemented in the word2vec software (code.google.com/p/word2vec/). Not only does it produce useful word representations, but it is also very efficient to train, works in an online fashion, and scales well to huge corpora (billions of words) as well as very large word and context vocabularies.
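As a concrete illustration (not part of the original paper), a SKIPGRAM-style model can be trained with the gensim implementation of word2vec; the corpus, hyperparameter values, and query word below are placeholders, and the parameter names assume gensim 4.x.

    from gensim.models import Word2Vec

    # Placeholder corpus: an iterable of tokenized sentences.
    sentences = [
        ["we", "eat", "pizza", "with", "cheese"],
        ["we", "eat", "hamburger", "with", "fries"],
    ]

    # sg=1 selects the skip-gram architecture; negative=15 enables
    # negative sampling with 15 noise samples per observed pair.
    model = Word2Vec(
        sentences=sentences,
        vector_size=100,   # dimensionality of the dense embeddings
        window=5,          # linear context window (k tokens to each side)
        sg=1,
        negative=15,
        min_count=1,       # keep all words in this tiny toy corpus
        workers=1,
    )

    # Word similarity reduces to cosine similarity between dense vectors.
    print(model.wv.most_similar("pizza", topn=3))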
Previous work on neural word embeddings takes the contexts of a word to be its linear context – words that precede and follow the target word, typically in a window of k tokens to each side. However, other types of contexts can be explored too.
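For illustration (a sketch not taken from the paper), linear bag-of-words contexts with a window of k tokens might be enumerated as follows; the example sentence and the value of k are placeholders.

    def linear_contexts(tokens, k=2):
        """Yield (target word, context word) pairs from a window of k tokens."""
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - k), min(len(tokens), i + k + 1)
            for j in range(lo, hi):
                if j != i:
                    yield word, tokens[j]

    sentence = ["australian", "scientist", "discovers", "star", "with", "telescope"]
    for word, context in linear_contexts(sentence, k=2):
        print(word, "->", context)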
In this work, we generalize the SKIPGRAM model, and move from linear bag-of-words contexts to arbitrary word contexts. Specifically,