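import logging
import multiprocessing
import pickle

from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence
from tqdm import tqdm


def SGNS_process(self):
    # Method of the surrounding class; self.word_vector presumably holds (word1, word2) pairs.
    print("Calculating the sim_sgns...")
    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(levelname)s - %(message)s')
    self.sim_sgns = []
    # sg=1 selects Skip-Gram; hs=0 turns off hierarchical softmax (negative sampling is used).
    # Note: gensim 4.0+ renames `size` to `vector_size`.
    vec_sgns = Word2Vec(LineSentence('text7.txt'), size=100, window=2, sg=1, hs=0,
                        min_count=1, workers=multiprocessing.cpu_count())
    tbar = tqdm(total=len(self.word_vector))
    for word in self.word_vector:
        try:
            self.sim_sgns.append(vec_sgns.wv.similarity(word[0], word[1]))
        except KeyError:
            # A word missing from the vocabulary gets similarity 0.
            self.sim_sgns.append(0)
        tbar.update(1)
    tbar.close()
    # print(self.sim_sgns)
    with open('sim_sgns', 'wb') as f:
        pickle.dump(self.sim_sgns, f)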

Date: 2023-12-15 08:04:27

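This code is a Python method named SGNS_process. It trains a Word2Vec model on text7.txt and computes the similarity of each word pair in self.word_vector. The sg parameter selects the training algorithm: sg=1 means Skip-Gram, while sg=0 (the default) means CBOW. Skip-Gram is a neural-network-based method for learning word vectors. It uses a center word to predict the context words around it, and the word vectors are learned in the process. CBOW does the reverse and predicts the center word from its context words. Skip-Gram is usually slower to train than CBOW, but it tends to produce better vectors for infrequent words. Combined with hs=0, which turns off hierarchical softmax so that negative sampling is used instead, this gives the Skip-Gram with Negative Sampling (SGNS) setup that the function name refers to. In short, sg=1 means the model is trained with Skip-Gram.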
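To make the difference between the two algorithms concrete, here is a minimal sketch of the training examples each one builds from a sentence. It is not gensim's code: the function names `skipgram_pairs` and `cbow_examples` and the toy sentence are made up for illustration, and it uses a fixed window, whereas real word2vec implementations also shrink the window at random for each position.

```python
def skipgram_pairs(tokens, window=2):
    """Skip-Gram view: each center word predicts one neighbour at a time.

    Returns a list of (center, context) pairs.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """CBOW view: all neighbours together predict the center word.

    Returns a list of (context_words, center) examples.
    """
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


# Hypothetical toy sentence, only for illustration.
tokens = ["the", "cat", "sat", "down"]
print(skipgram_pairs(tokens, window=1))
print(cbow_examples(tokens, window=1))
```

Skip-Gram produces one training pair per (center, neighbour) combination, while CBOW produces one example per position. This is one reason Skip-Gram typically takes longer to train on the same corpus.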
