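How to modify the knowledge representation learning model RotatE in torch so that it outputs the self-adversarial negative sampling loss without keeping the entity and relation embedding vectors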

Time: 2024-01-12 20:02:28 Views: 68

Fixing negative cross-entropy loss values in PyTorch

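During training the loss curve looked very strange: how can cross-entropy be negative? The cause turned out to be the negative logarithm in the loss. When the value fed to it is a probability between 0 and 1, the loss is non-negative, but when the network emits a value greater than 1, the result can become negative. The original fix was a single added line, out1 = F.softmax(out1, dim=1).

That fix applies to losses that expect probabilities or log-probabilities, such as a hand-written -log(p) or `F.nll_loss`. `F.cross_entropy` already applies `log_softmax` internally, so it takes raw logits and cannot return a negative value for class-index targets; putting `F.softmax()` in front of it normalizes twice and flattens the gradients. If the loss function expects log-probabilities, convert the outputs with `F.log_softmax()` first. The corrected code looks like this:

```python
# F.cross_entropy applies log_softmax internally, so pass raw logits
loss = F.cross_entropy(out1, target)

# Equivalent form for a loss that expects log-probabilities
log_probs = F.log_softmax(out1, dim=1)
loss = F.nll_loss(log_probs, target)
```

Additional note: what to do when loss=nan appears while training a model in PyTorch. When training AlexNet on the UCF-101 dataset with 100 epochs, the loss became nan after thirty-odd epochs. After some reading, reducing the learning rate solved the problem. A loss of `nan` usually means the loss value can no longer be computed, typically because of exploding gradients or numerical instability. Possible causes and remedies:

1. **Reduce the learning rate**: a learning rate that is too large makes the parameter updates too large, so they overshoot the optimum. A lower learning rate lets the parameters approach the minimum more smoothly.
2. **Change the network width**: when the weights of certain layers update abnormally, try widening those layers to increase the model's capacity.
3. **Adjust per-layer learning rates**: each layer can have its own learning rate. If the later layers are the problem, lower their learning rate so that they update more slowly.
4. **Preprocess the data**: normalize the inputs (subtract the mean, divide by the standard deviation) or use Batch Normalization (BN) to stabilize training.
5. **Gradient clipping**: limit the gradient norm to prevent exploding gradients.
6. **Check the input data**: make sure the inputs contain no dirty values such as `NaN`. Real business data can contain them, so clean the data during preprocessing.

In practice these measures are combined with choices about model structure, optimizer, and learning rate schedule.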
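Several of the remedies above (per-layer learning rates, gradient clipping, input checks) can be combined in one training step. The following is a minimal sketch, not a definitive recipe: the toy model, the layer sizes, the learning rates, and the `train_step` helper are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy classifier; the layer sizes are arbitrary (assumed for illustration).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

# Per-layer learning rates via parameter groups:
# the last layer gets a smaller rate so it updates more slowly.
optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "lr": 1e-2},
        {"params": model[2].parameters(), "lr": 1e-3},
    ],
    lr=1e-2,
)


def train_step(x, y, max_norm=1.0):
    """Hypothetical helper: one optimization step with basic NaN safeguards."""
    # Reject batches that already contain NaN or inf values.
    if not torch.isfinite(x).all():
        raise ValueError("input batch contains NaN or inf")
    optimizer.zero_grad()
    # F.cross_entropy applies log_softmax internally, so raw logits are passed.
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Rescale gradients so their total norm is at most max_norm;
    # the return value is the total norm measured before clipping.
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
    optimizer.step()
    return loss.item(), float(grad_norm)


x = torch.randn(8, 10)
y = torch.randint(0, 3, (8,))
loss_value, grad_norm = train_step(x, y)
print(loss_value, grad_norm)
```

Logging the pre-clipping gradient norm over time is a cheap way to see whether gradients are growing before the loss actually turns into `nan`.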

To modify the knowledge representation learning model RotatE in torch so that it outputs the self-adversarial negative sampling loss without keeping the entity and relation embedding vectors, you can follow these steps:

1. Define the model structure

First, define a new model based on RotatE. The entity and relation embeddings have to stay inside the model as trainable parameters, because the scores are computed from them. What changes is that `forward` returns only the triple scores and never the embedding tensors themselves. RotatE treats each entity as a complex vector and each relation as a rotation (a phase per dimension), and scores a triple as gamma minus the distance between the rotated head and the tail. In the code below, `pos_h`, `pos_t`, `pos_r` are index tensors of shape [B] and `neg_h`, `neg_t`, `neg_r` are index tensors of shape [B, K], with K negative samples per positive triple:

```python
import math

import torch
import torch.nn as nn


class RotatE_AutoNeg(nn.Module):
    def __init__(self, nentity, nrelation, hidden_dim, gamma):
        super().__init__()
        self.nentity = nentity
        self.nrelation = nrelation
        self.hidden_dim = hidden_dim  # must be even: half real, half imaginary
        self.gamma = gamma
        # Fixed (non-trainable) bound used for initialization and phase scaling
        self.embedding_range = nn.Parameter(
            torch.tensor([(gamma + 2.0) / (hidden_dim * 2)]),
            requires_grad=False)
        # Entities are complex vectors stored as [real | imaginary]
        self.entity_emb = nn.Embedding(nentity, hidden_dim)
        # Relations are phase vectors, one angle per complex dimension
        self.relation_emb = nn.Embedding(nrelation, hidden_dim // 2)
        bound = self.embedding_range.item()
        nn.init.uniform_(self.entity_emb.weight, -bound, bound)
        nn.init.uniform_(self.relation_emb.weight, -bound, bound)

    def _calc(self, h, t, r):
        # Split entity embeddings into real and imaginary parts
        re_head, im_head = torch.chunk(h, 2, dim=-1)
        re_tail, im_tail = torch.chunk(t, 2, dim=-1)
        # Map relation parameters to angles in [-pi, pi]; the rotation
        # cos(phase) + i*sin(phase) has modulus 1
        phase = r / (self.embedding_range.item() / math.pi)
        re_relation, im_relation = torch.cos(phase), torch.sin(phase)
        # Rotate the head (complex multiplication) and subtract the tail
        re_diff = re_head * re_relation - im_head * im_relation - re_tail
        im_diff = re_head * im_relation + im_head * re_relation - im_tail
        # Sum of complex moduli; the small constant keeps sqrt differentiable at 0
        dist = torch.sqrt(re_diff ** 2 + im_diff ** 2 + 1e-12).sum(dim=-1)
        return self.gamma - dist  # higher score = more plausible triple

    def forward(self, pos_h, pos_t, pos_r, neg_h, neg_t, neg_r):
        # Positive triple scores, shape [B]
        pos_score = self._calc(self.entity_emb(pos_h),
                               self.entity_emb(pos_t),
                               self.relation_emb(pos_r))
        # Negative triple scores, shape [B, K]
        neg_score = self._calc(self.entity_emb(neg_h),
                               self.entity_emb(neg_t),
                               self.relation_emb(neg_r))
        # Only scores leave the model; the embeddings are not returned
        return pos_score, neg_score
```

2. Define the loss function

Next, define the loss. The self-adversarial negative sampling loss weights each negative sample by a softmax over the negative scores, so that harder negatives (those the model currently scores highly) contribute more. The weights are detached and treated as constants during backpropagation. `nn.MarginRankingLoss` is a plain margin loss and does not implement this weighting, so the loss is written out directly. The `adversarial_temperature` argument and its default value are illustrative choices:

```python
import torch.nn.functional as F


def self_adversarial_loss(pos_score, neg_score, adversarial_temperature=1.0):
    # pos_score: [B], neg_score: [B, K]
    # Sampling weights over the K negatives; no gradient flows through them
    weights = F.softmax(neg_score * adversarial_temperature, dim=1).detach()
    # Positive triples should score high: -log sigmoid(gamma - d)
    pos_loss = -F.logsigmoid(pos_score).mean()
    # Negative triples should score low: -sum_i p_i * log sigmoid(d_i - gamma)
    neg_loss = -(weights * F.logsigmoid(-neg_score)).sum(dim=1).mean()
    return (pos_loss + neg_loss) / 2
```

Here pos_score and neg_score are the scores gamma - d returned by the model, so logsigmoid(pos_score) rewards a small distance for positive triples and logsigmoid(-neg_score) rewards a large distance for negative triples.

3. Train the model

Finally, train the model and update its parameters. An optimizer such as Adam can be used. The code below assumes that `model` is a `RotatE_AutoNeg` instance and that `dataloader` yields batches in the shapes described in step 1:

```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(100):
    for batch in dataloader:
        # Move all six index tensors to the training device
        pos_h, pos_t, pos_r, neg_h, neg_t, neg_r = (x.to(device) for x in batch)
        optimizer.zero_grad()
        pos_score, neg_score = model(pos_h, pos_t, pos_r, neg_h, neg_t, neg_r)
        loss = self_adversarial_loss(pos_score, neg_score)
        loss.backward()
        optimizer.step()
    # Loss of the last batch in this epoch
    print('Epoch %d, loss %.4f' % (epoch, loss.item()))
```

The Adam optimizer updates the model parameters; each epoch iterates over the dataset once, computes the loss, backpropagates, and takes an optimization step. This completes the modification of RotatE in torch so that it outputs the self-adversarial negative sampling loss without returning the entity and relation embedding vectors.
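If the model itself should hand back nothing but the loss, the scoring and the loss can be folded into `forward`. The following is a minimal sketch under stated assumptions, not the reference RotatE implementation: the class name `RotatELossOnly`, the default values of `gamma` and `alpha`, the [B, 3] layout of positive triples, and the choice to corrupt only tails are all hypothetical choices made for illustration.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class RotatELossOnly(nn.Module):
    """Hypothetical wrapper: forward() returns only a scalar loss."""

    def __init__(self, nentity, nrelation, hidden_dim, gamma=12.0, alpha=1.0):
        super().__init__()
        self.gamma = gamma
        self.alpha = alpha  # adversarial temperature (assumed default)
        self.bound = (gamma + 2.0) / (hidden_dim * 2)
        # hidden_dim must be even: first half real, second half imaginary
        self.entity_emb = nn.Embedding(nentity, hidden_dim)
        self.relation_emb = nn.Embedding(nrelation, hidden_dim // 2)
        nn.init.uniform_(self.entity_emb.weight, -self.bound, self.bound)
        nn.init.uniform_(self.relation_emb.weight, -self.bound, self.bound)

    def score(self, h, r, t):
        re_h, im_h = torch.chunk(h, 2, dim=-1)
        re_t, im_t = torch.chunk(t, 2, dim=-1)
        # Relation parameters become rotation angles in [-pi, pi]
        phase = r * (math.pi / self.bound)
        re_r, im_r = torch.cos(phase), torch.sin(phase)
        # Rotate the head, subtract the tail, sum the complex moduli
        re_d = re_h * re_r - im_h * im_r - re_t
        im_d = re_h * im_r + im_h * re_r - im_t
        dist = torch.sqrt(re_d ** 2 + im_d ** 2 + 1e-12).sum(dim=-1)
        return self.gamma - dist

    def forward(self, pos, neg_tails):
        # pos: [B, 3] as (head, relation, tail); neg_tails: [B, K]
        h = self.entity_emb(pos[:, 0])
        r = self.relation_emb(pos[:, 1])
        t = self.entity_emb(pos[:, 2])
        pos_score = self.score(h, r, t)                      # [B]
        neg_score = self.score(h.unsqueeze(1), r.unsqueeze(1),
                               self.entity_emb(neg_tails))   # [B, K]
        # Softmax weights over negatives, detached so they act as constants
        weights = F.softmax(self.alpha * neg_score, dim=1).detach()
        pos_loss = -F.logsigmoid(pos_score).mean()
        neg_loss = -(weights * F.logsigmoid(-neg_score)).sum(dim=1).mean()
        return (pos_loss + neg_loss) / 2


torch.manual_seed(0)
model = RotatELossOnly(nentity=50, nrelation=5, hidden_dim=8)
pos = torch.stack([torch.randint(0, 50, (4,)),
                   torch.randint(0, 5, (4,)),
                   torch.randint(0, 50, (4,))], dim=1)
neg_tails = torch.randint(0, 50, (4, 6))
loss = model(pos, neg_tails)  # a scalar tensor; no embeddings are returned
loss.backward()
print(loss.item())
```

With this design the caller only ever sees a scalar, which keeps the training loop short, at the cost of needing a separate scoring call (for example `model.score`) at evaluation time.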
