TextCNN Model Code Implementation
Date: 2023-10-07 20:06:19
The following is a PyTorch implementation of the TextCNN model for text classification:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNN(nn.Module):
    def __init__(self, vocab_size, embedding_dim, num_classes, num_filters, filter_sizes, dropout_prob):
        super(TextCNN, self).__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        # One convolution per filter size; each kernel spans the full embedding width.
        self.convs = nn.ModuleList([
            nn.Conv2d(in_channels=1, out_channels=num_filters, kernel_size=(fs, embedding_dim))
            for fs in filter_sizes
        ])
        self.dropout = nn.Dropout(dropout_prob)
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, x):
        # x: (batch_size, seq_len) tensor of token ids
        x = self.embedding(x)  # (batch_size, seq_len, embedding_dim)
        x = x.unsqueeze(1)  # (batch_size, 1, seq_len, embedding_dim)
        x = [F.relu(conv(x)).squeeze(3) for conv in self.convs]  # [(batch_size, num_filters, seq_len - filter_size + 1), ...]
        x = [F.max_pool1d(conv, conv.size(2)).squeeze(2) for conv in x]  # [(batch_size, num_filters), ...]
        x = torch.cat(x, 1)  # (batch_size, num_filters * len(filter_sizes))
        x = self.dropout(x)
        logits = self.fc(x)  # (batch_size, num_classes)
        return logits
```
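Here, `vocab_size` is the vocabulary size, `embedding_dim` is the word-embedding dimension, `num_classes` is the number of target classes, `num_filters` is the number of filters per filter size, `filter_sizes` is the list of filter sizes (how many consecutive tokens each filter covers), and `dropout_prob` is the dropout probability. In `forward`, the `embedding` layer first converts the input token ids into word vectors. Each convolution is then applied with ReLU and max-pooled over the sequence dimension, the pooled features are concatenated and passed through dropout, and the fully connected layer outputs the classification scores (unnormalized logits).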
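To make the shape comments in `forward` concrete, here is a minimal sketch that traces the same sequence of operations on a random batch, using standalone layers rather than the class itself. All sizes (`batch_size`, `seq_len`, and so on) are assumed values chosen purely for illustration and are not from the original post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes, chosen only for illustration.
batch_size, seq_len = 4, 20
vocab_size, embedding_dim = 1000, 50
num_filters, filter_sizes = 8, [3, 4, 5]

# Random token ids stand in for a tokenized, padded batch of text.
token_ids = torch.randint(0, vocab_size, (batch_size, seq_len))

embedding = nn.Embedding(vocab_size, embedding_dim)
x = embedding(token_ids).unsqueeze(1)  # (batch_size, 1, seq_len, embedding_dim)

conv_shapes = []
pooled = []
for fs in filter_sizes:
    conv = nn.Conv2d(in_channels=1, out_channels=num_filters, kernel_size=(fs, embedding_dim))
    # The kernel covers the whole embedding width, so the last dimension collapses to 1.
    feature_map = F.relu(conv(x)).squeeze(3)  # (batch_size, num_filters, seq_len - fs + 1)
    conv_shapes.append(tuple(feature_map.shape))
    # Max over all positions keeps the strongest response of each filter.
    pooled.append(F.max_pool1d(feature_map, feature_map.size(2)).squeeze(2))  # (batch_size, num_filters)

features = torch.cat(pooled, dim=1)  # (batch_size, num_filters * len(filter_sizes))

print(conv_shapes)
print(tuple(features.shape))
```

Because each filter of size `fs` yields `seq_len - fs + 1` positions, the input needs at least `max(filter_sizes)` tokens. Shorter sequences would have to be padded before being passed to the model, otherwise the convolution raises an error.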