CNN-BiLSTM Python code
Time: 2023-10-12 21:53:47
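The following Python example uses a CNN-BiLSTM model for binary text classification. It expects a `data.csv` file with a `text` column and a 0/1 `label` column: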
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, Bidirectional, LSTM, Dense
# Load the dataset
data = pd.read_csv('data.csv')
# Separate the texts from the labels
texts = data['text'].values
labels = data['label'].values
# Split into training and test sets (80% / 20%)
train_texts, test_texts, train_labels, test_labels = train_test_split(texts, labels, test_size=0.2, random_state=42)
# Build the vocabulary from the training texts only (5,000 most frequent words)
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(train_texts)
# Convert the texts to sequences of integer word indices
train_sequences = tokenizer.texts_to_sequences(train_texts)
test_sequences = tokenizer.texts_to_sequences(test_texts)
# Pad the sequences so they all have the same length
max_sequence_length = max([len(sequence) for sequence in train_sequences])
train_data = pad_sequences(train_sequences, maxlen=max_sequence_length)
test_data = pad_sequences(test_sequences, maxlen=max_sequence_length)
# Build the model: embedding -> 1D convolution -> max pooling -> BiLSTM -> sigmoid output
model = Sequential()
model.add(Embedding(5000, 100, input_length=max_sequence_length))
model.add(Conv1D(128, 5, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(Bidirectional(LSTM(64)))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train the model
model.fit(train_data, train_labels, validation_data=(test_data, test_labels), epochs=10, batch_size=32)
# Evaluate the model on the test set
loss, accuracy = model.evaluate(test_data, test_labels)
print('Test Loss:', loss)
print('Test Accuracy:', accuracy)
```
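To see how the layer stack above changes the tensor shape, the sketch below traces the per-sample output shape of each layer in plain Python, without Keras. It assumes the defaults the example relies on: `'valid'` padding and stride 1 for `Conv1D`, a pooling stride equal to `pool_size` for `MaxPooling1D`, and concatenated forward and backward outputs for `Bidirectional`. The helper names (`conv1d_output_length`, `maxpool1d_output_length`, `trace_shapes`) are made up for this illustration.

```python
def conv1d_output_length(length, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' (no) padding."""
    return (length - kernel_size) // stride + 1


def maxpool1d_output_length(length, pool_size, stride=None):
    """Output length of 1D max pooling with 'valid' padding.

    The stride defaults to pool_size, i.e. non-overlapping windows.
    """
    stride = pool_size if stride is None else stride
    return (length - pool_size) // stride + 1


def trace_shapes(seq_len, embed_dim=100, filters=128, kernel_size=5,
                 pool_size=4, lstm_units=64):
    """Return (layer name, per-sample output shape) pairs for the example model."""
    conv_len = conv1d_output_length(seq_len, kernel_size)
    pool_len = maxpool1d_output_length(conv_len, pool_size)
    if conv_len < 1 or pool_len < 1:
        raise ValueError(
            f"seq_len={seq_len} is too short for kernel_size={kernel_size} "
            f"and pool_size={pool_size}"
        )
    return [
        ("Embedding", (seq_len, embed_dim)),
        ("Conv1D", (conv_len, filters)),
        ("MaxPooling1D", (pool_len, filters)),
        # Forward and backward LSTM outputs are concatenated: 2 * units
        ("Bidirectional(LSTM)", (2 * lstm_units,)),
        ("Dense", (1,)),
    ]


# Example: padded sequences of 100 tokens (an arbitrary illustrative length)
for name, shape in trace_shapes(100):
    print(f"{name:<22} {shape}")
```

With a padded length of 100, the convolution leaves 96 time steps and the pooling layer reduces them to 24, so the BiLSTM reads a sequence four times shorter than the convolution output. With these settings the padded length must be at least 8, otherwise the pooling layer has no complete window to work on.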
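Note that this is only a simple example; you will likely need to adapt it to your own dataset and task. You also need to install the required libraries (such as Keras, scikit-learn, and pandas) to run it. The `keras.preprocessing` imports follow the older Keras 2 API, so newer Keras releases may require different import paths.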
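One adjustment that often helps on real datasets concerns the padding length. The example pads every sequence to the length of the longest training text, so a single very long document inflates every input. The sketch below picks a length that covers a chosen share of the sequences instead (95% here, an arbitrary choice) and shows what left-padding and left-truncation do to a sequence. These are the defaults of `pad_sequences`, which pads and truncates at the start. `percentile_length` and `pad_or_truncate` are hypothetical helpers written in plain Python for illustration; they are not part of Keras.

```python
import math


def percentile_length(lengths, pct=95):
    """Smallest length that covers at least pct percent of the sequences.

    Uses the nearest-rank method on the sorted lengths.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def pad_or_truncate(sequence, maxlen, pad_value=0):
    """Left-pad with pad_value, or keep only the last maxlen tokens."""
    if maxlen < 1:
        raise ValueError("maxlen must be at least 1")
    kept = list(sequence)[-maxlen:]
    return [pad_value] * (maxlen - len(kept)) + kept


# Toy token sequences; in the example above these would be train_sequences
sequences = [[4, 9], [7, 1, 3], [2, 8, 5, 6], [1] * 50]
maxlen = percentile_length([len(s) for s in sequences], pct=75)
padded = [pad_or_truncate(s, maxlen) for s in sequences]
print(maxlen, padded)
```

In the original script, the value returned by `percentile_length` could be passed as `maxlen` to `pad_sequences` and as `input_length` to the `Embedding` layer in place of `max_sequence_length`. Whether truncating long texts is acceptable depends on where the informative content sits in your documents.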