Write an RNN-based text classification program in Python
Posted: 2024-02-25 18:58:31 Views: 111
Sure, here is an example Python program for RNN-based text classification:
```python
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Embedding, LSTM, Bidirectional
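# Load the dataset (expects a "text" column and a binary 0/1 "label" column)
df = pd.read_csv("data.csv")
X = df["text"].astype(str)  # make sure every entry is a string before tokenizing
y = df["label"]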
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
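# Preprocess the text: fit the vocabulary on the training split only
num_words = 10000
tokenizer = Tokenizer(num_words=num_words)
tokenizer.fit_on_texts(X_train)
X_train = tokenizer.texts_to_sequences(X_train)
X_test = tokenizer.texts_to_sequences(X_test)
# The tokenizer only emits indices below num_words, so cap the embedding size accordingly
vocab_size = min(num_words, len(tokenizer.word_index) + 1)
maxlen = 100
# Right-pad shorter sequences with zeros and cut longer ones down to maxlen tokens
X_train = pad_sequences(X_train, padding='post', maxlen=maxlen)
X_test = pad_sequences(X_test, padding='post', maxlen=maxlen)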
# Build the model: embedding -> two stacked bidirectional LSTM layers -> sigmoid output
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=128))
model.add(Bidirectional(LSTM(64, return_sequences=True)))
model.add(Dropout(0.5))
model.add(Bidirectional(LSTM(32)))
model.add(Dropout(0.5))
# A single sigmoid unit with binary cross-entropy assumes labels of 0 or 1
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Train the model, reporting metrics on the held-out split after each epoch
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test), batch_size=64)
# Evaluate the model on the test set
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {:.4f}".format(accuracy))
```
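This program stacks two bidirectional LSTM layers, each followed by dropout, and ends with a fully connected sigmoid layer for binary text classification. The training data is read from a file named "data.csv". You can substitute your own dataset and adjust the model architecture to fit your task.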
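To make the preprocessing step in the program more concrete, here is a small pure-Python sketch of roughly what `Tokenizer` and `pad_sequences` do to the text. It is a simplified illustration, not the Keras implementation: the helper names `fit_word_index`, `texts_to_sequences`, and `pad_post` and the sample sentences are made up for this example, and punctuation filtering is left out.

```python
from collections import Counter

def fit_word_index(texts):
    """Map each word to an integer rank, 1 = most frequent (0 is reserved for padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # most_common() sorts by descending count; ties keep first-seen order
    return {word: rank for rank, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_sequences(texts, word_index, num_words):
    """Replace words with their ranks, dropping unknown words and ranks >= num_words."""
    return [
        [word_index[w] for w in text.lower().split()
         if w in word_index and word_index[w] < num_words]
        for text in texts
    ]

def pad_post(sequences, maxlen):
    """Keep the last maxlen tokens of each sequence, then right-pad with zeros."""
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append(seq + [0] * (maxlen - len(seq)))
    return padded

# Hypothetical training texts; the vocabulary is fitted on these only
train_texts = ["the movie was great", "the plot was dull", "great acting"]
word_index = fit_word_index(train_texts)

# The last sentence is unseen text: "a" and "unknown" are not in the vocabulary
sequences = texts_to_sequences(train_texts + ["a great unknown movie"], word_index, num_words=5)
padded = pad_post(sequences, maxlen=3)

print(padded)  # [[4, 2, 3], [1, 2, 0], [3, 0, 0], [3, 4, 0]]
```

Two details carry over to the real program: words outside the fitted vocabulary (or ranked at `num_words` and beyond) are silently dropped, and with `padding='post'` alone Keras still truncates long sequences from the front, so only the last `maxlen` tokens are kept.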