
Code for sentiment classification of Weibo text in a CSV file using an LSTM model

Posted: 2024-03-13 18:42:45 Views: 278

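Below is code that uses an LSTM model to classify the sentiment of Weibo text stored in a CSV file. It uses the Keras framework with TensorFlow as the backend:

```python
import pandas as pd
import numpy as np
from keras.utils import to_categorical
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

# Read the CSV file (expects a 'review' text column and a 'sentiment' label column)
df = pd.read_csv('weibo.csv')

# Convert the sentiment labels to integers
df['sentiment'] = df['sentiment'].map({'positive': 1, 'negative': 0})

# Separate the text from the sentiment labels
X = df['review'].values
y = df['sentiment'].values

# Tokenize the text and convert it to integer sequences.
# Note: Tokenizer splits on whitespace, so Chinese text should already be
# word-segmented (tokens separated by spaces) before this step.
tokenizer = Tokenizer(num_words=5000)
tokenizer.fit_on_texts(X)
X = tokenizer.texts_to_sequences(X)

# Pad (or truncate) every sequence to length 50
X = pad_sequences(X, maxlen=50)

# One-hot encode the sentiment labels
y = to_categorical(y)

# Build the LSTM model
model = Sequential()
model.add(Embedding(input_dim=5000, output_dim=50, input_length=50))
model.add(LSTM(units=256, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(units=2, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model, holding out the last 20% of the samples for validation
model.fit(X, y, batch_size=128, epochs=10, validation_split=0.2)

# Save the model
model.save('weibo_lstm.h5')
```

This code first reads a CSV file named weibo.csv and converts its sentiment labels (positive or negative) to integers (1 or 0). It then tokenizes the text, converts it to integer sequences padded to a length of 50, and one-hot encodes the labels. Next, it builds a Keras model consisting of an Embedding layer, an LSTM layer, and a Dense layer, and compiles it with the Adam optimizer and categorical cross-entropy loss. Finally, it trains the model for 10 epochs on 80% of the data, using the remaining 20% for validation, and saves the trained model as weibo_lstm.h5.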
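To make the tokenization and padding steps more concrete, here is a minimal pure-Python sketch of roughly what `Tokenizer` and `pad_sequences` do with their default settings. It is not Keras's implementation. The helper names (`segment`, `build_vocab`, `texts_to_sequences`, `pad`) and the sample sentences are hypothetical, and the character-level split is an assumption standing in for a real Chinese word segmenter.

```python
from collections import Counter


def segment(text):
    # Assumption: split into single characters as a stand-in for a real
    # Chinese word segmenter; whitespace is dropped.
    return [ch for ch in text if not ch.isspace()]


def build_vocab(texts):
    # Rank tokens by descending frequency; index 0 is reserved for padding,
    # so the most frequent token gets index 1.
    counts = Counter(tok for t in texts for tok in segment(t))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {tok: i + 1 for i, (tok, _) in enumerate(ranked)}


def texts_to_sequences(texts, vocab, num_words):
    # Keep only known tokens whose index is below num_words; others are dropped.
    return [
        [vocab[tok] for tok in segment(t) if tok in vocab and vocab[tok] < num_words]
        for t in texts
    ]


def pad(seq, maxlen):
    # Keep the last maxlen ids (truncate from the front), then left-pad with zeros.
    seq = seq[-maxlen:]
    return [0] * (maxlen - len(seq)) + seq


# Hypothetical sample data
texts = ["今天很开心", "今天很难过"]
vocab = build_vocab(texts)
sequences = texts_to_sequences(texts, vocab, num_words=5000)
padded = [pad(s, maxlen=8) for s in sequences]
print(padded)
```

Because padding is applied on the left, the real tokens sit at the end of each sequence, closest to the LSTM's final time step. Index 0 never corresponds to a real token, which is why the vocabulary starts at 1.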
