Text classification with TensorFlow: Python code
Date: 2023-07-24 08:25:47
The following is a Python code example for text classification with TensorFlow:
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Prepare the data
train_text = ['example text 1', 'example text 2', 'example text 3']
# Binary labels (0 or 1), stored as a NumPy array so that model.fit accepts them
train_labels = np.array([0, 1, 1])

# Create the tokenizer and build the vocabulary from the training texts
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=10000)
tokenizer.fit_on_texts(train_text)

# Convert the texts to sequences of integer word indices
train_sequences = tokenizer.texts_to_sequences(train_text)

# Pad the sequences with trailing zeros to a fixed length of 50
train_padded_sequences = tf.keras.preprocessing.sequence.pad_sequences(train_sequences, maxlen=50, padding='post')

# Build the model (the sequence length is inferred from the input data)
model = tf.keras.Sequential([
    layers.Embedding(10000, 16),
    layers.GlobalAveragePooling1D(),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_padded_sequences, train_labels, epochs=10)

# Predict on new data, reusing the tokenizer fitted on the training texts
test_text = ['example text 4']
test_sequences = tokenizer.texts_to_sequences(test_text)
test_padded_sequences = tf.keras.preprocessing.sequence.pad_sequences(test_sequences, maxlen=50, padding='post')
# One sigmoid output per input text: the predicted probability of label 1
predictions = model.predict(test_padded_sequences)
```
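The `predictions` array from the code above holds one sigmoid output per input text, that is, a probability for label 1 rather than a label. A minimal sketch of turning such probabilities into 0/1 labels is shown here. The helper name `to_labels` and the 0.5 threshold are assumptions for illustration, not part of the original example, and the probabilities are made up.

```python
def to_labels(probabilities, threshold=0.5):
    """Map rows of sigmoid outputs, e.g. [[0.73], [0.12]], to 0/1 labels.

    `to_labels` and the default threshold are hypothetical choices for
    illustration; pick a threshold that suits your data.
    """
    return [1 if row[0] >= threshold else 0 for row in probabilities]


# Made-up probabilities for illustration, not real model output.
example_probabilities = [[0.73], [0.12], [0.5]]
example_labels = to_labels(example_probabilities)
print(example_labels)  # [1, 0, 1]
```

With the model above, the helper could be applied as `to_labels(predictions.tolist())`.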
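The TensorFlow example above uses a simple text classification model made up of an Embedding layer, a GlobalAveragePooling1D layer, and two Dense layers. You can modify the model architecture and hyperparameters to suit your needs.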
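To make the role of each layer more concrete, here is a rough pure-Python sketch of the computation such a model performs on one padded sequence: look up a vector per token, average the vectors over the sequence, then apply a ReLU layer and a sigmoid layer. All function names, sizes, and weight values are hypothetical and chosen so the arithmetic is easy to follow. This is not how Keras implements these layers internally.

```python
import math


def embed(token_ids, table):
    # Embedding: look up one vector per token index.
    return [table[i] for i in token_ids]


def global_average_pool(vectors):
    # GlobalAveragePooling1D: average each feature over all time steps.
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def dense(x, weights, bias, activation):
    # Dense: weights has one row per input feature, one column per output unit.
    outputs = [sum(xi * w for xi, w in zip(x, column)) + b
               for column, b in zip(zip(*weights), bias)]
    return [activation(v) for v in outputs]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Hypothetical toy parameters: vocabulary of 4, embedding size 2, hidden size 2.
table = [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
w1 = [[1.0, -1.0], [0.5, 0.5]]
b1 = [0.0, 0.0]
w2 = [[1.0], [1.0]]
b2 = [0.0]

token_ids = [1, 2, 0, 0]  # two tokens, post-padded with zeros to length 4
pooled = global_average_pool(embed(token_ids, table))  # [1.0, 1.5]
hidden = dense(pooled, w1, b1, relu)                   # [1.75, 0.0]
prob = dense(hidden, w2, b2, sigmoid)[0]               # sigmoid(1.75)
print(pooled, hidden, prob)
```

Note that the padded positions are included in the average. In this toy table the vector for index 0 is all zeros, whereas in a trained model it is a learned vector like any other, because `Embedding` does not mask index 0 unless `mask_zero=True` is set.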