Splitting data into training and test sets, training a neural network model, and visualizing the data
Date: 2024-02-03 07:03:09
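This walkthrough uses Python with scikit-learn, Keras, and Matplotlib to split a dataset into training and test sets, train a neural network on the training set, and visualize the data.
First, we need some data. Here we use the Iris dataset as an example: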
```python
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target
```
Next, we use scikit-learn's train_test_split function to split the dataset into a training set and a test set:
```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
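Here, test_size is the fraction of the data held out as the test set (0.2 means 20%), and random_state is the random seed, which makes the split reproducible.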
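To make the effect of these parameters concrete, the following sketch prints the sizes of the resulting subsets. The `stratify=y` argument is an addition that the original snippet does not use; it is shown here only as one possible way to keep the class proportions equal in both subsets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y is an assumption added for illustration (not in the original code):
# it keeps the three classes equally represented in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)                  # (120, 4) (30, 4)
print(np.bincount(y_train), np.bincount(y_test))    # [40 40 40] [10 10 10]
```

With 150 samples and test_size=0.2, 30 samples go to the test set and 120 to the training set.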
Next, we train a neural network model on the training set, using the Keras library as an example:
```python
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(10, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))
# iris.target holds integer class labels (0, 1, 2), so use the sparse variant of the loss
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=10)
```
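A note on label format: Keras's `categorical_crossentropy` loss expects one-hot encoded targets, whereas `sparse_categorical_crossentropy` accepts integer class labels such as those in `iris.target`. The sketch below shows what the one-hot form looks like, using only NumPy and a small set of hypothetical labels; `keras.utils.to_categorical` is another way to produce it.

```python
import numpy as np

# Hypothetical integer labels for a 3-class problem (illustration only).
y = np.array([0, 2, 1, 2])
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with y one-hot encodes every label.
y_onehot = np.eye(num_classes)[y]
print(y_onehot)

# argmax along axis 1 recovers the original integer labels.
recovered = y_onehot.argmax(axis=1)
print(recovered)
```

Either pass one-hot targets together with `categorical_crossentropy`, or keep the integer labels and use `sparse_categorical_crossentropy`; mixing the two typically fails with a shape error.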
Then we use the model to predict on the test set and compute the accuracy:
```python
from sklearn.metrics import accuracy_score
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred.argmax(axis=1))
print("Accuracy:", accuracy)
```
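The `argmax(axis=1)` step is needed because the softmax layer returns one probability per class for each sample, while `accuracy_score` compares class labels. A minimal sketch with hand-written, hypothetical probabilities (not real model output) illustrates the conversion:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical softmax outputs for 4 samples and 3 classes (illustration only).
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
])
y_true = np.array([0, 1, 2, 0])

# Pick the index of the largest probability in each row as the predicted class.
y_hat = probs.argmax(axis=1)
acc = accuracy_score(y_true, y_hat)
print(y_hat, acc)  # [0 1 2 1] 0.75
```

Three of the four predictions match the true labels, giving an accuracy of 0.75.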
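Finally, we can use the Matplotlib library to visualize the data, for example by drawing a scatter plot of the first two features colored by class: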
```python
import matplotlib.pyplot as plt
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.show()
```
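Since the topic here is the train/test split, it can also help to see which points ended up in which subset. The following is one possible sketch, not part of the original answer: it draws training points as circles and test points as crosses. The `Agg` backend and the output filename `iris_split.png` are assumptions chosen so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line to open a window
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

fig, ax = plt.subplots()
# vmin/vmax pin the color scale so both subsets map classes to the same colors.
ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train, vmin=0, vmax=2,
           marker="o", label="train")
ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, vmin=0, vmax=2,
           marker="x", label="test")
ax.set_xlabel("Sepal length (cm)")
ax.set_ylabel("Sepal width (cm)")
ax.legend()
fig.savefig("iris_split.png")  # hypothetical output filename
```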
The complete code is as follows:
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
# load data
iris = load_iris()
X = iris.data
y = iris.target
# split data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# define neural network model
model = Sequential()
model.add(Dense(10, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))
# integer class labels (0, 1, 2) require the sparse variant of the loss
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# train model
model.fit(X_train, y_train, epochs=100, batch_size=10)
# evaluate model on test set
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred.argmax(axis=1))
print("Accuracy:", accuracy)
# visualize data
plt.scatter(X[:, 0], X[:, 1], c=y)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.show()
```
After running the code, the model's accuracy on the test set is printed and the scatter plot of the data is displayed.