How to use the train_test_split function
Time: 2023-11-11 12:03:17
`train_test_split` is a function in the `scikit-learn` library that splits a dataset into a training set and a test set. It is used as follows:
```python
from sklearn.model_selection import train_test_split
# X: feature data, y: label data, test_size: fraction of samples in the test set, random_state: random seed
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
```
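Here `X` is the feature data, `y` is the label data, `test_size` is the fraction of samples assigned to the test set, and `random_state` is the random seed; fixing the seed makes the shuffle, and therefore the split, reproducible. The function splits `X` and `y` according to `test_size` and returns four objects of the same type as the inputs: `X_train` (training features), `X_test` (test features), `y_train` (training labels), and `y_test` (test labels). Features and labels are split with the same indices, so each sample stays paired with its label.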
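To make the description above concrete, here is a minimal sketch that checks the split sizes, the pairing of features with labels, and the effect of a fixed `random_state`. The data is made up for illustration: row `i` of `X` is `[2*i, 2*i + 1]` and its label is `i`, so the pairing is easy to verify.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up data: 10 samples with 2 features each; row i is [2*i, 2*i + 1]
X = np.arange(20).reshape(10, 2)
y = np.arange(10)  # label of row i is i

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# test_size=0.2 on 10 samples gives 2 test samples and 8 training samples
print(X_train.shape, X_test.shape)  # (8, 2) (2, 2)

# Features and labels are split with the same indices
print(np.array_equal(X_test[:, 0], 2 * y_test))  # True

# Repeating the call with the same random_state reproduces the split
_, X_test_again, _, _ = train_test_split(X, y, test_size=0.2, random_state=0)
print(np.array_equal(X_test, X_test_again))  # True
```

If `random_state` is left unset, the shuffle generally differs between runs, so a fixed seed is the usual choice when results need to be repeatable.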
Here is a complete example:
```python
import numpy as np
from sklearn.model_selection import train_test_split
X = np.arange(10)
y = [0, 1, 1, 0, 1, 1, 1, 1, 0, 0]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
print("X_train=", X_train)
print("X_test=", X_test)
print("y_train=", y_train)
print("y_test=", y_test)
```
The output is:
```
X_train= [4 9 1 6 7 3 0 5]
X_test= [2 8]
y_train= [1, 0, 1, 1, 1, 0, 0, 1]
y_test= [1, 0]
```
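In the output above, the class balance of the test set comes purely from the random shuffle. The original answer does not cover it, but `train_test_split` also accepts a `stratify` argument that keeps the class proportions of the labels roughly equal in both parts. The sketch below uses made-up labels (80 samples of class 0 and 20 of class 1) to illustrate the idea; it is one possible usage, not part of the original example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up imbalanced labels: 80 samples of class 0, 20 samples of class 1
X = np.arange(100)
y = np.array([0] * 80 + [1] * 20)

# stratify=y asks for the 80/20 class ratio to be preserved in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print(len(y_train), len(y_test))             # 80 20
print(int(y_train.sum()), int(y_test.sum())) # 16 4  (count of class 1 in each part)
```

Stratifying is mostly useful for classification with small or imbalanced datasets, where a purely random split could leave a class underrepresented in the test set.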