Python Code for Data Preprocessing
Date: 2023-12-25 22:27:25
The following are code examples for common data preprocessing steps in Python:
1. Import the libraries
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
```
2. Import the dataset
```python
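# Load the dataset; 'data.csv' is expected in the working directory
dataset = pd.read_csv('data.csv')
# Features: every column except the last
X = dataset.iloc[:, :-1].values
# Target: the last column
y = dataset.iloc[:, -1].values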
```
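The split above assumes that the feature columns come first and the target is the last column. As a minimal sketch, the same split can be tried on a small in-memory table; the column names (`Country`, `Age`, `Salary`, `Purchased`) and all values here are made up for illustration and are not taken from the original `data.csv`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for data.csv; column names and values are invented.
csv_text = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,,No
Spain,,61000,Yes
"""
dataset = pd.read_csv(io.StringIO(csv_text))

X = dataset.iloc[:, :-1].values  # every column except the last one
y = dataset.iloc[:, -1].values   # the last column only

print(X.shape, y.shape)
print(dataset.isna().sum())      # number of missing values per column
```

Empty fields in the CSV are read as `NaN`, which is what the missing-data step operates on.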
3. Handle missing data
```python
from sklearn.impute import SimpleImputer
# Replace NaN entries in columns 1 and 2 with the mean of each column
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
X[:, 1:3] = imputer.fit_transform(X[:, 1:3])
```
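To make the effect of mean imputation concrete, here is a small sketch on a hand-written array. The array `ages_salaries` and its values are hypothetical; only `SimpleImputer` and its parameters come from the snippet above.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical two-column numeric data with one gap per column.
ages_salaries = np.array([
    [44.0, 72000.0],
    [27.0, np.nan],
    [np.nan, 48000.0],
])

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
filled = imputer.fit_transform(ages_salaries)

# Each NaN is replaced by the mean of the observed values in its column:
# column 0 -> (44 + 27) / 2, column 1 -> (72000 + 48000) / 2.
print(filled)
```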
4. Encode categorical data
```python
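from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.compose import ColumnTransformer
# One-hot encode the categorical column 0. OneHotEncoder accepts strings directly,
# so a separate LabelEncoder pass over X is not needed.
# sparse_threshold=0 forces a dense array, so the slice assignment in step 5 works.
ct = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder='passthrough', sparse_threshold=0)
X = ct.fit_transform(X)
# Encode the target labels as integers 0..n_classes-1
labelencoder_y = LabelEncoder()
y = labelencoder_y.fit_transform(y)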
```
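The following sketch shows what one-hot encoding does to a single categorical column. The array `X_demo` and its values are hypothetical; `sparse_threshold=0` is used here so that `ColumnTransformer` returns a dense array that is easy to print.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: one categorical column followed by one numeric column.
X_demo = np.array(
    [["France", 44.0], ["Spain", 27.0], ["Germany", 30.0], ["Spain", 38.0]],
    dtype=object,
)

ct = ColumnTransformer(
    [("Country", OneHotEncoder(), [0])],
    remainder="passthrough",  # keep the numeric column unchanged
    sparse_threshold=0,       # always return a dense array
)
encoded = ct.fit_transform(X_demo)

# Categories are sorted alphabetically (France, Germany, Spain), so each row
# gets three indicator columns followed by the passthrough column.
print(encoded)
```

The encoded columns come first in the output and the passthrough columns follow, which is why later steps index the numeric features from column 3 onward when there are three categories.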
5. Feature scaling
```python
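from sklearn.preprocessing import StandardScaler
# Standardize the numeric columns to zero mean and unit variance.
# Columns 0-2 are assumed to be the one-hot columns from step 4 and are left unscaled.
sc_X = StandardScaler()
X[:, 3:] = sc_X.fit_transform(X[:, 3:])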
```
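As a quick check of what standardization produces, the sketch below scales a small hypothetical array and prints the per-column mean and standard deviation of the result. The array `values` is invented for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric features on very different scales.
values = np.array([
    [44.0, 72000.0],
    [27.0, 48000.0],
    [30.0, 54000.0],
    [38.0, 61000.0],
])

scaler = StandardScaler()
scaled = scaler.fit_transform(values)

# After scaling, each column has mean close to 0 and standard deviation close to 1.
print(scaled.mean(axis=0))
print(scaled.std(axis=0))
```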
6. Decimal scaling normalization
```python
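def deci_sca(data):
    # Divide by 10**j, where j is the smallest integer such that max(|x|) / 10**j < 1
    j = np.floor(np.log10(np.abs(data).max())) + 1
    new_data = data / (10 ** j)
    return new_data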
```
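A minimal usage sketch of decimal scaling follows. It uses a standalone helper named `decimal_scale` (a hypothetical name chosen for this example) that scales by the largest absolute value, so that negative inputs are handled as well; the input values are made up.

```python
import numpy as np

def decimal_scale(data):
    """Scale data by a power of ten so that every absolute value is below 1."""
    data = np.asarray(data, dtype=float)
    # j is the number of digits in the integer part of the largest magnitude.
    j = np.floor(np.log10(np.abs(data).max())) + 1
    return data / (10 ** j)

# Hypothetical values; the largest magnitude is 986, so the divisor is 10**3.
values = np.array([-986.0, 120.0, 45.0])
scaled = decimal_scale(values)
print(scaled)
```

This sketch assumes the data contains at least one non-zero value, since the logarithm of zero is undefined.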