Common machine learning code for Jupyter
Date: 2024-11-12 12:44:35
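Jupyter Notebook is a widely used interactive computing environment that is well suited to data science and machine learning work. Writing machine learning code in Jupyter usually involves the following steps:
1. Import the required libraries: NumPy for numerical computation, Pandas for tabular data structures, Matplotlib for visualization, and Scikit-learn or TensorFlow for machine learning models.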
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
```
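2. Load and preprocess the data: for example, read a CSV file, clean the data, and encode categorical variables.
```python
data = pd.read_csv('data.csv')
data = data.dropna() # drop rows that contain missing values
X = data[['feature1', 'feature2']] # feature columns
y = data['target'] # target variable
```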
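The step above mentions encoding categorical variables, which the snippet does not show. Here is a minimal sketch of one common approach, one-hot encoding with `pd.get_dummies`. The toy DataFrame, the `color` column, and the `preprocess` helper are hypothetical and stand in for whatever `data.csv` actually contains.

```python
import pandas as pd

# Hypothetical toy data standing in for data.csv; column names are illustrative only.
raw = pd.DataFrame({
    "feature1": [1.0, 2.0, None, 4.0],
    "color": ["red", "blue", "red", "green"],
    "target": [10.0, 20.0, 30.0, 40.0],
})


def preprocess(df):
    """Drop rows with missing values, then one-hot encode the 'color' column."""
    cleaned = df.dropna()
    return pd.get_dummies(cleaned, columns=["color"])


encoded = preprocess(raw)
print(encoded.columns.tolist())
```

One-hot encoding replaces the original text column with one indicator column per category, which is the form most Scikit-learn estimators expect.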
3. Split the data into training and test sets:
```python
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
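To see what `test_size` and `random_state` do, it can help to run the split on synthetic arrays. This is an illustrative sketch only; `X_demo` and `y_demo` are made-up data, not part of the original example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 samples with 2 features each (illustrative only).
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.arange(100)

# test_size=0.2 holds out 20% of the rows for testing.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# The same random_state gives the same shuffle, so the split is reproducible.
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

print(X_tr.shape, X_te.shape)
print(np.array_equal(X_te, X_te2))
```

Fixing `random_state` is useful in a notebook because re-running a cell then yields the same train and test rows.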
4. Train the model:
```python
model = LinearRegression()
model.fit(X_train, y_train)
```
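After fitting, a linear model exposes what it learned through the `coef_` and `intercept_` attributes. The sketch below uses synthetic, noise-free data generated from a known linear rule (an assumption made purely for illustration) so that the recovered parameters can be compared against the true ones.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data generated from y = 3*x1 - 2*x2 + 5 (illustrative assumption).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 5.0

demo_model = LinearRegression()
demo_model.fit(X_demo, y_demo)

# coef_ holds one weight per feature; intercept_ is the constant term.
print("coefficients:", demo_model.coef_)
print("intercept:", demo_model.intercept_)
```

On real data with noise the recovered values will only approximate the underlying relationship, if one exists at all.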
5. Predict and evaluate:
```python
predictions = model.predict(X_test)
# For regressors, score() returns R² (coefficient of determination), not accuracy
r2 = model.score(X_test, y_test)
```
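For a regression model, `score` reports the coefficient of determination (R²). It is often paired with an error metric such as mean squared error from `sklearn.metrics`. The following is a minimal sketch on synthetic data; the data-generating rule and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data with Gaussian noise (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
noise = rng.normal(scale=0.5, size=200)
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + noise

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
reg = LinearRegression().fit(X_tr, y_tr)
y_pred = reg.predict(X_te)

# Mean squared error: average squared difference between truth and prediction.
mse = mean_squared_error(y_te, y_pred)
# r2_score computes the same quantity that reg.score(X_te, y_te) returns.
r2 = r2_score(y_te, y_pred)
print(f"MSE: {mse:.3f}, R^2: {r2:.3f}")
```

Reporting both is a common choice: MSE is expressed in squared target units, while R² is unitless and easier to compare across datasets.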
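6. Visualize the results or save the model:
```python
import joblib
# X_test has two columns, so plot predicted against actual values
plt.scatter(y_test, predictions, color='blue')
lims = [y_test.min(), y_test.max()]
plt.plot(lims, lims, color='red', linewidth=3) # diagonal = perfect prediction
plt.show()
# Or save the model
joblib.dump(model, 'model.pkl')
```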
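A saved model is only useful if it can be loaded again. The sketch below shows a save and load round trip with `joblib.dump` and `joblib.load`. The synthetic data and the temporary directory are assumptions made so the example runs anywhere without leaving files behind.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic training data (illustrative only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 2))
y_demo = X_demo @ np.array([1.5, -0.5]) + 2.0

fitted = LinearRegression().fit(X_demo, y_demo)

# Write the model into a temporary directory, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    joblib.dump(fitted, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(
    fitted.predict(X_demo), restored.predict(X_demo)
)
print(same_predictions)
```

Because joblib files are pickle-based, they should only be loaded from sources you trust, and ideally with the same Scikit-learn version that produced them.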