Redo the chapter's worked example using the Python-Pandas OLS tool and Scikit-Learn

Date: 2024-02-01 15:13:43 · Views: 130


Let's start with OLS. Current pandas releases no longer ship an OLS routine of their own (the old `pandas.stats` module was removed), so the usual approach is to hold the data in a pandas DataFrame and fit the model with statsmodels. Suppose we have the following dataset:

```
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
```

We can run a linear regression with the following code:

```python
import pandas as pd
import statsmodels.api as sm

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
df = pd.DataFrame({'x': X, 'y': Y})

# add_constant adds a column of ones so that the model fits an intercept
exog = sm.add_constant(df[['x']])
model = sm.OLS(df['y'], exog).fit()
predictions = model.predict(exog)  # fitted values at the observed x
print(model.summary())
```

The key figures from the summary are shown below (abridged; the full table also prints the date, information criteria, and residual diagnostics):

```
                  OLS Regression Results
Dep. Variable:     y               R-squared:            0.600
Model:             OLS             Adj. R-squared:       0.467
Method:            Least Squares   F-statistic:          4.500
No. Observations:  5               Prob (F-statistic):   0.124
Df Residuals:      3               Df Model:             1
Covariance Type:   nonrobust
=================================================================
          coef   std err        t    P>|t|    [0.025    0.975]
-----------------------------------------------------------------
const   2.2000     0.938    2.345    0.101    -0.785     5.185
x       0.6000     0.283    2.121    0.124    -0.300     1.500
=================================================================
```

The fitted line is y = 2.2 + 0.6x with R-squared 0.600. With only five observations the slope is not significant at the 5% level (p ≈ 0.124). Next, let's see how to do the same thing with Scikit-Learn.

```python
from sklearn.linear_model import LinearRegression

# Scikit-Learn expects a 2-D feature matrix: one row per sample
X = [[1], [2], [3], [4], [5]]
Y = [2, 4, 5, 4, 5]

model = LinearRegression()
model.fit(X, Y)

# Rounded to suppress floating-point noise in the printed values
print('Coefficients:', model.coef_.round(4))
print('Intercept:', round(model.intercept_, 4))
```

The output is:

```
Coefficients: [0.6]
Intercept: 2.2
```

This gives the same slope and intercept as the OLS fit. Unlike statsmodels, `LinearRegression` does not report standard errors or p-values.
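Because the dataset is tiny, the fit can also be cross-checked by hand with the closed-form formulas for simple linear regression. The following is a minimal sketch using only NumPy; the variable names (`sxx`, `sxy`, `syy`, and so on) are illustrative choices, not part of either library.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)
n = len(x)

# Centered sums of squares and cross-products
sxx = np.sum((x - x.mean()) ** 2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))
syy = np.sum((y - y.mean()) ** 2)

# Closed-form least-squares estimates
slope = sxy / sxx
intercept = y.mean() - slope * x.mean()

# Goodness of fit
residuals = y - (intercept + slope * x)
sse = np.sum(residuals ** 2)
r_squared = 1 - sse / syy

# Standard errors, using n - 2 residual degrees of freedom
mse = sse / (n - 2)
se_slope = np.sqrt(mse / sxx)
se_intercept = np.sqrt(mse * (1 / n + x.mean() ** 2 / sxx))

print(f"slope={slope:.4f}  intercept={intercept:.4f}  R^2={r_squared:.4f}")
print(f"se(slope)={se_slope:.4f}  se(intercept)={se_intercept:.4f}")
```

For this data the sketch gives a slope of 0.6, an intercept of 2.2, and an R-squared of 0.6, which is a quick way to sanity-check whatever either library prints.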
