How to fix np.mean returning inf

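When an array contains `inf` or `-inf`, `np.mean()` cannot return a finite value. The sum becomes `inf` or `-inf`, and if both signs are present, `inf + (-inf)` is undefined, so the result is `nan`. Note that `np.nanmean()` does not solve this: it only skips `nan` values, not infinities. To average only the finite values, filter the array with `np.isfinite()` first. For example:

```python
import numpy as np

a = np.array([1, 2, 3, np.inf, -np.inf])

mean_a = np.mean(a)
print(mean_a)  # nan, because inf + (-inf) is undefined (NumPy may also emit a RuntimeWarning)

# Keep only the finite entries, then average them
mean_a_finite = np.mean(a[np.isfinite(a)])
print(mean_a_finite)  # 2.0
```

In this example, `a` contains both `inf` and `-inf`, so `np.mean()` returns `nan` (with only `inf` present it would return `inf`). After masking with `np.isfinite()`, only `1`, `2`, and `3` remain, and the mean is `2.0`.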
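Before discarding values, it is often worth checking how many non-finite entries an array has, since they usually point to an upstream problem such as a division by zero. The following is a minimal sketch of such a check; the helper name `finite_mean_report` is hypothetical and not part of NumPy. It relies on `np.ma.masked_invalid`, which masks `nan`, `inf`, and `-inf` alike.

```python
import numpy as np

def finite_mean_report(values):
    """Hypothetical helper: count non-finite entries and average the rest."""
    arr = np.asarray(values, dtype=float)
    report = {
        "n_posinf": int(np.isposinf(arr).sum()),
        "n_neginf": int(np.isneginf(arr).sum()),
        "n_nan": int(np.isnan(arr).sum()),
    }
    masked = np.ma.masked_invalid(arr)  # masks nan, inf, and -inf
    # If nothing finite is left, report nan instead of a masked constant
    if masked.count() > 0:
        report["finite_mean"] = float(masked.mean())
    else:
        report["finite_mean"] = float("nan")
    return report

report = finite_mean_report([1, 2, 3, np.inf, -np.inf])
print(report)
```

If the counts are unexpectedly large, silently averaging the remaining values may hide a bug, so it can be better to fix the computation that produced the infinities.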

Convert the following to PySpark:

    def Sta_inf(data):
        print('_min', np.min(data))
        print('_max:', np.max(data))
        print('_mean', np.mean(data))
        print('_ptp', np.ptp(data))
        print('_std', np.std(data))
        print('_var', np.var(data))

    print('Sta of label:')
    Sta_inf(Y_data)

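Assuming the data is stored in a PySpark DataFrame named `df` and the label column is named `label`:

```python
from pyspark.sql import functions as F

print('Sta of label:')
df.select(
    F.min('label').alias('_min'),
    F.max('label').alias('_max'),
    F.mean('label').alias('_mean'),
    (F.max('label') - F.min('label')).alias('_ptp'),  # np.ptp is max - min
    F.stddev_pop('label').alias('_std'),  # np.std defaults to the population form
    F.var_pop('label').alias('_var'),  # np.var defaults to the population form
).show()
```

Replace `'label'` with the name of whichever column you want to analyze. Importing the module as `F` avoids shadowing Python's built-in `min` and `max`, and the function is spelled `variance`, not `varience`. Spark's `stddev` and `variance` compute the sample versions, so `stddev_pop` and `var_pop` are used here to match NumPy's defaults.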
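When porting `Sta_inf`, keep in mind that `np.std` and `np.var` use the population formulas by default (`ddof=0`), whereas sample estimators divide by `n - 1`, and that `np.ptp` is simply the maximum minus the minimum. The short NumPy check below illustrates the difference; the array values are made up for illustration only.

```python
import numpy as np

# Made-up sample values, chosen so the population std is exactly 2
y = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

pop_std = np.std(y)              # ddof=0: divides by n
sample_std = np.std(y, ddof=1)   # ddof=1: divides by n - 1
ptp_manual = np.max(y) - np.min(y)

print('population std:', pop_std)
print('sample std:', sample_std)
print('ptp:', ptp_manual, 'matches np.ptp:', ptp_manual == np.ptp(y))
```

On small datasets the two standard deviations differ noticeably, so the choice of estimator matters when comparing NumPy output against another tool.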

    import numpy as np
    from scipy.stats import f

    # Build the dataset
    X = np.array([[1, 7, 26, 6, 60], [1, 1, 29, 15, 52], [1, 11, 56, 8, 20], [1, 11, 31, 8, 47], [1, 7, 52, 6, 33], [1, 11, 55, 9, 22], [1, 3, 71, 17, 6], [1, 1, 31, 22, 44], [1, 2, 54, 18, 22], [1, 21, 47, 4, 26], [1, 1, 40, 23, 34], [1, 11, 66, 9, 12], [1, 10, 68, 8, 12]])
    Y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4])

    # Solve for the regression coefficients
    beta = np.linalg.inv(X.T @ X) @ X.T @ Y

    # Print the regression result
    print('Regression coefficients:', beta)

    # Residual sum of squares and total sum of squares
    Y_pred = X @ beta
    SSE = np.sum((Y - Y_pred) ** 2)
    SST = np.sum((Y - np.mean(Y)) ** 2)

    # R squared and adjusted R squared (len(beta) already counts the intercept)
    R2 = 1 - SSE / SST
    adj_R2 = 1 - (SSE / (len(Y) - len(beta))) / (SST / (len(Y) - 1))
    print('R squared:', R2)
    print('Adjusted R squared:', adj_R2)

    # Analysis of variance
    MSR = np.sum((Y_pred - np.mean(Y)) ** 2) / (len(beta) - 1)
    MSE = SSE / (len(Y) - len(beta))
    F = MSR / MSE
    p = 1 - f.cdf(F, len(beta) - 1, len(Y) - len(beta))
    print('F value:', F)
    print('p value:', p)

Can you continue from the code above and find the optimal regression equation using all-subsets selection? Please write the complete Python program.

    import numpy as np
    from itertools import combinations
    from scipy.stats import f

    # Build the dataset (column 0 is the intercept)
    X = np.array([[1, 7, 26, 6, 60], [1, 1, 29, 15, 52], [1, 11, 56, 8, 20], [1, 11, 31, 8, 47], [1, 7, 52, 6, 33], [1, 11, 55, 9, 22], [1, 3, 71, 17, 6], [1, 1, 31, 22, 44], [1, 2, 54, 18, 22], [1, 21, 47, 4, 26], [1, 1, 40, 23, 34], [1, 11, 66, 9, 12], [1, 10, 68, 8, 12]])
    Y = np.array([78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7, 72.5, 93.1, 115.9, 83.8, 113.3, 109.4])

    # All-subsets selection. The intercept is always kept, and subsets are
    # ranked by adjusted R squared: plain R squared never decreases when a
    # predictor is added, so it would always pick the full model.
    n = len(Y)
    SST = np.sum((Y - np.mean(Y)) ** 2)
    n_features = X.shape[1]
    best_score = float('-inf')
    best_feature_idx = None
    for k in range(1, n_features):
        for subset in combinations(range(1, n_features), k):
            cols = [0, *subset]
            X_subset = X[:, cols]
            beta_subset = np.linalg.inv(X_subset.T @ X_subset) @ X_subset.T @ Y
            SSE_subset = np.sum((Y - X_subset @ beta_subset) ** 2)
            adj_R2_subset = 1 - (SSE_subset / (n - len(cols))) / (SST / (n - 1))
            if adj_R2_subset > best_score:
                best_score = adj_R2_subset
                best_feature_idx = cols

    # Print the best regression equation
    print('Column indices of the best model:', best_feature_idx)
    X_best = X[:, best_feature_idx]
    beta_best = np.linalg.inv(X_best.T @ X_best) @ X_best.T @ Y
    print('Coefficients of the best model:', beta_best)

    # Residual sum of squares for the best model
    Y_pred = X_best @ beta_best
    SSE = np.sum((Y - Y_pred) ** 2)

    # R squared and adjusted R squared (len(beta_best) already counts the intercept)
    R2 = 1 - SSE / SST
    adj_R2 = 1 - (SSE / (n - len(beta_best))) / (SST / (n - 1))
    print('R squared:', R2)
    print('Adjusted R squared:', adj_R2)

    # Analysis of variance
    MSR = np.sum((Y_pred - np.mean(Y)) ** 2) / (len(beta_best) - 1)
    MSE = SSE / (n - len(beta_best))
    F = MSR / MSE
    p = 1 - f.cdf(F, len(beta_best) - 1, n - len(beta_best))
    print('F value:', F)
    print('p value:', p)
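The code above computes the coefficients by explicitly forming `np.linalg.inv(X.T @ X)`, which can lose accuracy when the columns of the design matrix are nearly collinear. As a hedged alternative, the sketch below fits the same kind of model with `np.linalg.lstsq`. It uses a tiny made-up dataset (not the data above), and the names `X_demo` and `y_demo` are illustrative only.

```python
import numpy as np

# Made-up data that follows y = 1 + 2*x exactly
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
X_demo = np.column_stack([np.ones_like(x), x])  # intercept column plus one predictor
y_demo = 1.0 + 2.0 * x

# Normal equations with an explicit inverse, as in the program above
beta_inv = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T @ y_demo

# Least-squares solver; rcond=None selects NumPy's current default cutoff
beta_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X_demo, y_demo, rcond=None)

print('explicit inverse:', beta_inv)
print('lstsq:', beta_lstsq)
print('rank of design matrix:', rank)
```

On well-conditioned data both approaches agree. The returned `rank` also offers a quick check for whether a candidate subset of columns is linearly dependent.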
