x1=np.array(DATA.loc[,'单价（元/平米）'])

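In this code, `DATA` is a pandas DataFrame and `x1` is a NumPy array. `DATA.loc[..., '单价（元/平米）']` selects the `单价（元/平米）` (unit price, yuan per square metre) column from `DATA`, and `np.array()` converts the resulting Series to a NumPy array, which is then assigned to `x1`.

As written, `DATA.loc[,'单价（元/平米）']` is not valid Python: the row indexer before the comma is missing, so the line raises a `SyntaxError`. To select every row, write a colon in that position: `x1 = np.array(DATA.loc[:, '单价（元/平米）'])`.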
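The column selection described above can be tried on a small frame. This is a minimal sketch: the `DATA` frame below and its values are made up for illustration, and only the column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for DATA; the rows are invented for this example.
DATA = pd.DataFrame({
    '小区': ['A', 'B', 'C'],
    '单价（元/平米）': [52000, 61000, 48500],
})

# ":" selects every row, the label selects the column.
x1 = np.array(DATA.loc[:, '单价（元/平米）'])

# Equivalent, and usually the shorter spelling in pandas.
x1_alt = DATA['单价（元/平米）'].to_numpy()

print(type(x1).__name__, x1.shape)
```

Both forms return a one-dimensional array with one entry per row of the frame.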

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif']=["SimHei"]  # used on its own, this breaks the minus sign
plt.rcParams['axes.unicode_minus']=False  # render the minus sign correctly
# Read the Beijing housing-price data
path='data.txt'
data=pd.read_csv(path,header=None,names=['mianji','jiage'])
# data.head()
# data.describe()
# Draw a scatter plot
data.plot(kind='scatter',x='mianji',y='jiage')
plt.show()
def computeCost(X,y,theta):
    inner=np.power((X*theta.T),2)
    return np.sum(inner)/(2*len(X))
data.insert(0,'Ones',1)
clos=data.shape[1]
X=data.iloc[:,0:clos-1]
y=data.iloc[:,clos-1:clos]
X=np.array(X.values)
y=np.array(y.values)
theta=np.array[0,0]
computeCost(X,y,theta)
def gradientDescent(X,y,theta,alpha,iters):
    temp=np.array(np.zeros(theta.shape))
    parameters=int(theta.ravel().shape[1])
    cost=np.zeros(iters)
    for i in range(iters):
        error=(X*theta.T)-y
        for j in range(parameters):
            term=np.multiply(error,X[:,j])
            temp[0,j]=theta[0,j]-((alpha/len(X))*np.sum(term))
        theta=temp
        cost[i]=computeCost(X,y,theta)
    return theta,cost
alpha=0.01
iters=1000
g,cost=gradientDescent(X,y,theta,alpha,iters)
x=np.linspace(data.mianji.min(),data.mianji.max(),100)
f=g[0,0]+(g[0,1]*x)
fig,ax=plt.subplots(figsize=(12,8))
ax.plot(x,f,'r',label='北京房价')
ax.scatter(data.mianji,data.jiage,label='Traning data')
ax.legend(loc=4)
ax.set_xlabel('房子面积')
ax.set_ylabel('房子价格')
ax.set_title("北京房价格回归图")
plt.show()
```

This code has several errors:

1. `theta=np.array[0,0]` indexes the function instead of calling it. Because the rest of the code reads `theta[0,j]`, `theta` also has to be two-dimensional with shape (1, 2): `theta = np.array([[0.0, 0.0]])`.
2. Every product must be written with an explicit operator, and for NumPy arrays `*` multiplies element-wise. The matrix product needs `@`, as in `X @ theta.T`.
3. `computeCost` squares the prediction rather than the prediction error. The cost is the sum of `(X @ theta.T - y)**2` divided by `2 * len(X)`.
4. `theta.ravel()` is one-dimensional, so `theta.ravel().shape[1]` raises an `IndexError`. With a (1, 2) `theta`, the number of parameters is `theta.shape[1]`.
5. In `np.multiply(error, X[:,j])`, `error` has shape (m, 1) and `X[:,j]` has shape (m,), so the product broadcasts to an (m, m) matrix and the gradient is wrong. Keeping the column two-dimensional with `X[:, [j]]` gives the intended (m, 1) result.
6. Minor points: `clos` is presumably meant to be `cols`, and the label `'Traning data'` should read `'Training data'`.

If `mianji` is not scaled, gradient descent with `alpha = 0.01` may diverge. In that case normalise the feature or use a smaller learning rate.

The corrected code:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ["SimHei"]  # font with Chinese glyphs; on its own it breaks the minus sign
plt.rcParams['axes.unicode_minus'] = False    # render the minus sign correctly

# Read the Beijing housing-price data
path = 'data.txt'
data = pd.read_csv(path, header=None, names=['mianji', 'jiage'])

# Scatter plot of the raw data
data.plot(kind='scatter', x='mianji', y='jiage')
plt.show()

def computeCost(X, y, theta):
    # Squared-error cost: sum((X @ theta.T - y)^2) / (2m)
    inner = np.power((X @ theta.T) - y, 2)
    return np.sum(inner) / (2 * len(X))

data.insert(0, 'Ones', 1)               # intercept column of ones
cols = data.shape[1]
X = data.iloc[:, 0:cols-1].values       # shape (m, 2)
y = data.iloc[:, cols-1:cols].values    # shape (m, 1)
theta = np.array([[0.0, 0.0]])          # shape (1, 2)
computeCost(X, y, theta)

def gradientDescent(X, y, theta, alpha, iters):
    temp = np.zeros(theta.shape)
    parameters = theta.shape[1]
    cost = np.zeros(iters)
    for i in range(iters):
        error = (X @ theta.T) - y                 # shape (m, 1)
        for j in range(parameters):
            term = np.multiply(error, X[:, [j]])  # X[:, [j]] keeps shape (m, 1)
            temp[0, j] = theta[0, j] - ((alpha / len(X)) * np.sum(term))
        theta = temp.copy()                       # copy so theta and temp do not share memory
        cost[i] = computeCost(X, y, theta)
    return theta, cost

alpha = 0.01
iters = 1000
g, cost = gradientDescent(X, y, theta, alpha, iters)

# Plot the fitted line over the data (labels are kept in Chinese, hence the SimHei font)
x = np.linspace(data.mianji.min(), data.mianji.max(), 100)
f = g[0, 0] + (g[0, 1] * x)
fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(x, f, 'r', label='北京房价')
ax.scatter(data.mianji, data.jiage, label='Training data')
ax.legend(loc=4)
ax.set_xlabel('房子面积')
ax.set_ylabel('房子价格')
ax.set_title("北京房价格回归图")
plt.show()
```
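One way to sanity-check a gradient-descent fit of this kind is to compare it with the closed-form least-squares solution. The sketch below is only an illustration under stated assumptions: it uses synthetic data rather than `data.txt`, the feature is already scaled to [0, 1], and the names `compute_cost` and `gradient_descent` as well as `alpha = 0.5` are choices made for this example.

```python
import numpy as np

# Synthetic data (an assumption for illustration, not the housing data).
rng = np.random.default_rng(0)
m = 200
area = rng.uniform(0.0, 1.0, size=m)                    # feature scaled to [0, 1]
price = 3.0 + 2.0 * area + rng.normal(0.0, 0.1, size=m)

X = np.column_stack([np.ones(m), area])                 # shape (m, 2), first column is the intercept
y = price.reshape(-1, 1)                                # shape (m, 1)

def compute_cost(X, y, theta):
    """Squared-error cost sum((X @ theta.T - y)^2) / (2m) for theta of shape (1, n)."""
    residual = X @ theta.T - y
    return float(np.sum(residual ** 2) / (2 * len(X)))

def gradient_descent(X, y, theta, alpha, iters):
    """Batch gradient descent with a vectorised update of all parameters at once."""
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta.T - y                         # shape (m, 1)
        grad = (error.T @ X) / len(X)                   # shape (1, n)
        theta = theta - alpha * grad
        cost[i] = compute_cost(X, y, theta)
    return theta, cost

theta_gd, cost = gradient_descent(X, y, np.zeros((1, 2)), alpha=0.5, iters=5000)

# Closed-form reference solution from NumPy's least-squares solver.
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)        # shape (2, 1)

print("gradient descent:", theta_gd.ravel())
print("least squares:   ", theta_ls.ravel())
```

If the two parameter vectors agree closely and the cost falls over the iterations, the update rule and the array shapes are probably right. A large gap usually points to a learning rate that is too big or a broadcasting mistake.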

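1. Given 6 samples, estimate their probability density with the Parzen window method:
   x1 = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2])
   x2 = np.array([3.2, 3.6, 3, 6, 2.5, 1.1])
2. Generate 10000 one-dimensional samples from a normal distribution with mean 0 and variance 1. Randomly select 1, 16, and 256 samples from this set and estimate the probability density with a Parzen window. Compare the estimates obtained for different values of the window width h1.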

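For the first problem, the Parzen window method can be used to estimate the probability density of the given samples. It is a non-parametric density estimation method: the density at a point is the average of window functions centred on the samples, divided by the window width. First define a window function. Here a Gaussian window (the standard normal density) is used.

```python
import numpy as np
from scipy.stats import norm

def parzen_density_estimate(x, data, h):
    """Parzen estimate at the scalar point x from 1-D samples `data`, Gaussian window of width h."""
    data = np.asarray(data, dtype=float)
    u = (x - data) / h                    # distances to every sample, scaled by the window width
    # Average of standard normal windows, divided by h so the estimate integrates to 1
    return np.sum(norm.pdf(u)) / (len(data) * h)
```

This function can now be used to estimate the density at each of the given sample points. The code below treats `x1` and `x2` as two separate one-dimensional sample sets. If they are instead the two coordinates of six two-dimensional samples, a two-dimensional window is needed.

```python
x1 = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2])
x2 = np.array([3.2, 3.6, 3, 6, 2.5, 1.1])
h = 1  # window width

for i in range(len(x1)):
    density = parzen_density_estimate(x1[i], x1, h)
    print(f"The density estimate for x1[{i}] is: {density}")

for i in range(len(x2)):
    density = parzen_density_estimate(x2[i], x2, h)
    print(f"The density estimate for x2[{i}] is: {density}")
```

For the second problem, generate the one-dimensional normal samples and then estimate the density with the same function, evaluating it on a regular grid so that the curves plot cleanly.

```python
import matplotlib.pyplot as plt

# Generate the one-dimensional normal samples
np.random.seed(0)
samples = np.random.normal(loc=0, scale=1, size=10000)

n_samples = [1, 16, 256]          # number of randomly selected samples
h_values = [0.1, 0.5, 1, 2]       # window widths to compare
grid = np.linspace(-4, 4, 400)    # points at which the density is evaluated

for n in n_samples:
    selected_samples = np.random.choice(samples, size=n, replace=False)
    for h in h_values:
        densities = [parzen_density_estimate(x, selected_samples, h) for x in grid]
        plt.plot(grid, densities, label=f"h={h}")
    plt.title(f"Parzen density estimation for {n} samples")
    plt.legend()
    plt.show()
```

This code produces three figures, one per sample size, each showing the density estimates for the different window widths.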
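The second problem speaks of a window width h1. In some textbook treatments of Parzen windows, the width for N samples is derived from h1 as h_N = h1 / sqrt(N), so that the window shrinks as more samples are used. Whether the exercise intends this convention is an assumption. The sketch below follows it with a vectorised Gaussian window, and the helper name `parzen_gaussian`, the h1 values, and the grid are illustrative choices.

```python
import numpy as np

def parzen_gaussian(grid, samples, h):
    """Parzen estimate on every point of `grid` from 1-D `samples`, Gaussian window of width h."""
    u = (grid[:, None] - samples[None, :]) / h            # shape (len(grid), len(samples))
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi) # standard normal density
    return kernel.sum(axis=1) / (len(samples) * h)

rng = np.random.default_rng(0)
population = rng.normal(0.0, 1.0, size=10000)
grid = np.linspace(-6.0, 6.0, 1201)

results = {}
for n in (1, 16, 256):
    chosen = rng.choice(population, size=n, replace=False)
    for h1 in (0.25, 1.0, 4.0):                           # hypothetical h1 values
        h_n = h1 / np.sqrt(n)                             # assumed convention: h_N = h1 / sqrt(N)
        results[(n, h1)] = parzen_gaussian(grid, chosen, h_n)

# Each entry can be plotted against `grid`, e.g. plt.plot(grid, results[(256, 1.0)]).
```

Under this convention a small h1 gives spiky estimates and a large h1 gives oversmoothed ones, while increasing N brings the estimate closer to the true normal density.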
