DataFrame objects and the data.assign() function

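A DataFrame is a data structure in the pandas library, similar to a table in Excel. It consists of rows and columns, and each column can hold a different data type (for example integers, floats, or strings). DataFrames are used for data cleaning, data analysis, and data visualization.

`assign()` is a DataFrame method that adds new columns to an existing DataFrame. It accepts one or more keyword arguments; each keyword is the name of a new column and its value is the data for that column. For example, the following code adds a column named new_column to a DataFrame named df and sets every value in that column to 1:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df = df.assign(new_column=1)
```

df now looks like this:

```
   A  B  new_column
0  1  4           1
1  2  5           1
2  3  6           1
```

Note that `assign()` returns a new DataFrame and leaves the original unchanged, so the result has to be assigned to a variable. `assign()` has no `inplace` parameter: because every keyword argument is treated as a column name, `df.assign(new_column=1, inplace=True)` would add a column called inplace filled with True. To add a column to the original DataFrame directly, use item assignment instead:

```python
df['new_column'] = 1
```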
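The description above says that `assign()` can take several arguments at once. The following is a minimal sketch of that case; the column names total and ratio are made up for illustration. A value may also be a callable, which receives the DataFrame being built, so a later column can refer to an earlier one from the same call.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Add two columns in one call. Each lambda receives the DataFrame that is
# being assembled, so 'ratio' can use the 'total' column defined just before.
# 'total' and 'ratio' are illustrative names, not part of the original example.
result = df.assign(
    total=lambda d: d['A'] + d['B'],
    ratio=lambda d: d['total'] / d['B'],
)

print(result)
print(df.columns.tolist())  # the original df still has only A and B
```

Because `assign()` returns a copy, it fits well in method chains where the original data should stay untouched.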

Read the stock price data of Alphabet Inc. (GOOG) for the period between April 01, 2020 and October 01, 2020 from the file 'alphabet.csv' into a pandas DataFrame. You can download it from Blackboard in session 12.

(1) Convert the "Date" column into datetime format and set it as the index column. Output the DataFrame containing the data for month x.

(2) Calculate the net increase of the stock price for each day, using the closing price minus the opening price, and append it to the DataFrame as a new column "Net Increase". Output the DataFrame containing the data for month x.

(3) Find the day that has the highest opening price, and calculate the mean of the closing price over all days.

NOTE: You should add the following code to your program to ensure that all columns are displayed when printing. Otherwise, the system will consider it incorrect.

- `pd.set_option('display.max_columns', 1000)`
- `pd.set_option('display.width', 1000)`
- `pd.set_option('display.max_colwidth', 1000)`

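The corresponding Python code is as follows:

```python
import pandas as pd

# Set the pandas display options first, so that all columns are shown when printing
pd.set_option('display.max_columns', 1000)
pd.set_option('display.width', 1000)
pd.set_option('display.max_colwidth', 1000)

# Read the CSV file
df = pd.read_csv('alphabet.csv')

# Convert the "Date" column to datetime format and set it as the index
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

# (1) Output the data for April 2020
print("Data for April 2020:")
print(df.loc['2020-04'])

# (2) Compute the daily net increase and add it as the new column "Net Increase"
df['Net Increase'] = df['Close'] - df['Open']

# Output the data for May 2020
print("Data for May 2020:")
print(df.loc['2020-05'])

# (3) Find the day with the highest opening price and the mean closing price over all days
max_open = df['Open'].idxmax()
mean_close = df['Close'].mean()
print(f"The day with the highest opening price was {max_open.date()}, with an opening price of {df.loc[max_open, 'Open']:.2f}.")
print(f"The mean closing price for all days was {mean_close:.2f}.")
```

In this example, the pandas `read_csv` function reads the data in `alphabet.csv` into a DataFrame. The "Date" column is then converted to datetime format and set as the index. The display options are set before anything is printed, since they only affect output produced after they are set. The code takes April and May 2020 as example values for "month x", computes the daily net increase, and adds it as the new column "Net Increase". Finally, it finds the day with the highest opening price and computes the mean closing price over all trading days.

Rows for a given month are selected with `df.loc['2020-04']`. Indexing the DataFrame directly with `df['2020-04']` is treated as a column lookup in pandas 2.0 and later and raises a KeyError. The column names `Open` and `Close` are assumed to match the headers in the CSV file.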
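Since `alphabet.csv` is not reproduced here, the same steps can be tried on a few hand-written rows. The sketch below uses made-up prices (they are not real GOOG data) and assumes the columns are called Date, Open, and Close, as in the answer above.

```python
import pandas as pd

# Hypothetical sample rows standing in for alphabet.csv; the prices are invented.
sample = pd.DataFrame({
    'Date': ['2020-04-01', '2020-04-02', '2020-05-01', '2020-05-04'],
    'Open': [1100.0, 1110.0, 1320.0, 1310.0],
    'Close': [1105.0, 1120.0, 1325.0, 1300.0],
})

# Convert "Date" to datetime and use it as the index
sample['Date'] = pd.to_datetime(sample['Date'])
sample = sample.set_index('Date')

# A column name containing a space cannot be written as a keyword,
# so it is passed to assign() through dictionary unpacking.
sample = sample.assign(**{'Net Increase': sample['Close'] - sample['Open']})

# Partial string indexing on the DatetimeIndex selects one month
april = sample.loc['2020-04']

max_open_day = sample['Open'].idxmax()  # index label of the highest opening price
mean_close = sample['Close'].mean()

print(april)
print(max_open_day.date(), mean_close)
```

Here `assign()` is used with dictionary unpacking only to show that it can produce the "Net Increase" column as well; plain item assignment, as in the answer, is equally valid.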

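Using the pandas module, create a DataFrame object df_data that records the attribute information of all films. The object's index is the film title (the list films) together with the film's era, and its columns are all the features (the list all_attrs). Finally, add a column named "评分" (rating) to the object. Film titles: 肖申克的救赎 (The Shawshank Redemption), 霸王别姬 (Farewell My Concubine). Film attributes: 犯罪 (crime), 剧情 (drama), 爱情 (romance), 同性 (same-sex). 肖申克的救赎: 1 0 0 0. 霸王别姬: 0 1 1 1. Ratings: 9.7, 9.6.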

First, import the pandas module:

```python
import pandas as pd
```

Then use the pandas `DataFrame` constructor to create the DataFrame and set its index and columns. The question does not list the years, so the index here uses the film titles only:

```python
films = ['肖申克的救赎', '霸王别姬']
all_attrs = ['犯罪', '剧情', '爱情', '同性']
data = [[1, 0, 0, 0], [0, 1, 1, 1]]
df_data = pd.DataFrame(data, index=films, columns=all_attrs)
```

At this point, df_data contains:

| | 犯罪 | 剧情 | 爱情 | 同性 |
|--- |--- |--- |--- |--- |
| 肖申克的救赎 | 1 | 0 | 0 | 0 |
| 霸王别姬 | 0 | 1 | 1 | 1 |

Next, use the DataFrame's `assign` method to add a column named "评分" to df_data with the values [9.7, 9.6]:

```python
df_data = df_data.assign(评分=[9.7, 9.6])
```

Finally, df_data contains:

| | 犯罪 | 剧情 | 爱情 | 同性 | 评分 |
|--- |--- |--- |--- |--- |--- |
| 肖申克的救赎 | 1 | 0 | 0 | 0 | 9.7 |
| 霸王别姬 | 0 | 1 | 1 | 1 | 9.6 |
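The question also asks for the film's era to be part of the index, but gives no years. One possible way to cover that requirement is a two-level `MultiIndex`. In the sketch below, the years and the level names film and year are assumptions added for illustration; they do not come from the question.

```python
import pandas as pd

films = ['肖申克的救赎', '霸王别姬']
years = [1994, 1993]  # assumed release years; the question does not state them
all_attrs = ['犯罪', '剧情', '爱情', '同性']
data = [[1, 0, 0, 0], [0, 1, 1, 1]]

# Build a two-level index from the titles and the years.
# The level names 'film' and 'year' are hypothetical labels.
index = pd.MultiIndex.from_arrays([films, years], names=['film', 'year'])
df_data = pd.DataFrame(data, index=index, columns=all_attrs)

# Add the rating column exactly as in the single-index version
df_data = df_data.assign(评分=[9.7, 9.6])

print(df_data)
```

A row is then addressed by a (title, year) tuple, for example `df_data.loc[('霸王别姬', 1993)]`.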
