Pandas_task06 综合练习综合练习
第六次第六次-大作业大作业
一、一、2002 年年-2018 年上海机动车拍照拍卖年上海机动车拍照拍卖
df_car = pd.read_csv('数据集/2002年-2018年上海机动车拍照拍卖.csv')
##1
df = df_car.copy()
df['percent'] = df.apply(lambda x:x['Total number of license issued']/x['Total number of applicants'],axis=1)
for index, row in df.iterrows():
if row['percent']<0.05:
print(row['Date'])
break
##2
groupedyear = dfym.groupby('年份')
def f(df):
data = OrderedDict()
data['LP_max'] = df["lowest price "].max()
data['LP_mean'] = df['lowest price '].mean()
data['LP_075'] = df['lowest price '].quantile(q=0.75)
return pd.Series(data)
groupedyear.apply(f)
##3
df = df_car.copy()
df['年份'] = df['Date'].apply(lambda x : 2000+int(x.split('-')[0]))
df['月份'] = df['Date'].apply(lambda x : x.split('-')[1])
newcolumns = ['年份','月份']+list(df.columns[1:-2])
dfym = df.reindex(columns=newcolumns).copy()
dfym.head()
##4
Month = dfym.iloc[0:12,1].to_list()
result = dfym.melt(id_vars=['年份','月份'],value_vars=['Total number of license issued','lowest price ','avg price','Total number of applicants'],value_name='info')
result.pivot_table(index = ['年份','variable'],columns='月份',values='info',fill_value='-').reindex(columns = Month)
##5
print('[最低价、均值]与上月差额不同号的有:')
for index in dfym.index:
try:
signal = (dfym.loc[index,'lowest price ']-dfym.loc[index+1,'lowest price '])*\
(dfym.loc[index,'avg price'] -dfym.loc[index+1,'avg price'])
if signal<0:
print(dfym.loc[index+1,['年份','月份']])
print('')
except:
break
##6
df6 = dfym.copy()
df6['发行增益']=0
for index in df6.index:
if index<2:continue
df6.loc[index,'发行增益']= df6.loc[index,'Total number of license issued']-(df6.loc[index-1,'Total number of license issued']+df6.loc[index-2,'Total number of license issued'])/2
print("最小",df6.loc[df6["发行增益"] == df6["发行增益"].min()][['年份','月份']].head())
print("最大",df6.loc[df6["发行增益"] == df6["发行增益"].max()][['年份','月份']].head())
(1) 哪一次拍卖的中标率首次小于 5%?
15-May
(2) 按年统计拍卖最低价的下列统计量:最大值、均值、0.75 分位数,要求
显示在同一张表上。
年份年份 max mean 0.75
2002 30800.0 20316.666667 24300.0
2003 38500.0 31983.333333 36300.0
2004 44200.0 29408.333333 38400.0
2005 37900.0 31908.333333 35600.0
2006 39900.0 37058.333333 39525.0
2007 53800.0 45691.666667 48950.0
2008 37300.0 29945.454545 34150.0
2009 36900.0 31333.333333 34150.0
2010 44900.0 38008.333333 41825.0
2011 53800.0 47958.333333 51000.0
2012 68900.0 61108.333333 65325.0
2013 90800.0 79125.000000 82550.0
2014 74600.0 73816.666667 74000.0
2015 85300.0 80575.000000 83450.0
2016 88600.0 85733.333333 87475.0
2017 93500.0 90616.666667 92350.0
2018 89000.0 87825.000000 88150.0
(3) 将第一列时间列拆分成两个列,一列为年份(格式为 20××),另一列为
月份(英语缩写),添加到列表作为第一第二列,并将原表第一列删除,
其他列依次向后顺延。
年份年份 月份月份
Total number of license
issued
lowest
price
avg
price
Total number of
applicants
0 2002 Jan 1400 13600 14735 3718
1 2002 Feb 1800 13100 14057 4590
2 2002 Mar 2000 14300 14662 5190
3 2002 Apr 2300 16000 16334 4806
4 2002 May 2350 17800 18357 4665
(4) 现在将表格行索引设为多级索引,外层为年份,内层为原表格第二至第
五列的变量名,列索引为月份。
年份年份 月份月份 Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2002 Total number of applicants 3718.0 4590 5190.0 4806.0 4665.0 4502.0 3774.0 4640.0 4393.0 4661.0 4021.0 3525.0
Total number of license
issued
1400.0 1800 2000.0 2300.0 2350.0 2800.0 3000.0 3000.0 3200.0 3200.0 3200.0 3600.0
avg price 14735.0 14057 14662.0 16334.0 18357.0 20178.0 20904.0 21601.0 24040.0 27040.0 31721.0 27848.0
lowest price 13600.0 13100 14300.0 16000.0 17800.0 19600.0 19800.0 21000.0 23600.0 26400.0 30800.0 27800.0
2003 Total number of applicants 9442.0 12030 11219.0 8794.0 14634.0 15507.0 11929.0 9315.0 8532.0 9383.0 9849.0 10491.0
… … … … … … … … … … … … … …
2017 lowest price 87600.0 88200 87800.0 89800.0 90100.0 89400.0 92200.0 91600.0 91300.0 93500.0 93100.0 92800.0
2018 Total number of applicants 226316.0 220831 217056.0 204980.0 198627.0 209672.0 202337.0 192755.0 189142.0 181861.0 177355.0 165442.0
Total number of license
issued
12183.0 11098 9855.0 11916.0 10216.0 10775.0 10395.0 10402.0 12712.0 10728.0 11766.0 12850.0