How can pandas functions replace the following code?

def generate_time_series(start, end, t1, t2):
    current = start
    end_item = end
    time_delta = datetime.timedelta(days=t1, hours=t2)
    time_series = [current]
    while current < end_item:
        next1 = current + time_delta
        time_series.append(next1)
        current = next1
    return time_series

time_hour = generate_time_series(data1.loc[0]['FDT_TIME'], data1.loc[len(data1)-1]['FDT_TIME'], 0, 1)
time_day = generate_time_series(data1.loc[0]['FDT_TIME'], data1.loc[len(data1)-1]['FDT_TIME'], 1, 0)

def VALIDITY(timelist, thedata):
    datalist1 = []
    datalist2 = []
    thedata1 = thedata.values
    for i in range(1, len(timelist)):
        TFlis = []
        for j in range(len(thedata1)):
            if timelist[i-1] <= thedata1[j][2] < timelist[i]:
                TFlis.append(thedata1[j][6])
        datalist1.append(TFlis)
    for i in datalist1:
        datalist2.append((i.count("T")) / len(i))
    return datalist2

x1 = []
for i in range(0, len(VALIDITY(time_hour, data1))):
    x1.append(i)
x2 = []
for i in range(0, len(VALIDITY(time_day, data1))):
    x2.append(i)
Posted: 2023-06-09 21:06:54
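The pandas date_range function can replace generate_time_series. One difference to keep in mind: date_range stops at the last step that does not exceed end, whereas the original loop appends one extra step whenever end does not fall exactly on a step.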
# In pandas >= 2.2 the alias 'H' is deprecated; use the lowercase 'h' there.
time_hour = pd.date_range(start=data1.loc[0]['FDT_TIME'], end=data1.loc[len(data1)-1]['FDT_TIME'], freq='H')
time_day = pd.date_range(start=data1.loc[0]['FDT_TIME'], end=data1.loc[len(data1)-1]['FDT_TIME'], freq='D')
Then pandas' cut, groupby, and agg functions can simplify the VALIDITY function:
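def VALIDITY(timelist, thedata):
    # 'TF' is a placeholder name for the "T"/"F" flag column (position 6 in the original code).
    # right=False makes the bins half-open, [t[i-1], t[i]), as in the original comparison.
    bins = pd.cut(thedata['FDT_TIME'], timelist, right=False)
    # Empty bins yield NaN here, where the original loop raised ZeroDivisionError.
    return (thedata['TF'] == 'T').groupby(bins, observed=False).mean().tolist()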
x1 = list(range(len(VALIDITY(time_hour, data1))))
x2 = list(range(len(VALIDITY(time_day, data1))))
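To check the binning behaviour on something small, here is a minimal sketch with made-up data. The column names FDT_TIME and TF, the toy timestamps, and the lowercase helper name validity are all assumptions for illustration; the real data1 may be laid out differently.

```python
import pandas as pd

# Toy stand-in for data1; column names and values are assumptions.
data1 = pd.DataFrame({
    "FDT_TIME": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 00:20", "2023-01-01 00:40",
        "2023-01-01 01:10", "2023-01-01 01:50", "2023-01-01 03:00",
    ]),
    "TF": ["T", "F", "T", "F", "F", "T"],
})

def validity(timelist, thedata):
    # right=False gives half-open bins [t[i-1], t[i]), like the original loop.
    bins = pd.cut(thedata["FDT_TIME"], bins=timelist, right=False)
    # observed=False keeps empty bins; their share is NaN rather than an error.
    return (thedata["TF"] == "T").groupby(bins, observed=False).mean()

# Hourly bin edges covering the toy data.
edges = pd.date_range("2023-01-01 00:00", "2023-01-01 04:00",
                      freq=pd.Timedelta(hours=1))
shares = validity(edges, data1)
print(shares.tolist())
```

The hour from 02:00 to 03:00 contains no rows, so its share comes out as NaN. Whether to keep, drop, or fill such bins depends on how the result is plotted afterwards.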
Related questions
What in pandas can replace this function?

def generate_time_series(start, end, t1, t2):
    current = start
    end_item = end
    time_delta = datetime.timedelta(days=t1, hours=t2)
    time_series = [current]
    while current < end_item:
        next1 = current + time_delta
        time_series.append(next1)
        current = next1
    return time_series
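You can use the pandas date_range function instead. It generates a time series from a start time, an end time, and an interval. For example, the following code produces the same series as your function, provided end falls exactly on a step (otherwise your loop appends one extra element past end):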
import pandas as pd
import datetime
start = datetime.datetime(2022, 1, 1)
end = datetime.datetime(2022, 1, 10)
t1 = 1
t2 = 0
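# Passing a timedelta as freq avoids building an alias string such as '1D0H'.
time_series = pd.date_range(start=start, end=end, freq=datetime.timedelta(days=t1, hours=t2)).tolist()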
print(time_series)
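When end does not land exactly on a step, the loop and date_range differ by one element. The sketch below makes that visible and shows one possible way to reproduce the loop's behaviour. The helper date_range_like_loop is hypothetical (not a pandas function) and assumes start <= end and a non-zero step.

```python
import datetime
import pandas as pd

def generate_time_series(start, end, t1, t2):
    # Condensed version of the loop from the question, kept as a reference.
    step = datetime.timedelta(days=t1, hours=t2)
    series = [start]
    while series[-1] < end:
        series.append(series[-1] + step)
    return series

def date_range_like_loop(start, end, t1, t2):
    # Hypothetical helper: date_range plus one extra step when end is off the grid.
    step = pd.Timedelta(days=t1, hours=t2)
    rng = pd.date_range(start=start, end=end, freq=step)
    if rng[-1] < end:
        rng = rng.append(pd.DatetimeIndex([rng[-1] + step]))
    return rng

start = datetime.datetime(2022, 1, 1)
end = datetime.datetime(2022, 1, 3, 12)  # deliberately not a whole number of days

loop_result = generate_time_series(start, end, 1, 0)
plain = pd.date_range(start=start, end=end, freq=pd.Timedelta(days=1))
padded = date_range_like_loop(start, end, 1, 0)

print(len(loop_result), len(plain), len(padded))
```

If the extra trailing element is not needed downstream, plain date_range is the simpler choice.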
Replace the following statements with pandas functions:

def generate_time_series(start, end, t1, t2):
    current = start
    end_item = end
    time_delta = datetime.timedelta(days=t1, hours=t2)
    time_series = [current]
    while current < end_item:
        next1 = current + time_delta
        time_series.append(next1)
        current = next1
    return time_series

time_hour = generate_time_series(data1.loc[0]['FDT_TIME'], data1.loc[len(data1)-1]['FDT_TIME'], 0, 1)
time_require = [time_hour[160], time_hour[162]]
data1_require = data1.drop(data1[(data1['FDT_TIME'] < time_require[0])].index)
data1_require = data1_require.drop(data1_require[(data1_require['FDT_TIME'] > time_require[1])].index)
data1_require = data1_require[['FINT_SPEED']]
You can use pandas.date_range() in place of generate_time_series() to build the time series, and a single boolean mask with pandas.DataFrame.loc[] in place of the two drop() calls. The code is as follows:
import pandas as pd
import datetime
start = data1.loc[0]['FDT_TIME']
end = data1.loc[len(data1)-1]['FDT_TIME']
time_delta = datetime.timedelta(days=0, hours=1)
time_hour = pd.date_range(start=start, end=end, freq=time_delta)
time_require = [time_hour[160], time_hour[162]]
data1_require = data1.loc[(data1['FDT_TIME'] >= time_require[0]) & (data1['FDT_TIME'] <= time_require[1])]
data1_require = data1_require[['FINT_SPEED']]
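The same window filter can also be written with Series.between, which is inclusive on both ends by default. This is a minimal sketch on made-up data; the toy values and the lower/upper bounds are assumptions standing in for time_require[0] and time_require[1].

```python
import pandas as pd

# Toy stand-in for data1; the column names come from the question.
data1 = pd.DataFrame({
    "FDT_TIME": pd.date_range("2023-01-01", periods=6,
                              freq=pd.Timedelta(minutes=30)),
    "FINT_SPEED": [10, 12, 11, 15, 14, 13],
})

# Stand-ins for time_require[0] and time_require[1].
lower = pd.Timestamp("2023-01-01 00:30")
upper = pd.Timestamp("2023-01-01 01:30")

# between() keeps rows with lower <= FDT_TIME <= upper.
mask = data1["FDT_TIME"].between(lower, upper)

# Select the rows and the single column in one .loc call.
data1_require = data1.loc[mask, ["FINT_SPEED"]]
print(data1_require)
```

Selecting rows and columns in one .loc call avoids the intermediate frame, though the two-step version in the answer gives the same result.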