What in pandas can replace this function: def VALIDITY(timelist,thedata): datalist1=[] datalist2=[] thedata1=thedata.values for i in range(1,len(timelist)): TFlis=[] for j in range(len(thedata1)): if timelist[i-1]<=thedata1[j][2]<timelist[i]: TFlis.append(thedata1[j][6]) datalist1.append(TFlis) for i in datalist1: datalist2.append((i.count("T"))/len(i)) return datalist2
Time: 2023-06-10 09:05:47 Views: 50
You can get the same result with pandas' `groupby`, for example:
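```
import pandas as pd
def VALIDITY(timelist, thedata):
    # right=False gives [start, end) bins, matching the original comparison
    bins = pd.cut(thedata['timestamp'], timelist, right=False)
    # share of 'T' per bin; an empty bin yields NaN
    return thedata['TF'].eq('T').groupby(bins, observed=False).mean().tolist()
```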
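Here `pd.cut` bins the rows by timestamp, and the per-group aggregation computes the share of `T` values in each bin. `timestamp` and `TF` stand for the columns the original function reads by position (indices 2 and 6), so substitute the real column names. `pd.cut` uses right-closed intervals by default, while the original loop tests `[start, end)`; passing `right=False` reproduces that. Replacing the nested Python loops with one grouped operation is typically much faster on large frames.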
Note: thanks to the user who pointed out the error concerning `TFlis` and `datalist1`; it has been corrected.
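To check that the grouped version agrees with the loop, here is a minimal sketch on made-up data. The frame, its two column names (`timestamp`, `TF`), and the helper names `validity_loop` and `validity_groupby` are assumptions for illustration, not part of the original question.

```python
import pandas as pd

def validity_loop(timelist, times, flags):
    """Reference version: share of "T" flags in each [start, end) bin."""
    out = []
    for lo, hi in zip(timelist[:-1], timelist[1:]):
        in_bin = [f for t, f in zip(times, flags) if lo <= t < hi]
        # Like the original, this raises ZeroDivisionError on an empty bin.
        out.append(in_bin.count("T") / len(in_bin))
    return out

def validity_groupby(timelist, df):
    """Grouped version: left-closed bins, mean of the boolean flag per bin."""
    bins = pd.cut(df["timestamp"], bins=timelist, right=False)
    return df["TF"].eq("T").groupby(bins, observed=False).mean().tolist()

# Hypothetical sample: one row sits exactly on the 01:00 bin edge.
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-01-01 00:10", "2023-01-01 00:50", "2023-01-01 01:00",
        "2023-01-01 01:30", "2023-01-01 02:15", "2023-01-01 02:45",
    ]),
    "TF": ["T", "F", "T", "T", "F", "F"],
})
timelist = list(pd.date_range("2023-01-01 00:00", "2023-01-01 03:00",
                              freq=pd.Timedelta(hours=1)))

loop_result = validity_loop(timelist, df["timestamp"].tolist(), df["TF"].tolist())
groupby_result = validity_groupby(timelist, df)
print(loop_result)
print(groupby_result)
```

The row at 01:00 is the interesting case: with the default `right=True` it would be counted in the first bin rather than the second, and the two versions would disagree.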
Related questions
How can pandas functions replace the following code: def generate_time_series(start,end,t1,t2): current = start end_item = end time_delta = datetime.timedelta(days=t1,hours=t2) time_series = [current] while current < end_item: next1 = current + time_delta time_series.append(next1) current = next1 return time_series time_hour=generate_time_series(data1.loc[0]['FDT_TIME'],data1.loc[len(data1)-1]['FDT_TIME'],0,1) time_day=generate_time_series(data1.loc[0]['FDT_TIME'],data1.loc[len(data1)-1]['FDT_TIME'],1,0) def VALIDITY(timelist,thedata): datalist1=[] datalist2=[] thedata1=thedata.values for i in range(1,len(timelist)): TFlis=[] for j in range(len(thedata1)): if timelist[i-1]<=thedata1[j][2]<timelist[i]: TFlis.append(thedata1[j][6]) datalist1.append(TFlis) for i in datalist1: datalist2.append((i.count("T"))/len(i)) return datalist2 x1=[] for i in range(0,len(VALIDITY(time_hour, data1))): x1.append(i) x2=[] for i in range(0,len(VALIDITY(time_day, data1))): x2.append(i)
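`generate_time_series` can be replaced with pandas' `date_range`. One difference to keep in mind: `date_range` stops at the last step that does not exceed `end`, whereas the original loop appends one more point when `end` is not an exact multiple of the step.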
# 'h' is the hourly alias; the uppercase 'H' is deprecated since pandas 2.2
time_hour = pd.date_range(start=data1.loc[0]['FDT_TIME'], end=data1.loc[len(data1)-1]['FDT_TIME'], freq='h')
time_day = pd.date_range(start=data1.loc[0]['FDT_TIME'], end=data1.loc[len(data1)-1]['FDT_TIME'], freq='D')
Then `VALIDITY` can be simplified with pandas' `groupby` and an aggregation:
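def VALIDITY(timelist, thedata):
    # 'TF' stands for the flag column (index 6 in the original loop)
    # right=False gives [start, end) bins, matching the original comparison
    bins = pd.cut(thedata['FDT_TIME'], timelist, right=False)
    # share of "T" per bin; an empty bin yields NaN
    return thedata['TF'].eq('T').groupby(bins, observed=False).mean().tolist()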
x1 = list(range(len(VALIDITY(time_hour, data1))))
x2 = list(range(len(VALIDITY(time_day, data1))))
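The pieces above can be tried end to end on a small stand-in for `data1`. This is only a sketch: the sample rows and the `TF` column name are assumptions, and `validity` returns the grouped Series rather than a list so that the bin labels stay visible.

```python
import pandas as pd

# Toy stand-in for data1; only the two columns used here are included.
data1 = pd.DataFrame({
    "FDT_TIME": pd.to_datetime([
        "2023-03-01 00:00", "2023-03-01 00:20", "2023-03-01 02:10",
        "2023-03-01 02:40", "2023-03-01 03:00",
    ]),
    "TF": ["T", "T", "F", "T", "F"],
})

def validity(timelist, thedata):
    """Share of "T" per [start, end) bin, indexed by the bin interval."""
    bins = pd.cut(thedata["FDT_TIME"], bins=timelist, right=False)
    return thedata["TF"].eq("T").groupby(bins, observed=False).mean()

first = data1["FDT_TIME"].iloc[0]
last = data1["FDT_TIME"].iloc[-1]
# A Timedelta step sidesteps differences in frequency-string aliases.
time_hour = pd.date_range(start=first, end=last, freq=pd.Timedelta(hours=1))

ratios = validity(time_hour, data1)
x1 = list(range(len(ratios)))  # positional x-axis values, one per bin
print(ratios.tolist())
print(x1)
```

Two things show up in this sample. The 01:00 to 02:00 bin has no rows, so its ratio is NaN where the original loop would raise `ZeroDivisionError`. The last row lies exactly on the final bin edge and therefore falls outside every left-closed bin, which is also what the original comparison does when `end` lands exactly on a step.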
What in pandas can replace this function: def generate_time_series(start,end,t1,t2): current = start end_item = end time_delta = datetime.timedelta(days=t1,hours=t2) time_series = [current] while current < end_item: next1 = current + time_delta time_series.append(next1) current = next1 return time_series
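You can use pandas' `date_range` instead. It generates a time series from a start time, an end time, and a step. The code below produces the same series as your function whenever `end` falls exactly on a step; otherwise `date_range` stops before `end`, while your loop appends one more point past it: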
import datetime
import pandas as pd
start = datetime.datetime(2022, 1, 1)
end = datetime.datetime(2022, 1, 10)
t1 = 1  # days
t2 = 0  # hours
# A Timedelta step avoids composing a frequency string such as '1D0H'
time_series = pd.date_range(start=start, end=end, freq=pd.Timedelta(days=t1, hours=t2)).tolist()
print(time_series)
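If the extra point past `end` matters (for example, so that the last rows still fall inside a bin), one option is to compute the number of points and pass `periods` instead of `end`. The sketch below compares that against the original loop; `generate_time_series_pd` is a hypothetical helper name, and the dates are made up.

```python
import datetime
import math
import pandas as pd

def generate_time_series(start, end, t1, t2):
    """Original loop, kept as the reference."""
    step = datetime.timedelta(days=t1, hours=t2)
    series = [start]
    current = start
    while current < end:
        current = current + step
        series.append(current)
    return series

def generate_time_series_pd(start, end, t1, t2):
    """Hypothetical helper: same points as the loop, built with pd.date_range."""
    step = pd.Timedelta(days=t1, hours=t2)
    span = pd.Timestamp(end) - pd.Timestamp(start)
    # Steps needed to reach or pass `end`; never negative.
    n_steps = max(math.ceil(span / step), 0)
    # One extra period accounts for the starting point itself.
    return pd.date_range(start=start, periods=n_steps + 1, freq=step).tolist()

start = datetime.datetime(2022, 1, 1)
end = datetime.datetime(2022, 1, 3, 12)  # not a whole number of days after start
loop_points = generate_time_series(start, end, 1, 0)
pandas_points = generate_time_series_pd(start, end, 1, 0)
print(loop_points[-1], pandas_points[-1])
```

Both versions end on 2022-01-04 here, one step past `end`, whereas `pd.date_range(start, end, freq=...)` would stop at 2022-01-03. The helper returns `pd.Timestamp` objects rather than `datetime.datetime`, which compare equal to the loop's values.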