name = 'User_id'
# Rank by discount rate
datas[name_prifix + 'discount_rateRank'] = datas.groupby(name)['discount_rate'].rank(ascending=False)
datas[name_prifix + 'discount_rateRankAscend'] = datas.groupby(name)['discount_rate'].rank(ascending=True)
# Rank by distance
datas[name_prifix + 'distanceRank'] = datas.groupby(name)['Distance'].rank(ascending=False)
datas[name_prifix + 'distanceRankAscend'] = datas.groupby(name)['Distance'].rank(ascending=True)
# Rank by coupon received date
datas[name_prifix + 'date_receivedRank'] = datas.groupby(name)['Date_received'].rank(ascending=False)
datas[name_prifix + 'date_receivedRankAscend'] = datas.groupby(name)['Date_received'].rank(ascending=True)
return datas
Please help me rewrite this using the groupby function.
Time: 2024-02-05 12:04:14
It can be rewritten as follows:
```
def add_rank_features(df, id_col='User_id', rank_cols=['discount_rate', 'Distance', 'Date_received']):
    for col in rank_cols:
        # Rank each value within its id_col group, highest value first.
        df[f'{col}_rank_desc'] = df.groupby(id_col)[col].rank(method='dense', ascending=False)
        # Same ranking, lowest value first.
        df[f'{col}_rank_asc'] = df.groupby(id_col)[col].rank(method='dense', ascending=True)
    return df
```
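Here, `id_col` is the name of the column to group by, and `rank_cols` is the list of columns to rank. `method='dense'` gives tied values the same rank and leaves no gaps between consecutive ranks. Note that this differs from the code in the question, which relies on the default `method='average'`, and that the new columns are named `{col}_rank_desc` and `{col}_rank_asc` rather than being built from `name_prifix`.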
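To make the effect of `method='dense'` concrete, here is a minimal sketch that compares it with the default `method='average'` on a single group containing a tie. The data values are made up for illustration and are not from the question.

```python
import pandas as pd

# Toy data (hypothetical values): one user, with a tie at 0.8.
df = pd.DataFrame({
    'User_id': [1, 1, 1, 1],
    'discount_rate': [0.9, 0.8, 0.8, 0.5],
})

grouped = df.groupby('User_id')['discount_rate']

# Default method='average': tied values share the mean of their positions.
average_rank = grouped.rank(ascending=False).tolist()
# method='dense': tied values share a rank and the next rank follows directly.
dense_rank = grouped.rank(method='dense', ascending=False).tolist()

print(average_rank)  # [1.0, 2.5, 2.5, 4.0]
print(dense_rank)    # [1.0, 2.0, 2.0, 3.0]
```

If the features must match the original code exactly, dropping the `method` argument keeps the default behaviour.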
Related questions
name = 'Merchant_id'
# Rank by discount rate
datas[name_prifix + 'discount_rateRank'] = datas.groupby(name)['discount_rate'].rank(ascending=False)
datas[name_prifix + 'discount_rateRankAscend'] = datas.groupby(name)['discount_rate'].rank(ascending=True)
# Rank by distance
datas[name_prifix + 'distanceRank'] = datas.groupby(name)['Distance'].rank(ascending=False)
datas[name_prifix + 'distanceRankAscend'] = datas.groupby(name)['Distance'].rank(ascending=True)
# Rank by coupon received date
datas[name_prifix + 'date_receivedRank'] = datas.groupby(name)['Date_received'].rank(ascending=False)
datas[name_prifix + 'date_receivedRankAscend'] = datas.groupby(name)['Date_received'].rank(ascending=True)
Please help me rewrite this using groupby.
Sure, I can help you rewrite this code with groupby. You can write it like this:
```
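# `datas`, `name`, and `name_prifix` are assumed to be defined as in the question.
cols = ['discount_rate', 'Distance', 'Date_received']
for col in cols:
    # The original feature names use the lowercased column name, e.g. 'distanceRank'.
    key = name_prifix + col.lower()
    datas[key + 'Rank'] = datas.groupby(name)[col].rank(ascending=False)
    datas[key + 'RankAscend'] = datas.groupby(name)[col].rank(ascending=True)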
```
This way, the loop applies groupby to each column in turn, so the same statement no longer has to be written out for every column.
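One way to confirm that a loop-based rewrite matches the hand-written statements is to run both on a small frame and compare the results. The sketch below assumes a hypothetical prefix `'merchant_'` and made-up data; it lowercases the column name so that the generated feature names agree with those in the question.

```python
import pandas as pd

name, name_prifix = 'Merchant_id', 'merchant_'  # hypothetical values

base = pd.DataFrame({
    'Merchant_id': [10, 10, 20, 20],
    'discount_rate': [0.9, 0.8, 0.7, 0.95],
    'Distance': [1, 3, 0, 5],
    'Date_received': [20160101, 20160115, 20160110, 20160120],
})

# Hand-written version, as in the question.
explicit = base.copy()
explicit[name_prifix + 'discount_rateRank'] = explicit.groupby(name)['discount_rate'].rank(ascending=False)
explicit[name_prifix + 'discount_rateRankAscend'] = explicit.groupby(name)['discount_rate'].rank(ascending=True)
explicit[name_prifix + 'distanceRank'] = explicit.groupby(name)['Distance'].rank(ascending=False)
explicit[name_prifix + 'distanceRankAscend'] = explicit.groupby(name)['Distance'].rank(ascending=True)
explicit[name_prifix + 'date_receivedRank'] = explicit.groupby(name)['Date_received'].rank(ascending=False)
explicit[name_prifix + 'date_receivedRankAscend'] = explicit.groupby(name)['Date_received'].rank(ascending=True)

# Loop version over the same columns.
looped = base.copy()
for col in ['discount_rate', 'Distance', 'Date_received']:
    key = name_prifix + col.lower()
    looped[key + 'Rank'] = looped.groupby(name)[col].rank(ascending=False)
    looped[key + 'RankAscend'] = looped.groupby(name)[col].rank(ascending=True)

# True only if column names, order, dtypes, and values all agree.
same = explicit.equals(looped)
print(same)
```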
name = 'Coupon_id'
# Rank by discount rate
datas[name_prifix + 'discount_rateRank'] = datas.groupby(name)['discount_rate'].rank(ascending=False)
datas[name_prifix + 'discount_rateRankAscend'] = datas.groupby(name)['discount_rate'].rank(ascending=True)
# Rank by distance
datas[name_prifix + 'distanceRank'] = datas.groupby(name)['Distance'].rank(ascending=False)
datas[name_prifix + 'distanceRankAscend'] = datas.groupby(name)['Distance'].rank(ascending=True)
# Rank by coupon received date
datas[name_prifix + 'date_receivedRank'] = datas.groupby(name)['Date_received'].rank(ascending=False)
datas[name_prifix + 'date_receivedRankAscend'] = datas.groupby(name)['Date_received'].rank(ascending=True)
Please help me rewrite this using the groupby function.
Sure! This code can be rewritten with the `groupby` function as follows:
```
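# `datas`, `name`, and `name_prifix` are assumed to be defined as in the question.
cols_to_rank = ['discount_rate', 'Distance', 'Date_received']
# Build the GroupBy object once and reuse it for every column.
grouped = datas.groupby(name)
for col in cols_to_rank:
    # The original feature names use the lowercased column name, e.g. 'distanceRank'.
    key = name_prifix + col.lower()
    datas[key + 'Rank'] = grouped[col].rank(ascending=False)
    datas[key + 'RankAscend'] = grouped[col].rank(ascending=True)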
```
This removes the repeated statements and makes the code more concise.
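Since the three questions on this page differ only in the value of `name`, the same loop can be wrapped in a helper and called once per grouping key. The following is a minimal sketch under stated assumptions: the helper name `add_group_ranks`, the prefix scheme (`name.lower() + '_'`), and the sample data are all hypothetical, because the question does not show how `name_prifix` is defined.

```python
import pandas as pd

RANK_COLS = ['discount_rate', 'Distance', 'Date_received']

def add_group_ranks(datas, name, name_prifix):
    """Add descending and ascending within-group ranks for each column in RANK_COLS."""
    grouped = datas.groupby(name)
    for col in RANK_COLS:
        # Lowercase the column name to match the feature names in the question.
        key = name_prifix + col.lower()
        datas[key + 'Rank'] = grouped[col].rank(ascending=False)
        datas[key + 'RankAscend'] = grouped[col].rank(ascending=True)
    return datas

# Toy data; all values are made up for illustration.
datas = pd.DataFrame({
    'User_id': [1, 1, 2, 2],
    'Merchant_id': [10, 20, 10, 10],
    'Coupon_id': [100, 100, 200, 300],
    'discount_rate': [0.9, 0.8, 0.7, 0.95],
    'Distance': [1, 3, 0, 5],
    'Date_received': [20160101, 20160115, 20160110, 20160120],
})

# Hypothetical prefix scheme: the lowercased key name plus an underscore.
for name in ['User_id', 'Merchant_id', 'Coupon_id']:
    datas = add_group_ranks(datas, name, name_prifix=name.lower() + '_')

print(datas.filter(like='user_id_').head())
```

Each call adds six columns, so the three grouping keys together add eighteen rank features to the frame.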