How to rewrite `out.loc[:, '涨跌'] = out.apply(lambda row: "涨" if row[max_col[3]] > row[min_col[3]] else "跌" if row[max_col[3]] < row[min_col[3]] else "\\", axis=1)` using a `def` function
Posted: 2023-09-21 10:06:24 Views: 39
The lambda can be rewritten as a named function, `get_zhang_die`, as follows:
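```
def get_zhang_die(row, max_col, min_col):
    # Compare the two columns named by max_col[3] and min_col[3] for this row.
    if row[max_col[3]] > row[min_col[3]]:
        return "涨"  # up
    elif row[max_col[3]] < row[min_col[3]]:
        return "跌"  # down
    else:
        return "\\"  # unchanged: a single backslash, as in the original lambda

# axis=1 calls the function once per row; args supplies the extra parameters.
out.loc[:, '涨跌'] = out.apply(get_zhang_die, args=(max_col, min_col), axis=1)
```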
Here the `args` parameter of `apply` passes `max_col` and `min_col` to `get_zhang_die` as extra positional arguments after `row`.
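To check that the `def` version behaves like the original lambda, a minimal sketch on made-up data can help. The column names (`max_0` … `max_3`, `min_0` … `min_3`) and the sample values below are assumptions for illustration only; the real `out`, `max_col`, and `min_col` come from the asker's code.

```python
import pandas as pd


def get_zhang_die(row, max_col, min_col):
    """Label one row by comparing the columns named max_col[3] and min_col[3]."""
    if row[max_col[3]] > row[min_col[3]]:
        return "涨"  # up
    elif row[max_col[3]] < row[min_col[3]]:
        return "跌"  # down
    else:
        return "\\"  # unchanged


# Hypothetical column-name lists and data; only index 3 of each list is used.
max_col = ["max_0", "max_1", "max_2", "max_3"]
min_col = ["min_0", "min_1", "min_2", "min_3"]
out = pd.DataFrame({"max_3": [5.0, 2.0, 3.0], "min_3": [4.0, 6.0, 3.0]})

# Named-function version: extra arguments travel through `args`.
by_def = out.apply(get_zhang_die, args=(max_col, min_col), axis=1)

# Original lambda version, kept for comparison.
by_lambda = out.apply(
    lambda row: "涨" if row[max_col[3]] > row[min_col[3]]
    else "跌" if row[max_col[3]] < row[min_col[3]]
    else "\\",
    axis=1,
)

assert by_def.equals(by_lambda)
out.loc[:, "涨跌"] = by_def
print(out)
```

Passing the lists through `args` keeps the function free of global state, so it can be reused with other column lists.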
Related questions
Explain each line of this code: ori_data = pd.read_csv('F:/patient.csv') ori_data.loc[:, 'id'] = ori_data.loc[:, 'patient_id'].apply(lambda x: x[:7]) index = set(ori_data['id']) column = list(ori_data.columns) del column[0] del column[-1] average = pd.DataFrame(index=index, columns=column) for k in column: result = ori_data.groupby('id')[k].mean() average.loc[:, k] = result column = list(average.columns) data = average.loc[:, column[0]:column[-3]] # independent variables target = average.loc[:, ['TIMEsurvival', 'EVENTdeath']]
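Sure. This code processes the data in 'F:/patient.csv' and extracts the independent variables and the target variables. Each statement is explained below:
1. `ori_data = pd.read_csv('F:/patient.csv')`: reads 'F:/patient.csv' from disk into a DataFrame named ori_data.
2. `ori_data.loc[:, 'id'] = ori_data.loc[:, 'patient_id'].apply(lambda x: x[:7])`: adds an 'id' column to the DataFrame, where each value is the first 7 characters of 'patient_id'. Rows that share this prefix are treated as the same patient in the grouping below.
3. `index = set(ori_data['id'])`: collects the unique id values into a set.
4. `column = list(ori_data.columns)`: stores all column names in a list.
5. `del column[0]`: removes the first column name, presumably 'patient_id', which the 'id' column now replaces.
6. `del column[-1]`: removes the last column name. Because step 2 appended 'id' at the end (assuming the CSV had no 'id' column already), this removes 'id', the grouping key, not 'EVENTdeath'; 'EVENTdeath' has to remain because step 11 selects it.
7. `average = pd.DataFrame(index=index, columns=column)`: creates an empty DataFrame named 'average' with one row per patient id and one column per remaining feature.
8. `for k in column: result = ori_data.groupby('id')[k].mean() average.loc[:, k] = result`: for each feature column, computes the mean per patient id and stores it in the matching column of 'average'.
9. `column = list(average.columns)`: stores the column names of 'average' in a list.
10. `data = average.loc[:, column[0]:column[-3]]`: extracts the independent variables. Label slices with `.loc` include both endpoints, so this selects every column from the first through the third from last, leaving out the final two columns, 'TIMEsurvival' and 'EVENTdeath'.
11. `target = average.loc[:, ['TIMEsurvival', 'EVENTdeath']]`: extracts the target variables, the last two columns 'TIMEsurvival' and 'EVENTdeath'.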
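The steps above can also be sketched more compactly with a single `groupby`. This is not the asker's code: the small DataFrame below is a synthetic stand-in for 'F:/patient.csv', and the feature column names (`feature_1`, `feature_2`) and id format are assumptions for illustration.

```python
import pandas as pd

# Synthetic stand-in for patient.csv (hypothetical values and feature names).
ori_data = pd.DataFrame({
    "patient_id": ["P000001_a", "P000001_b", "P000002_a"],
    "feature_1": [1.0, 3.0, 5.0],
    "feature_2": [10.0, 20.0, 30.0],
    "TIMEsurvival": [100.0, 100.0, 50.0],
    "EVENTdeath": [1.0, 1.0, 0.0],
})

# Grouping key: the first 7 characters of patient_id.
ori_data["id"] = ori_data["patient_id"].str[:7]

# One groupby replaces the empty-frame-plus-loop construction:
# every remaining column is averaged per patient id.
average = ori_data.drop(columns="patient_id").groupby("id").mean()

# Split by column name instead of by position.
target_cols = ["TIMEsurvival", "EVENTdeath"]
data = average.drop(columns=target_cols)  # independent variables
target = average[target_cols]             # target variables

print(data)
print(target)
```

Selecting the target columns by name avoids depending on the column order that the `column[0]:column[-3]` slice relies on.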
Make the following code run in Jupyter without errors: import pandas as pd df = pd.read_csv('stock_data.csv') df['four_days_increase'] = df['close'].rolling(window=4).apply(lambda x: all(x[i] < x[i+1] for i in range(3))) * 1 df['three_days_decrease'] = df['close'].rolling(window=3).apply(lambda x: all(x[i] > x[i+1] for i in range(2))) * 1 capital = 1000000 max_stock_per_day = 10 max_stock_value = 100000 start_date = '2020-01-01' end_date = '2023-01-01' df = df[(df['date'] >= start_date) & (df['date'] < end_date)] df = df.reset_index(drop=True) hold_stock = [] for i, row in df.iterrows(): if len(hold_stock) > 0: sell_stock = [] for stock in hold_stock: if i - stock['buy_day'] >= 3: capital += stock['buy_price'] * stock['buy_qty'] * (1 - 0.002) sell_stock.append(stock) hold_stock = [stock for stock in hold_stock if stock not in sell_stock] df_today = df.loc[i:i+3] if i + 3 >= len(df): break if all(df_today['four_days_increase']) and all(df_today['three_days_decrease'].iloc[1:]): available_capital = capital available_stock = max_stock_per_day available_value = max_stock_value for j, stock_row in df_today.iterrows(): if available_capital > 0 and available_stock > 0 and available_value > 0: buy_qty = min(int(available_capital / (stock_row['close'] * 1.002)), available_stock, int(available_value / (stock_row['close'] * 1.002))) if buy_qty > 0: hold_stock.append({'buy_day': i, 'buy_price': stock_row['close'], 'buy_qty': buy_qty}) available_capital -= stock_row['close'] * buy_qty * 1.002 available_stock -= 1 available_value -= stock_row['close'] * buy_qty * 1.002 print('Final capital:', capital)
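The errors may come from a missing library or data file. Make sure pandas is installed and that stock_data.csv is in the notebook's working directory. `%matplotlib inline` only matters if you plot results in Jupyter Notebook; this script itself draws no charts. A possible revision: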
```
import pandas as pd
# Jupyter magic; only needed if you add plots later.
%matplotlib inline

df = pd.read_csv('stock_data.csv')

# raw=True hands each window to the lambda as a NumPy array, so x[i] is
# positional; without it x is a Series and x[i] is looked up by index label.
df['four_days_increase'] = df['close'].rolling(window=4).apply(lambda x: all(x[i] < x[i+1] for i in range(3)), raw=True) * 1
df['three_days_decrease'] = df['close'].rolling(window=3).apply(lambda x: all(x[i] > x[i+1] for i in range(2)), raw=True) * 1

capital = 1000000
max_stock_per_day = 10
max_stock_value = 100000
start_date = '2020-01-01'
end_date = '2023-01-01'

# Keep only rows inside the date range and renumber them from 0.
df = df[(df['date'] >= start_date) & (df['date'] < end_date)]
df = df.reset_index(drop=True)

hold_stock = []
for i, row in df.iterrows():
    # Sell every position that has been held for at least 3 rows.
    if len(hold_stock) > 0:
        sell_stock = []
        for stock in hold_stock:
            if i - stock['buy_day'] >= 3:
                capital += stock['buy_price'] * stock['buy_qty'] * (1 - 0.002)
                sell_stock.append(stock)
        hold_stock = [stock for stock in hold_stock if stock not in sell_stock]

    # .loc slicing is inclusive, so this is a 4-row window starting at row i.
    df_today = df.loc[i:i+3]
    if i + 3 >= len(df):
        break

    if all(df_today['four_days_increase']) and all(df_today['three_days_decrease'].iloc[1:]):
        available_capital = capital
        available_stock = max_stock_per_day
        available_value = max_stock_value
        for j, stock_row in df_today.iterrows():
            if available_capital > 0 and available_stock > 0 and available_value > 0:
                buy_qty = min(int(available_capital / (stock_row['close'] * 1.002)), available_stock, int(available_value / (stock_row['close'] * 1.002)))
                if buy_qty > 0:
                    hold_stock.append({'buy_day': i, 'buy_price': stock_row['close'], 'buy_qty': buy_qty})
                    available_capital -= stock_row['close'] * buy_qty * 1.002
                    available_stock -= 1
                    available_value -= stock_row['close'] * buy_qty * 1.002

print('Final capital:', capital)
```
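One likely source of errors in the rolling-window lines is how `rolling(...).apply(...)` hands data to the function. As far as I know, with the default `raw=False` each window arrives as a Series, so `x[i]` is looked up by index label and can fail on windows that do not start at label 0. The sketch below, on a made-up price series, uses `raw=True` so each window is a NumPy array; the helper names `strictly_increasing` and `strictly_decreasing` are hypothetical, not part of the original code.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for illustration.
close = pd.Series([10.0, 11.0, 12.0, 13.0, 12.5, 12.0, 11.0])


def strictly_increasing(window):
    """Return 1.0 if every value in the window exceeds the previous one."""
    return float(np.all(np.diff(window) > 0))


def strictly_decreasing(window):
    """Return 1.0 if every value in the window is below the previous one."""
    return float(np.all(np.diff(window) < 0))


# raw=True passes each window as a NumPy array (positional access only).
four_days_increase = close.rolling(window=4).apply(strictly_increasing, raw=True)
three_days_decrease = close.rolling(window=3).apply(strictly_decreasing, raw=True)

print(pd.DataFrame({
    "close": close,
    "four_days_increase": four_days_increase,
    "three_days_decrease": three_days_decrease,
}))
```

The first `window - 1` rows of each flag are NaN because the window is not yet full. NaN is truthy in Python, so a check such as `all(...)` over these flags treats those rows as passing unless they are filled or dropped first.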