school_df = school_df.dropna(thresh=len(school_df)*0.9, axis=1)

This line of code drops every column of 'school_df' in which more than 10% of the rows are missing. The parameter 'thresh' is the minimum number of non-null values a column must have to be retained (here, 90% of the row count), and 'axis=1' applies the check to columns rather than rows. It is a quick way to clean the dataset by removing columns that are too sparse to be useful.
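To make the threshold concrete, here is a minimal sketch on made-up data. The column names and values are hypothetical and only stand in for the real 'school_df'; the cast to int is an addition, made because 'thresh' is documented as an integer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 10 rows, columns with 0, 1 and 2 missing values.
school_df = pd.DataFrame({
    "name": [f"school_{i}" for i in range(10)],
    "enrollment": [100, 200, np.nan, 400, 500, 600, 700, 800, 900, 1000],
    "budget": [1.0, np.nan, np.nan, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
})

# A column is kept only if it has at least this many non-null values.
min_non_null = int(len(school_df) * 0.9)

cleaned = school_df.dropna(thresh=min_non_null, axis=1)

# Non-null counts per column, then the columns that survived.
print(school_df.notna().sum())
print(list(cleaned.columns))
```

With 10 rows the threshold is 9, so 'enrollment' (9 non-null values) is kept and 'budget' (8 non-null values) is dropped.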

param = {'num_leaves': 31, 'min_data_in_leaf': 20, 'objective': 'binary', 'learning_rate': 0.06, "boosting": "gbdt", "metric": 'None', "verbosity": -1}
trn_data = lgb.Dataset(trn, trn_label)
val_data = lgb.Dataset(val, val_label)
num_round = 666
# clf = lgb.train(param, trn_data, num_round, valid_sets=[trn_data, val_data], verbose_eval=100,
#                 early_stopping_rounds=300, feval=win_score_eval)
clf = lgb.train(param, trn_data, num_round)
# oof_lgb = clf.predict(val, num_iteration=clf.best_iteration)
test_lgb = clf.predict(test, num_iteration=clf.best_iteration)

thresh_hold = 0.5
oof_test_final = test_lgb >= thresh_hold
print(metrics.accuracy_score(test_label, oof_test_final))
print(metrics.confusion_matrix(test_label, oof_test_final))
tp = np.sum(((oof_test_final == 1) & (test_label == 1)))
pp = np.sum(oof_test_final == 1)
print('accuracy1:%.3f'% (tp/(pp)))

test_postive_idx = np.argwhere(oof_test_final == True).reshape(-1)
# test_postive_idx = list(range(len(oof_test_final)))
test_all_idx = np.argwhere(np.array(test_data_idx)).reshape(-1)
stock_info['trade_date_id'] = stock_info['trade_date'].map(date_map)
stock_info['trade_date_id'] = stock_info['trade_date_id'] + 1

tmp_col = ['ts_code', 'trade_date', 'trade_date_id', 'open', 'high', 'low', 'close', 'ma5', 'ma13', 'ma21', 'label_final', 'name']
stock_info.iloc[test_all_idx[test_postive_idx]]
tmp_df = stock_info[tmp_col].iloc[test_all_idx[test_postive_idx]].reset_index()
tmp_df['label_prob'] = test_lgb[test_postive_idx]
tmp_df['is_limit_up'] = tmp_df['close'] == tmp_df['high']
buy_df = tmp_df[(tmp_df['is_limit_up']==False)].reset_index()
buy_df.drop(['index', 'level_0'], axis=1, inplace=True)

buy_df['buy_flag'] = 1
stock_info_copy['sell_flag'] = 0

tmp_idx = (index_df['trade_date'] == test_date_min+1)
close1 = index_df[tmp_idx]['close'].values[0]
test_date_max = 20220829
tmp_idx = (index_df['trade_date'] == test_date_max)
close2 = index_df[tmp_idx]['close'].values[0]

tmp_idx = ((stock_info_copy['trade_date'] >= test_date_min)
           & (stock_info_copy['trade_date'] <= test_date_max))
tmp_df = stock_info_copy[tmp_idx].reset_index(drop=True)

from importlib import reload  # the imp module is deprecated and was removed in Python 3.12
import Account
reload(Account)
money_init = 200000
account = Account.Account(money_init, max_hold_period=20, stop_loss_rate=-0.07, stop_profit_rate=0.12)
account.BackTest(buy_df, tmp_df, index_df, buy_price='open')

tmp_df2 = buy_df[['ts_code', 'trade_date', 'label_prob', 'label_final']]
tmp_df2 = tmp_df2.rename(columns={'trade_date':'buy_date'})
tmp_df = account.info
tmp_df['buy_date'] = tmp_df['buy_date'].apply(lambda x: int(x))
tmp_df = tmp_df.merge(tmp_df2, on=['ts_code', 'buy_date'], how='left')

What is the final tmp_df? And which stocks does tmp_df[tmp_df['label_final']==1] select?

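Going through the code line by line: the final tmp_df is a DataFrame of backtest results. It starts as account.info, the trade records produced by account.BackTest (Account is a custom module whose code is not shown, so its exact columns are not known here), and is then left-merged with tmp_df2 on ts_code and buy_date. The merge attaches the label_prob and label_final columns from buy_df to each trade. buy_df itself holds the test-set rows whose predicted probability test_lgb is at least 0.5, excluding rows where close equals high, which the code treats as limit-up days. The earlier tmp_df, the slice of stock_info_copy between test_date_min and test_date_max, is only passed to BackTest as price data and is overwritten afterwards.

tmp_df[tmp_df['label_final']==1] therefore keeps the trades whose label_final is 1. Judging by its name, label_final is the actual label rather than the model's prediction (the prediction is label_prob), so these are the bought stocks that really were positive samples, i.e. the model's correct picks. Trades with no matching row in tmp_df2 get NaN in label_final and are filtered out as well.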
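The merge-then-filter step can be sketched on toy data. Everything below is hypothetical: 'trades' stands in for account.info, 'signals' stands in for tmp_df2, and the profit_rate column is invented for illustration, since the real Account module is not shown.

```python
import pandas as pd

# Hypothetical stand-in for account.info: one row per executed trade.
trades = pd.DataFrame({
    "ts_code": ["000001.SZ", "000002.SZ", "600000.SH"],
    "buy_date": [20220801, 20220802, 20220803],
    "profit_rate": [0.05, -0.03, 0.12],  # invented column
})

# Hypothetical stand-in for tmp_df2: model probability and true label per signal.
signals = pd.DataFrame({
    "ts_code": ["000001.SZ", "000002.SZ"],
    "buy_date": [20220801, 20220802],
    "label_prob": [0.81, 0.64],
    "label_final": [1, 0],
})

# Left merge keeps every trade; trades without a matching signal get NaN labels.
merged = trades.merge(signals, on=["ts_code", "buy_date"], how="left")

# Keep only the trades whose true label is 1; NaN rows compare False and drop out.
hits = merged[merged["label_final"] == 1]

print(merged)
print(hits)
```

Here the third trade has no matching signal, so its label_final is NaN and it is excluded by the filter along with the trade labelled 0.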

vm.swappiness=0
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.neigh.default.gc_thresh1=4096
net.ipv4.neigh.default.gc_thresh2=6144
net.ipv4.neigh.default.gc_thresh3=8192

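These are common Linux kernel parameters (sysctl settings) used to tune memory and networking behavior. Their meanings are:
- vm.swappiness=0: controls how readily the kernel moves memory pages to swap space. Setting it to 0 tells the kernel to avoid swapping as far as possible; it does not disable swap.
- net.ipv4.ip_forward=1: enables IPv4 packet forwarding, which lets the Linux host act as a router and forward IP packets between interfaces.
- net.bridge.bridge-nf-call-iptables=1: makes IPv4 traffic crossing a Linux bridge pass through iptables, so iptables rules can filter bridged packets. The key is only available when the br_netfilter module is loaded.
- net.bridge.bridge-nf-call-ip6tables=1: the same for IPv6, so ip6tables rules can filter bridged IPv6 packets.
- net.ipv4.neigh.default.gc_thresh1=4096: the minimum number of entries kept in the neighbor (ARP) table. The garbage collector does not purge entries while the table holds fewer than this.
- net.ipv4.neigh.default.gc_thresh2=6144: the soft limit. Above this number the garbage collector becomes more aggressive and purges entries older than 5 seconds.
- net.ipv4.neigh.default.gc_thresh3=8192: the hard limit, i.e. the maximum number of non-permanent entries allowed in the table.
The right values depend on the system's workload and network environment. Make sure you understand what each parameter does and its possible side effects before changing it.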
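To check whether a host already uses these values, one option is to compare the settings against the files under /proc/sys, where each sysctl key maps to a path with dots replaced by slashes. The sketch below is only an illustration: the helper names parse_sysctl, proc_path and report are made up here, and the script reads values without changing anything.

```python
from pathlib import Path

SYSCTL_TEXT = """
vm.swappiness=0
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.neigh.default.gc_thresh1=4096
net.ipv4.neigh.default.gc_thresh2=6144
net.ipv4.neigh.default.gc_thresh3=8192
"""


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def proc_path(key):
    """Map a sysctl key to its file under /proc/sys (dots become slashes)."""
    return Path("/proc/sys") / key.replace(".", "/")


def report(settings):
    """Print the wanted and current value for each key (read-only)."""
    for key, wanted in settings.items():
        try:
            current = proc_path(key).read_text().strip()
        except OSError:
            # Not on Linux, or the key is absent (e.g. br_netfilter not loaded).
            current = "unavailable"
        print(f"{key}: wanted={wanted} current={current}")


if __name__ == "__main__":
    report(parse_sysctl(SYSCTL_TEXT))
```

The parser tolerates spaces around the equals sign, which is why the ip6tables line parses the same way as the others.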
