df['eff'] = df[[col_name for col_name in df.columns if '_tb' in col_name]].sum(axis=1) explained
Time: 2023-12-05 20:05:59 Views: 33
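This code uses the pandas library to operate on a DataFrame. Here is what each part does:
`df['eff']`: the assignment target. It creates a new column named 'eff' in the DataFrame `df`, or overwrites that column if it already exists.
`df.columns`: an attribute of the DataFrame that returns its column labels as an `Index` object. Iterating over it yields each column name in turn.
`[col_name for col_name in df.columns if '_tb' in col_name]`: a list comprehension that keeps every column name containing the substring '_tb' and collects those names in a new list. The `in` test assumes the column names are strings.
`df[[col_name for col_name in df.columns if '_tb' in col_name]]`: indexes `df` with that list of names, which selects only the columns whose names contain '_tb' and returns them as a new DataFrame.
`.sum(axis=1)`: sums the selected columns. `axis=1` means the sum runs across the columns of each row, so the result is a Series holding one total per row. By default, missing values (NaN) are skipped.
Finally, that Series is assigned to the new column 'eff', so each row's 'eff' value is the sum of that row's selected columns.
In short, for every row this code adds up the values of all columns whose names contain '_tb' and stores the row total in the 'eff' column.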
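The explanation above can be tried on a small table. The following is a minimal sketch, assuming pandas is installed; the column names `a_tb`, `b_tb`, and `other` and all values are made up for illustration. It also shows `DataFrame.filter(like=...)`, which selects the same columns without a list comprehension.

```python
import pandas as pd

# Hypothetical sample data: two "_tb" columns and one unrelated column.
df = pd.DataFrame({
    "a_tb": [1, 2, 3],
    "b_tb": [10.0, None, 30.0],   # None becomes NaN in a float column
    "other": [100, 200, 300],
})

# Step 1: collect the column names that contain "_tb".
tb_cols = [col_name for col_name in df.columns if "_tb" in col_name]

# Step 2: select those columns and sum across each row (NaN is skipped).
df["eff"] = df[tb_cols].sum(axis=1)

# Alternative selection: filter(like=...) keeps labels containing the substring.
alt = df.filter(like="_tb").sum(axis=1)

print(tb_cols)
print(df)
```

Note that the `other` column does not contribute to `eff`, and the row with a missing `b_tb` value still gets a total from `a_tb` alone.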
Related questions
select * from ( select row_.*, rownum rownum_ from ( select * from ( select distinct OB.BUSI_ORDER_ID, 0 as HIS_ID, OB.BUSI_CODE, OB.CUST_ID, OB.CEASE_REASON, OB.ORDER_STATE, OB.CHANNEL_TYPE, ob.user_id, OB.IS_BATCH_ORDER, OB.APPLICATION_ID, OB.CREATE_DATE, OB.DONE_DATE, OB.EFF_DATE, OB.EXP_DATE, OB.OPER_ID, OB.ORG_ID, OB.REGION_ID, OB.NOTE, OB.PROCESS_STATE, nvl(oi.cust_name, ic.cust_name) cust_name, nvl(oc.icc_id, iu.icc_id) icc_id, nvl(oc.svc_num, iu.svc_num) svc_num, icp.cust_name parent_cust_name, icp.cust_id parent_cust_id, ol.order_list_id from ord_busi ob left join ord_offer oo on oo.busi_order_id = ob.busi_order_id and ob.user_id = oo.user_id left join info_user iu on oo.user_id = iu.user_id left join info_cust ic on ob.cust_id = ic.cust_id left join ord_cust oi on ob.cust_id = oi.cust_id and ob.busi_order_id = oi.busi_order_id left join info_cust icp on nvl(ic.parent_cust_id, oi.parent_cust_id) = icp.cust_id left join ( SELECT * FROM ord_user WHERE user_order_id IN ( SELECT MAX(user_order_id) user_order_id FROM ord_user GROUP BY busi_order_id,user_id ) ) oc on ob.user_id = oc.user_id and ob.busi_order_id = oc.busi_order_id left join ord_list ol on ob.busi_order_id = ol.busi_order_id WHERE 1 = 1 and OB.CUST_ID IN( SELECT DISTINCT CUST_ID FROM (SELECT CUST_ID, PARENT_CUST_ID FROM INFO_CUST UNION SELECT CUST_ID, PARENT_CUST_ID FROM ORD_CUST) TMP START WITH TMP.CUST_ID = '10001009208' CONNECT BY TMP.PARENT_CUST_ID = PRIOR TMP.CUST_ID ) union all select distinct OB.BUSI_ORDER_ID,OB.HIS_ID, OB.BUSI_CODE, OB.CUST_ID, OB.CEASE_REASON, OB.ORDER_STATE, OB.CHANNEL_TYPE, OB.USER_ID, OB.IS_BATCH_ORDER, OB.APPLICATION_ID, OB.CREATE_DATE, OB.DONE_DATE, OB.EFF_DATE, OB.EXP_DATE, OB.OPER_ID, OB.ORG_ID, OB.REGION_ID, OB.NOTE, OB.PROCESS_STATE, nvl(oi.cust_name,ic.cust_name) cust_name , nvl(oc.icc_id, iu.icc_id) icc_id, nvl(oc.svc_num,iu.svc_num) svc_num, icp.cust_name parent_cust_name, icp.cust_id parent_cust_id, ol.order_list_id from ord_busi_his ob left join 
ord_offer_his oo on oo.busi_order_id = ob.busi_order_id Please optimize this.
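Some optimizations to consider:
1. Use INNER JOIN instead of LEFT JOIN where it is safe: an inner join gives the optimizer more freedom in choosing the join order, but it returns the same rows only if every row on the left side is guaranteed to have a match. Otherwise it silently drops the unmatched rows.
2. Consider EXISTS instead of IN: EXISTS can stop at the first matching row instead of collecting every match. Many optimizers rewrite both forms into the same plan, so compare execution plans rather than assuming a speedup.
3. Simplify the subqueries: the nested `select * from ( select * from ( ... ) )` layers and the `ord_user` subquery that filters on `MAX(user_order_id)` are candidates for simplification.
4. Use indexes: index the columns that appear in join and filter conditions, such as `busi_order_id`, `user_id`, and `cust_id` in this query.
5. Avoid SELECT *: list only the columns you need, which reduces the amount of data read and returned.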
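Points 1 and 2 change how a query is written, so it is worth checking that the result stays the same. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Oracle; the `orders` and `customers` tables and their rows are hypothetical and much simpler than the query above. It shows that INNER JOIN drops unmatched rows that LEFT JOIN keeps, and that IN and EXISTS return the same rows here.

```python
import sqlite3

# In-memory database with two tiny, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, cust_id INTEGER);
    CREATE TABLE customers (cust_id INTEGER, cust_name TEXT);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO customers VALUES (10, 'A'), (20, 'B');
""")

# LEFT JOIN keeps order 3 even though customer 30 does not exist.
left_rows = conn.execute("""
    SELECT o.order_id, c.cust_name
    FROM orders o LEFT JOIN customers c ON c.cust_id = o.cust_id
    ORDER BY o.order_id
""").fetchall()

# INNER JOIN drops order 3, so the swap is only safe when every row matches.
inner_rows = conn.execute("""
    SELECT o.order_id, c.cust_name
    FROM orders o INNER JOIN customers c ON c.cust_id = o.cust_id
    ORDER BY o.order_id
""").fetchall()

# IN and EXISTS express the same filter in two ways.
in_rows = conn.execute("""
    SELECT order_id FROM orders
    WHERE cust_id IN (SELECT cust_id FROM customers)
    ORDER BY order_id
""").fetchall()
exists_rows = conn.execute("""
    SELECT order_id FROM orders o
    WHERE EXISTS (SELECT 1 FROM customers c WHERE c.cust_id = o.cust_id)
    ORDER BY order_id
""").fetchall()

print(left_rows, inner_rows, in_rows, exists_rows)
conn.close()
```

Whether either rewrite is actually faster depends on the database and its optimizer, so timings from SQLite say nothing about Oracle; the sketch only checks equivalence of results.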
SELECT PIS.SHOW_FLT_DETAIL AS SHOW_FLT_DETAIL -- new , PIS.SHOW_AWB_DETAIL AS SHOW_AWB_DETAIL -- new , PIS.DISPLAY_AIRLINE_CODE AS CARRIER_CODE , DECODE(PIS.REVERT_FLOW,'N',PIS.FLOW_TYPE,DECODE(PIS.FLOW_TYPE,'I','E','I')) AS FLOW_TYPE , PIS.SHIP_TO_LOCATION AS SHIP_TO_LOCATION , PIS.INVOICE_SEQUENCE AS INVOICE_SEQUENCE , PFT.FLIGHT_DATE AS FLIGHT_DATE , PFT.FLIGHT_CARRIER_CODE AS FLIGHT_CARRIER_CODE , PFT.FLIGHT_SERIAL_NUMBER AS FLIGHT_SERIAL_NUMBER , PFT.FLOW_TYPE AS AIRCRAFT_FLOW , FAST.AIRCRAFT_SERVICE_TYPE AS AIRCRAFT_SERVICE_TYPE , PPT.AWB_NUMBER AS AWB_NUMBER , PPT.WEIGHT AS WEIGHT , PPT.CARGO_HANDLING_OPERATOR AS CARGO_HANDLING_OPERATOR , PPT.SHIPMENT_PACKING_TYPE AS SHIPMENT_PACKING_TYPE , PPT.SHIPMENT_FLOW_TYPE AS SHIPMENT_FLOW_TYPE , PPT.SHIPMENT_BUILD_TYPE AS SHIPMENT_BUILD_TYPE , PPT.SHIPMENT_CARGO_TYPE AS SHIPMENT_CARGO_TYPE , PPT.REVENUE_TYPE AS REVENUE_TYPE , PFT.JV_FLIGHT_CARRIER_CODE AS JV_FLIGHT_CARRIER_CODE , PPT.PORT_TONNAGE_UID AS PORT_TONNAGE_UID , PPT.AWB_UID AS AWB_UID , PIS.INVOICE_SEPARATION_UID AS INVOICE_SEPARATION_UID , PFT.FLIGHT_TONNAGE_UID AS FLIGHT_TONNAGE_UID FROM PN_FLT_TONNAGES PFT , FZ_AIRLINES FA , PN_TONNAGE_FLT_PORTS PTFP , PN_PORT_TONNAGES PPT , FF_AIRCRAFT_SERVICE_TYPES FAST , SR_PN_INVOICE_SEPARATIONS PIS --new , SR_PN_INVOICE_SEP_DETAILS PISD--new , SR_PN_INV_SEP_PORT_TONNAGES PISPT --new WHERE PFT.FLIGHT_OPERATION_DATE >= trunc( CASE :rundate WHEN TO_DATE('01/01/1900', 'DD/MM/YYYY') THEN ADD_MONTHS(SYSDATE,-1) ELSE ADD_MONTHS(:rundate,-1) END, 'MON') AND PFT.FLIGHT_OPERATION_DATE < trunc( CASE :rundate WHEN TO_DATE('01/01/1900', 'DD/MM/YYYY') THEN TRUNC(SYSDATE) ELSE TRUNC(:rundate) END, 'MON') AND PFT.TYPE IN ('C', 'F') AND PFT.RECORD_TYPE = 'M' AND (PFT.TERMINAL_OPERATOR NOT IN ('X', 'A') OR (PFT.TERMINAL_OPERATOR <> 'X' AND FA.CARRIER_CODE IN (SELECT * FROM SPECIAL_HANDLING_AIRLINE) AND PPT.REVENUE_TYPE IN (SELECT * FROM SPECIAL_REVENUE_TYPE) AND PPT.SHIPMENT_FLOW_TYPE IN (SELECT * FROM SPECIAL_SHIPMENT_FLOW_TYPE) AND 
PFT.FLIGHT_OPERATION_DATE >= (select EFF_DATE from SPECIAL_HANDLING_EFF_DATE) )) AND PFT.DELETING_DATETIME IS NULL AND FA.AIRLINE_UID = PFT.AIRLINE_UID AND FA.DELETING_DATETIME IS NULL AND PTFP.FLIGHT_TONNAGE_UID = PFT.FLIGHT_TONNAGE_UID AND PTFP.RECORD_TYPE = 'M' AND PTFP.DELETING_DATETIME IS NULL AND PPT.TONNAGE_FLIGHT_PORT_UID (+)= PTFP.TONNAGE_FLIGHT_PORT_UID AND PPT.RECORD_TYPE (+)= 'M' AND PPT.DISCREPANCY_TYPE (+)= 'NONE' AND PPT.ADJUSTMENT_INC_FLAG (+)= 'Y' AND PPT.DELETING_DATETIME (+) IS NULL AND FAST.AIRCRAFT_SERVICE_TYPE_UID = PFT.AIRCRAFT_SERVICE_TYPE_UID AND FAST.DELETING_DATETIME IS NULL AND PIS.TEMPORAL_NAME = TO_CHAR((CASE :rundate --new WHEN TO_DATE('01/01/1900', 'DD/MM/YYYY') THEN TRUNC(SYSDATE) ELSE TRUNC(:rundate) END ), 'YYYYMM') || '00' AND PIS.INVOICE_SEPARATION_UID = PISD.INVOICE_SEPARATION_UID --new AND PISD.INVOICE_SEP_DETAIL_UID = PISPT.INVOICE_SEP_DETAIL_UID --new AND PISPT.PORT_TONNAGE_UID = PPT.PORT_TONNAGE_UID --new AND PIS.PRINT_SUPPORTING_DOC = 'Y'; The above is the Oracle version; please convert it to Spark SQL.
Here is the query converted to Spark SQL:
```
SELECT
    PIS.SHOW_FLT_DETAIL AS SHOW_FLT_DETAIL,
    PIS.SHOW_AWB_DETAIL AS SHOW_AWB_DETAIL,
    PIS.DISPLAY_AIRLINE_CODE AS CARRIER_CODE,
    -- Oracle DECODE rewritten as a nested CASE
    CASE
        WHEN PIS.REVERT_FLOW = 'N' THEN PIS.FLOW_TYPE
        ELSE CASE
            WHEN PIS.FLOW_TYPE = 'I' THEN 'E'
            ELSE 'I'
        END
    END AS FLOW_TYPE,
    PIS.SHIP_TO_LOCATION AS SHIP_TO_LOCATION,
    PIS.INVOICE_SEQUENCE AS INVOICE_SEQUENCE,
    PFT.FLIGHT_DATE AS FLIGHT_DATE,
    PFT.FLIGHT_CARRIER_CODE AS FLIGHT_CARRIER_CODE,
    PFT.FLIGHT_SERIAL_NUMBER AS FLIGHT_SERIAL_NUMBER,
    PFT.FLOW_TYPE AS AIRCRAFT_FLOW,
    FAST.AIRCRAFT_SERVICE_TYPE AS AIRCRAFT_SERVICE_TYPE,
    PPT.AWB_NUMBER AS AWB_NUMBER,
    PPT.WEIGHT AS WEIGHT,
    PPT.CARGO_HANDLING_OPERATOR AS CARGO_HANDLING_OPERATOR,
    PPT.SHIPMENT_PACKING_TYPE AS SHIPMENT_PACKING_TYPE,
    PPT.SHIPMENT_FLOW_TYPE AS SHIPMENT_FLOW_TYPE,
    PPT.SHIPMENT_BUILD_TYPE AS SHIPMENT_BUILD_TYPE,
    PPT.SHIPMENT_CARGO_TYPE AS SHIPMENT_CARGO_TYPE,
    PPT.REVENUE_TYPE AS REVENUE_TYPE,
    PFT.JV_FLIGHT_CARRIER_CODE AS JV_FLIGHT_CARRIER_CODE,
    PPT.PORT_TONNAGE_UID AS PORT_TONNAGE_UID,
    PPT.AWB_UID AS AWB_UID,
    PIS.INVOICE_SEPARATION_UID AS INVOICE_SEPARATION_UID,
    PFT.FLIGHT_TONNAGE_UID AS FLIGHT_TONNAGE_UID
FROM
    PN_FLT_TONNAGES PFT
    JOIN FZ_AIRLINES FA ON FA.AIRLINE_UID = PFT.AIRLINE_UID AND FA.DELETING_DATETIME IS NULL
    JOIN PN_TONNAGE_FLT_PORTS PTFP ON PTFP.FLIGHT_TONNAGE_UID = PFT.FLIGHT_TONNAGE_UID AND PTFP.RECORD_TYPE = 'M' AND PTFP.DELETING_DATETIME IS NULL
    -- Oracle (+) outer join; the inner join to PISPT below still requires a PPT row, as in the original
    LEFT JOIN PN_PORT_TONNAGES PPT ON PPT.TONNAGE_FLIGHT_PORT_UID = PTFP.TONNAGE_FLIGHT_PORT_UID AND PPT.RECORD_TYPE = 'M' AND PPT.DISCREPANCY_TYPE = 'NONE' AND PPT.ADJUSTMENT_INC_FLAG = 'Y' AND PPT.DELETING_DATETIME IS NULL
    JOIN FF_AIRCRAFT_SERVICE_TYPES FAST ON FAST.AIRCRAFT_SERVICE_TYPE_UID = PFT.AIRCRAFT_SERVICE_TYPE_UID AND FAST.DELETING_DATETIME IS NULL
    -- Each table is joined only after the tables its ON clause refers to
    JOIN SR_PN_INV_SEP_PORT_TONNAGES PISPT ON PISPT.PORT_TONNAGE_UID = PPT.PORT_TONNAGE_UID
    JOIN SR_PN_INVOICE_SEP_DETAILS PISD ON PISD.INVOICE_SEP_DETAIL_UID = PISPT.INVOICE_SEP_DETAIL_UID
    JOIN SR_PN_INVOICE_SEPARATIONS PIS
        ON PIS.INVOICE_SEPARATION_UID = PISD.INVOICE_SEPARATION_UID
        -- TO_CHAR(..., 'YYYYMM') || '00', keeping the 01/01/1900 sentinel check
        AND PIS.TEMPORAL_NAME = CONCAT(date_format(CASE
                WHEN :rundate = to_date('01/01/1900', 'dd/MM/yyyy') THEN current_date()
                ELSE :rundate
            END, 'yyyyMM'), '00')
        AND PIS.PRINT_SUPPORTING_DOC = 'Y'
WHERE
    -- Window: from the first day of the previous month up to the first day of the run month
    PFT.FLIGHT_OPERATION_DATE >= trunc(CASE
            WHEN :rundate = to_date('01/01/1900', 'dd/MM/yyyy') THEN add_months(current_date(), -1)
            ELSE add_months(:rundate, -1)
        END, 'MON')
    AND PFT.FLIGHT_OPERATION_DATE < trunc(CASE
            WHEN :rundate = to_date('01/01/1900', 'dd/MM/yyyy') THEN current_date()
            ELSE :rundate
        END, 'MON')
    AND PFT.TYPE IN ('C', 'F')
    AND PFT.RECORD_TYPE = 'M'
    AND (
        PFT.TERMINAL_OPERATOR NOT IN ('X', 'A')
        OR (
            PFT.TERMINAL_OPERATOR <> 'X'
            AND FA.CARRIER_CODE IN (SELECT * FROM SPECIAL_HANDLING_AIRLINE)
            AND PPT.REVENUE_TYPE IN (SELECT * FROM SPECIAL_REVENUE_TYPE)
            AND PPT.SHIPMENT_FLOW_TYPE IN (SELECT * FROM SPECIAL_SHIPMENT_FLOW_TYPE)
            AND PFT.FLIGHT_OPERATION_DATE >= (SELECT EFF_DATE FROM SPECIAL_HANDLING_EFF_DATE)
        )
    )
    AND PFT.DELETING_DATETIME IS NULL;
```
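Note that several Oracle-specific constructs need Spark SQL equivalents. The comma-separated FROM list becomes explicit `JOIN ... ON` clauses, and the `(+)` outer-join markers become a `LEFT JOIN`. `DECODE` can be written as `CASE`. `SYSDATE` corresponds to `current_date()`, and the Oracle date patterns `'DD/MM/YYYY'` and `'YYYYMM'` become `'dd/MM/yyyy'` and `'yyyyMM'`. Spark's `trunc` always takes a format argument, so the single-argument `TRUNC(SYSDATE)` has no direct counterpart, and `TO_CHAR(...) || '00'` can be expressed with `date_format` and `CONCAT`. The `IN (SELECT * ...)` subqueries only work if each subquery returns exactly one column, and the scalar subquery on `SPECIAL_HANDLING_EFF_DATE` must return a single row. How the `:rundate` bind variable is supplied depends on your Spark version and client.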
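The date logic is the part of the conversion that is easiest to get wrong, so it can help to check the intended values outside the database first. The following is a minimal sketch in plain Python of what the two `FLIGHT_OPERATION_DATE` bounds and the `TEMPORAL_NAME` value should evaluate to. The names `SENTINEL`, `report_window`, and the `today` parameter (standing in for `SYSDATE`) are hypothetical and not part of the original query.

```python
from datetime import date

# The query treats 01/01/1900 as "no run date supplied, use today".
SENTINEL = date(1900, 1, 1)


def month_start(d: date) -> date:
    """Equivalent of truncating a date to the first day of its month."""
    return d.replace(day=1)


def previous_month_start(d: date) -> date:
    """First day of the month before d (add -1 month, then truncate)."""
    first = month_start(d)
    if first.month == 1:
        return first.replace(year=first.year - 1, month=12)
    return first.replace(month=first.month - 1)


def report_window(rundate: date, today: date):
    """Return (lower bound, exclusive upper bound, TEMPORAL_NAME)."""
    effective = today if rundate == SENTINEL else rundate
    start = previous_month_start(effective)   # FLIGHT_OPERATION_DATE >= start
    end = month_start(effective)              # FLIGHT_OPERATION_DATE < end
    temporal_name = f"{effective:%Y%m}00"     # 'YYYYMM' followed by '00'
    return start, end, temporal_name


# An explicit run date in January reaches back into the previous year.
print(report_window(date(2024, 1, 15), today=date(2030, 6, 1)))
# The sentinel falls back to "today".
print(report_window(SENTINEL, today=date(2024, 3, 31)))
```

Running the converted query for a few run dates and comparing its bounds against values like these is a quick way to confirm that the sentinel check and the month truncation survived the translation.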