SELECT * FROM dw.ol_autotask_model_channels_predict_content WHERE dt BETWEEN '20230601' AND '20230630' AND predict in ('other','middle') AND (cust_id, dt) not in ( SELECT cust_id, dt FROM dw.ol_autotask_model_channels_predict_content WHERE dt BETWEEN '20230601' AND '20230630' AND predict = 'travel' ) AND cust_id is not null
时间: 2024-02-14 08:24:51 浏览: 30
您的第一个问题是关于SQL查询的。您想要在2023年6月日至2023年月30日期间,预为"other"或"middle"且不包含预测为"travel"的客户信息。同时,您希望排除_id为空的记录。
以下是您的SQL查询句:
```sql
SELECT *
FROM dw.ol_autask_model_channels_predict_content
WHERE dt BETWEEN '20230601' AND '20230630'
AND predict IN ('other', 'middle')
AND (cust_id, dt) NOT IN (
SELECT cust_id, dt
FROM dw.ol_autotask_model_channels_predict_content
WHERE dt BETWEEN '20230601' AND '20230630'
AND predict = 'travel'
)
AND cust_id IS NOT NULL;
```
请注意,这个查询涉及到两个表的自我关联,并使用了子查询来排除预测为"travel"的记录。同时,通过使用`NOT IN`和`IS NOT NULL`条件来排除cust_id为空的记录。
请确保替换表名、字段名和日期范围以适应您的实际数据库结构和数据。
相关问题
insert into table dw.temp_ol_autotask_model_channels_predict partition (dt BETWEEN '20230601' AND '20230630') select from from dw.ol_autotask_model_channels_predict_content a join (select cust_id, sum(case when predict in ('other', 'middle') then 0 else 1 end) predict_day, dt from dw.ol_autotask_model_channels_predict_content where dt = '20230601' group by cust_id, dt ) b on a.cust_id = b.cust_id where a.dt='20230601' and b.predict_day = 0
根据你提供的代码,你想将查询结果插入到 `dw.temp_ol_autotask_model_channels_predict` 表的 `dt` 分区中。以下是你提供的代码的修改版本:
```sql
INSERT INTO TABLE dw.temp_ol_autotask_model_channels_predict
PARTITION (dt)
SELECT *
FROM dw.ol_autotask_model_channels_predict_content a
JOIN (
SELECT cust_id, SUM(CASE WHEN predict IN ('other', 'middle') THEN 0 ELSE 1 END) AS predict_day, dt
FROM dw.ol_autotask_model_channels_predict_content
WHERE dt = '20230601'
GROUP BY cust_id, dt
) b ON a.cust_id = b.cust_id
WHERE a.dt = '20230601'
AND b.predict_day = 0
AND a.dt BETWEEN '20230601' AND '20230630';
```
在上述代码中,我们使用 `INSERT INTO` 语句将查询结果插入到 `dw.temp_ol_autotask_model_channels_predict` 表。通过 `PARTITION (dt)` 指定了分区字段为 `dt`。在 `SELECT` 语句中,我们选择了所有列使用 `SELECT *`。通过 `JOIN` 子句连接了两个表,并使用子查询获取了符合条件的数据。最后使用 `WHERE` 子句限制了日期范围为 `'20230601'` 到 `'20230630'`。
请根据你的实际情况修改表名、列名和日期范围,以适应你的需求。
SELECT a.*, b.predict_day FROM dw.ol_autotask_model_channels_predict_content a JOIN ( SELECT cust_id, dt, SUM(CASE WHEN predict IN ('other', 'middle') THEN 0 ELSE 1 END) AS predict_day FROM dw.ol_autotask_model_channels_predict_content WHERE dt BETWEEN '20230601' AND '20230630' GROUP BY cust_id, dt ) b ON a.cust_id = b.cust_id AND a.dt = b.dt WHERE b.predict_day = 0 and a.dt BETWEEN '20230601' AND '20230630' ;
你的查询看起来是正确的,但是在这种情况下,你可以进一步简化查询语句。以下是优化后的查询代码:
```sql
SELECT a.*, b.predict_day
FROM dw.ol_autotask_model_channels_predict_content a
JOIN (
SELECT cust_id, dt, COUNT(*) AS predict_day
FROM dw.ol_autotask_model_channels_predict_content
WHERE dt BETWEEN '20230601' AND '20230630' AND predict NOT IN ('other', 'middle')
GROUP BY cust_id, dt
HAVING COUNT(*) = 0
) b ON a.cust_id = b.cust_id AND a.dt = b.dt
WHERE a.dt BETWEEN '20230601' AND '20230630';
```
在上述查询中,我们使用了一个内部子查询来计算 `predict_day`,并通过 `HAVING COUNT(*) = 0` 条件筛选出 `predict_day` 为 0 的记录。同时,我们将 `predict NOT IN ('other', 'middle')` 条件移至子查询中,以进一步简化查询。
请注意,上述优化仅是基于你提供的查询进行的。根据实际数据量和索引情况,可能还有其他优化方法可供考虑,例如添加适当的索引或分区策略。