连续交易五天的客户id 交易日期 交易金额 sql怎么实现
时间: 2023-10-10 09:11:38 浏览: 56
可以使用 Hive SQL 来实现查询连续交易五天的客户 ID、交易日期和交易金额。假设有一个名为 `transaction` 的交易表,包含了客户 ID、交易日期和交易金额等字段,可以按照以下步骤进行查询:
1. 对交易表按照客户 ID 和交易日期进行分组,计算每个客户每天的交易金额总和。
```
SELECT customer_id, transaction_date, SUM(amount) AS total_amount
FROM transaction
GROUP BY customer_id, transaction_date
```
2. 对第一步的结果按照客户 ID 进行分组,使用 Hive 内置函数 `LAG` 和 `LEAD` 计算每个客户每天与前后两天的交易差值。如果这三天的交易金额总和都不为零,则表示客户在这三天内进行了连续交易。
```
WITH daily_transaction AS (
SELECT customer_id, transaction_date, SUM(amount) AS total_amount
FROM transaction
GROUP BY customer_id, transaction_date
)
SELECT t1.customer_id, t1.transaction_date, t1.total_amount
FROM (
SELECT customer_id, transaction_date, total_amount,
LAG(total_amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS prev_amount,
LEAD(total_amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS next_amount
FROM daily_transaction
) t1
WHERE t1.total_amount + t1.prev_amount + t1.next_amount > 0
```
3. 对第二步的结果进行过滤,只保留连续交易五天的客户 ID、交易日期和交易金额。
```
WITH daily_transaction AS (
SELECT customer_id, transaction_date, SUM(amount) AS total_amount
FROM transaction
GROUP BY customer_id, transaction_date
), transaction_diff AS (
SELECT t1.customer_id, t1.transaction_date, t1.total_amount,
LAG(total_amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS prev_amount,
LEAD(total_amount, 1, 0) OVER (PARTITION BY customer_id ORDER BY transaction_date) AS next_amount
FROM daily_transaction t1
)
SELECT t1.customer_id, t1.transaction_date, t1.total_amount
FROM (
SELECT customer_id, transaction_date, total_amount,
SUM(CASE WHEN total_amount + prev_amount + next_amount > 0 THEN 1 ELSE 0 END)
OVER (PARTITION BY customer_id ORDER BY transaction_date ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS trans_count
FROM transaction_diff
) t1
WHERE t1.trans_count = 5
```
上述 SQL 语句中,第一步使用 `GROUP BY` 对交易表进行分组,计算每个客户每天的交易金额总和;第二步使用 `LAG` 和 `LEAD` 函数计算每个客户每天与前后两天的交易差值,然后过滤出连续交易三天的客户;第三步使用 `SUM` 和 `OVER` 函数计算每个客户连续交易的天数,然后过滤出连续交易五天的客户。