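2. Using SQL or Python, find the packages held by the same user whose validity periods overlap.

| user_id | package ID | package start date | package end date |
|---------|------------|--------------------|------------------|
| 001 | 1234 | 20220121 | 20220312 |
| 001 | 3423 | 20190423 | 20190516 |
| 001 | 5434 | 20220125 | 20220219 |
| 002 | 2134 | 20220125 | 20220219 |
| 002 | 1234 | 20220301 | 20220401 |
| 002 | 3423 | 20210712 | 20210815 |

Example of the expected result:

| user_id | package ID | package start date | package end date |
|---------|------------|--------------------|------------------|
| 001 | 1234 | 20220121 | 20220312 |
| 001 | 5434 | 20220125 | 20220219 |

Answer (using SQL or Python):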
Time: 2023-10-16 19:33:06 Views: 105
In SQL, this can be done with a self-join plus a condition that tests whether two time periods overlap. The statement is as follows:
```
-- Self-join the table so each package is compared with the same user's other packages.
-- Assumes start_time <= end_time on every row.
SELECT DISTINCT t1.user_id, t1.package_id, t1.start_time, t1.end_time
FROM package t1
INNER JOIN package t2
    ON t1.user_id = t2.user_id
   AND t1.package_id <> t2.package_id
   -- Two closed periods overlap when each one starts on or before the other ends.
   AND t1.start_time <= t2.end_time
   AND t1.end_time >= t2.start_time
ORDER BY t1.user_id, t1.start_time, t1.end_time;
```
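To check a self-join query of this kind against the sample data from the question, it can be run on an in-memory SQLite database. The following is only a sketch: the table name `package` and its column names follow the answer's query, while the helper name `find_overlapping_packages`, the `TEXT` column types, and the compact two-comparison overlap condition are assumptions made for illustration.

```python
import sqlite3

# Sample rows from the question: (user_id, package_id, start_time, end_time).
# Dates stay as YYYYMMDD strings, which sort in chronological order.
SAMPLE_ROWS = [
    ("001", "1234", "20220121", "20220312"),
    ("001", "3423", "20190423", "20190516"),
    ("001", "5434", "20220125", "20220219"),
    ("002", "2134", "20220125", "20220219"),
    ("002", "1234", "20220301", "20220401"),
    ("002", "3423", "20210712", "20210815"),
]

OVERLAP_QUERY = """
SELECT DISTINCT t1.user_id, t1.package_id, t1.start_time, t1.end_time
FROM package t1
INNER JOIN package t2
    ON t1.user_id = t2.user_id
   AND t1.package_id <> t2.package_id
   AND t1.start_time <= t2.end_time
   AND t1.end_time >= t2.start_time
ORDER BY t1.user_id, t1.start_time, t1.end_time
"""


def find_overlapping_packages(rows):
    """Load rows into an in-memory table (hypothetical schema) and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE package ("
            "user_id TEXT, package_id TEXT, start_time TEXT, end_time TEXT)"
        )
        conn.executemany("INSERT INTO package VALUES (?, ?, ?, ?)", rows)
        return conn.execute(OVERLAP_QUERY).fetchall()
    finally:
        conn.close()


overlaps = find_overlapping_packages(SAMPLE_ROWS)
for row in overlaps:
    print(row)
```

On the sample data this should return only the two rows for user 001 (packages 1234 and 5434), matching the expected result in the question.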
In Python, the data can be loaded and processed with the pandas library. The code is as follows:
```
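import pandas as pd

# Read the data; keep the IDs as strings so leading zeros (e.g. "001") survive.
data = pd.read_csv('data.csv', dtype={'user_id': str, 'package_id': str})

# Parse the YYYYMMDD columns as dates.
data['start_time'] = pd.to_datetime(data['start_time'].astype(str), format='%Y%m%d')
data['end_time'] = pd.to_datetime(data['end_time'].astype(str), format='%Y%m%d')

# Self-join on user_id, then keep pairs of different packages whose periods overlap.
pairs = pd.merge(data, data, on='user_id', suffixes=('_1', '_2'))
overlaps = pairs[
    (pairs['package_id_1'] != pairs['package_id_2'])
    & (pairs['start_time_1'] <= pairs['end_time_2'])
    & (pairs['end_time_1'] >= pairs['start_time_2'])
]

# Keep each overlapping package once, restore the original column names, and sort.
result = overlaps[['user_id', 'package_id_1', 'start_time_1', 'end_time_1']].drop_duplicates()
result.columns = ['user_id', 'package_id', 'start_time', 'end_time']
result = result.sort_values(by=['user_id', 'start_time', 'end_time'])
print(result)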
```
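The self-join compares every pair of a user's packages, so its cost grows quadratically with the number of packages per user. As an alternative to the approach in the answer, the same rows can be found by sorting each user's packages by start date and scanning once. The sketch below uses only the standard library; the function name `overlapping_packages` and the tuple layout are assumptions made for illustration. Unlike the join, it treats every row as a separate package, so two rows with the same package ID can also be reported as overlapping.

```python
from collections import defaultdict

# Sample rows from the question: (user_id, package_id, start_time, end_time).
# YYYYMMDD strings compare in chronological order, so no date parsing is needed.
SAMPLE_ROWS = [
    ("001", "1234", "20220121", "20220312"),
    ("001", "3423", "20190423", "20190516"),
    ("001", "5434", "20220125", "20220219"),
    ("002", "2134", "20220125", "20220219"),
    ("002", "1234", "20220301", "20220401"),
    ("002", "3423", "20210712", "20210815"),
]


def overlapping_packages(rows):
    """Return rows whose [start, end] period overlaps another row of the same user."""
    by_user = defaultdict(list)
    for row in rows:
        by_user[row[0]].append(row)

    flagged = []
    for user_id in sorted(by_user):
        # Sort by start date (then end date) so overlaps are found in one pass.
        packages = sorted(by_user[user_id], key=lambda r: (r[2], r[3]))
        max_end_before = None  # latest end date among the rows already visited
        for i, row in enumerate(packages):
            start, end = row[2], row[3]
            # Overlaps an earlier row if one of them is still active at `start`.
            overlaps_earlier = max_end_before is not None and start <= max_end_before
            # Overlaps a later row if the very next row starts on or before `end`.
            overlaps_later = i + 1 < len(packages) and packages[i + 1][2] <= end
            if overlaps_earlier or overlaps_later:
                flagged.append(row)
            if max_end_before is None or end > max_end_before:
                max_end_before = end
    return flagged


sweep_result = overlapping_packages(SAMPLE_ROWS)
for row in sweep_result:
    print(row)
```

Checking only the next row is enough for the "later" test because, after sorting, it has the earliest start date among all later rows.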