Can count over partition by deduplicate?
Time: 2024-04-25 15:21:21 Views: 163
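Answer: Yes, with a workaround. In Hive and Spark SQL, `count(distinct ...) over (partition by ...)` can fail with an error (the subject of the posts cited below), so the deduplicated count is computed another way: `collect_set` gathers the distinct values within each partition, and `size` returns how many there are. Replacing `count(distinct ...) over (partition by ...)` with `size(collect_set(...) over (partition by ...))` therefore gives a distinct count per group. This approach fits cases where the original detail rows must stay unchanged while a per-group statistic is attached to each row. \[1\]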
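The replacement described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the `orders` data, the column names `user_id` and `product`, and the alias `distinct_products` are all hypothetical, and the inline `VALUES` clause uses Spark SQL syntax (in Hive, a real table would normally be queried instead).

```sql
-- Hypothetical sample data; table and column names are for illustration only.
WITH orders AS (
  SELECT * FROM VALUES
    ('u1', 'apple'),
    ('u1', 'apple'),
    ('u1', 'pear'),
    ('u2', 'apple')
  AS t(user_id, product)
)
SELECT
  user_id,
  product,
  -- collect_set builds the set of distinct products within each user_id partition;
  -- size returns the number of elements in that set.
  size(collect_set(product) OVER (PARTITION BY user_id)) AS distinct_products
FROM orders;
```

Every input row is kept: the three `u1` rows each carry `distinct_products = 2`, and the single `u2` row carries `1`. Like `count(distinct ...)`, `collect_set` skips NULL values, so NULLs do not add to the count.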
#### References
- *1* *2* [(hive&spark) HiveSql & SparkSql: solution for the COUNT(DISTINCT ) OVER (PARTITION BY ) error](https://blog.csdn.net/qyj19920704/article/details/126372968)
- *3* [HiveSql & SparkSql: solution for the COUNT(DISTINCT ) OVER (PARTITION BY ) error](https://blog.csdn.net/qq_41018861/article/details/117330116)