Convert to SQL: `fun_window = Window.partitionBy(primary_keys_list).orderBy([F.col(field).desc() for field in self.config.deduplication_order_columns.split(',')])`
Posted: 2024-03-28 16:42:11
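This code defines a window that partitions the rows by the given list of primary-key columns and, within each partition, sorts them in descending order by the deduplication ordering columns. A window specification produces no output on its own; combined with `ROW_NUMBER()`, the corresponding SQL is: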
```
SELECT *, ROW_NUMBER() OVER (PARTITION BY primary_key1, primary_key2, ... ORDER BY field1 DESC, field2 DESC, ...) as row_num
FROM table_name
```
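Here `primary_key1, primary_key2, ...` are the primary-key columns, `field1, field2, ...` are the deduplication ordering columns, and `table_name` is the table name. `ROW_NUMBER()` numbers the rows within each partition starting from 1, `PARTITION BY` names the partitioning columns, and `ORDER BY` names the sort columns.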
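The placeholders in the query above can be filled in from the same inputs the PySpark code uses. The following is a minimal sketch, not part of the original answer: `build_row_number_sql` is a hypothetical helper, and the table and column names in the usage lines are made up for illustration. It assumes the column names are trusted identifiers, since they are interpolated into the SQL text without quoting.

```python
def build_row_number_sql(table_name, primary_keys, order_columns):
    """Render the ROW_NUMBER() query for the given key and ordering columns.

    Hypothetical helper: primary_keys is a list of column names and
    order_columns is a comma-separated string, like
    deduplication_order_columns in the PySpark snippet.
    """
    # Split on commas as the PySpark code does; trimming whitespace is an
    # extra step added here, not something the original code performs.
    order_fields = [f.strip() for f in order_columns.split(",") if f.strip()]
    partition_clause = ", ".join(primary_keys)
    # Every ordering column is sorted in descending order, matching .desc().
    order_clause = ", ".join(f"{field} DESC" for field in order_fields)
    return (
        f"SELECT *, ROW_NUMBER() OVER (PARTITION BY {partition_clause} "
        f"ORDER BY {order_clause}) AS row_num\n"
        f"FROM {table_name}"
    )


# Illustrative names only; substitute the real table and columns.
sql = build_row_number_sql("orders", ["customer_id", "order_id"], "updated_at,version")
print(sql)
```

Building the text this way keeps the SQL in step with the configuration, but it should only be used with column names that come from a trusted source.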
Related question
Convert the following to a SQL statement:
`fun_window = Window.partitionBy(primary_keys_list).orderBy([F.col(field).desc() for field in self.config.deduplication_order_columns.split(',')])`
`dataframe_ordered = df.withColumn("row_num_order", F.row_number().over(fun_window))`
`source_data_df_process = dataframe_ordered.filter(F.col("row_num_order") == 1)`
`df = source_data_df_process.drop("row_num_order")`
    SELECT *
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY primary_key1, primary_key2, ..., primary_keyn
                                  ORDER BY field1 DESC, field2 DESC, ..., fieldn DESC) AS row_num_order
        FROM table_name
    ) t
    WHERE t.row_num_order = 1;
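To see the deduplication query at work, here is a small sketch using Python's built-in `sqlite3` module rather than Spark. The `events` table, its columns, and the sample rows are invented for illustration, and the example assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. Note that the outer `SELECT *` in the query above still returns the `row_num_order` column, whereas the PySpark code drops it; the sketch lists the columns explicitly to mirror that `drop`.

```python
import sqlite3

# In-memory database with a made-up table: id is the key column and
# updated_at is the deduplication ordering column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, updated_at TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "old"),
        (1, "2024-03-01", "new"),
        (2, "2024-02-01", "only"),
    ],
)

# Keep the first row of each partition, i.e. the latest updated_at per id.
# Selecting explicit columns leaves row_num_order out of the result,
# like .drop("row_num_order") in the PySpark version.
dedup_sql = """
SELECT id, updated_at, payload
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS row_num_order
    FROM events
) t
WHERE t.row_num_order = 1
ORDER BY id
"""
rows = conn.execute(dedup_sql).fetchall()
print(rows)  # [(1, '2024-03-01', 'new'), (2, '2024-02-01', 'only')]
```

If two rows in a partition tie on every ordering column, which of them receives row number 1 is not guaranteed, so adding a unique tiebreaker column to `ORDER BY` may be worth considering.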