hive.skewjoin.mapjoin.map.tasks
Date: 2024-02-09 22:46:50
`hive.skewjoin.mapjoin.map.tasks` is a configuration property in Apache Hive. It sets the number of map tasks used in the follow-up map join job that Hive runs to handle the skewed keys of a skew join.
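A skew join is a join in which one or more keys account for a disproportionate share of the rows in the input data. In an ordinary shuffle join, all rows for a given key go to the same reducer, so the reduce tasks that receive the skewed keys take much longer than the others. When runtime skew join handling is enabled (`hive.optimize.skewjoin=true`), Hive sets aside the rows whose key exceeds the `hive.skewjoin.key` threshold and joins them in a separate follow-up map join job. A map join loads the smaller side into memory in each map task and performs the join on the map side, so no reduce phase is needed for those keys.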
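The `hive.skewjoin.mapjoin.map.tasks` property determines how many map tasks are used in that follow-up job. It is meant to be used together with `hive.skewjoin.mapjoin.min.split`, which sets the minimum split size, for finer-grained control. Increasing the number of tasks can improve performance by spreading the skewed rows across more tasks, but every additional task adds startup and scheduling overhead and loads its own in-memory copy of the smaller side. The default value is 10000.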
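As a rough illustration of how these settings fit together, the session-level configuration below enables runtime skew join handling and sets the related properties explicitly. The values shown are the documented defaults (except `hive.optimize.skewjoin`, which is off by default), not tuning recommendations; suitable values depend on data volume and cluster size.

```sql
-- Enable runtime skew join handling (disabled by default).
SET hive.optimize.skewjoin=true;

-- A join key is treated as skewed once it has more rows than this threshold.
SET hive.skewjoin.key=100000;

-- Number of map tasks for the follow-up map join job that processes skewed keys.
SET hive.skewjoin.mapjoin.map.tasks=10000;

-- Minimum split size in bytes for that job (33554432 bytes = 32 MB).
SET hive.skewjoin.mapjoin.min.split=33554432;
```

These statements only affect the current session. To check the value currently in effect, run `SET hive.skewjoin.mapjoin.map.tasks;` without an assignment.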