hive io.compression.codecs
Posted: 2024-05-09 14:20:33
`io.compression.codecs` is a Hadoop configuration property, inherited by Hive, that lists the compression codec classes available for compressing and decompressing data. The codecs that ship with Hadoop and are commonly listed are:
- `org.apache.hadoop.io.compress.DefaultCodec`: Hadoop's default codec, which uses the DEFLATE algorithm (`.deflate` files).
- `org.apache.hadoop.io.compress.GzipCodec`: gzip compression (`.gz` files). Gzip-compressed text files are not splittable.
- `org.apache.hadoop.io.compress.BZip2Codec`: bzip2 compression (`.bz2` files). Slower, but bzip2-compressed text files are splittable.
- `org.apache.hadoop.io.compress.SnappyCodec`: Snappy compression (`.snappy` files), which favors speed over compression ratio.
- `org.apache.hadoop.io.compress.Lz4Codec`: LZ4 compression (`.lz4` files), which also favors speed over compression ratio.
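To register the codecs, set the `io.compression.codecs` property to a comma-separated list of codec class names. The property normally lives in Hadoop's `core-site.xml`, but it can also be set in the `hive-site.xml` configuration file so that it applies to Hive: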
```
<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec,org.apache.hadoop.io.compress.Lz4Codec</value>
</property>
```
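As a rough sketch of how these codecs are then used from a Hive session, the statements below print the value the session sees and switch on compressed query output. The choice of `GzipCodec` is an arbitrary assumption for illustration, and the exact effect of these settings can vary with the Hive version and execution engine.

```sql
-- Print the codec list visible to the current session.
SET io.compression.codecs;

-- Compress the final output of queries (for example INSERT OVERWRITE into a text table).
SET hive.exec.compress.output=true;

-- Codec class used for that output (gzip is assumed here purely as an example).
SET mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec;
```

When compressed text files are read back, Hadoop picks the codec from the file extension (such as `.gz`), which is where the registered codec list matters.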
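You can also choose the compression codec for a specific table in the `CREATE TABLE` statement. The `STORED AS` clause selects the file format, and a format-specific entry in `TBLPROPERTIES` (here `parquet.compression`) selects the codec: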
```
CREATE TABLE my_table (
...
)
PARTITIONED BY (dt string)
STORED AS PARQUET
TBLPROPERTIES ("parquet.compression"="snappy");
```
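In this example, the data for `my_table` is stored in the Parquet format and compressed with Snappy. Parquet applies the codec itself, inside its own files, so this setting does not depend on the `io.compression.codecs` list.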
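The same table-level pattern applies to other columnar formats. As a minimal sketch, assuming an ORC table instead of Parquet, the codec is chosen with the `orc.compress` table property. The table name `my_orc_table` and its columns are hypothetical placeholders, not part of the original example.

```sql
-- Hypothetical table; the name and columns are placeholders for illustration.
CREATE TABLE my_orc_table (
  id BIGINT,
  payload STRING
)
PARTITIONED BY (dt STRING)
STORED AS ORC
-- ORC compresses data inside its own files; ZLIB is the usual default.
TBLPROPERTIES ("orc.compress"="SNAPPY");
```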