9
Chapter 3. InnoDB Data Compression
3.1. Background
Over the years, processors and cache memories have become much faster, but mass storage based on rotating magnetic
disks has not kept pace. While the storage capacity of disks has grown by about a factor of 1,000 in the past decade, random
seek times and data transfer rates are still severely limited by mechanical constraints. Therefore, many workloads are i/o-
bound. The idea of data compression is to pay a small cost in increased CPU utilization for the benefit of smaller databases
and reduced i/o to improve throughput, potentially significantly.
The ability to compress user data is an important new capability of the InnoDB Plugin. Compressed tables reduce the size
of the database on disk, resulting in fewer reads and writes needed to access the user data. For many InnoDB workloads
and many typical user tables (especially with read-intensive applications where sufficient memory is available to keep
frequently-used data in memory), compression not only significantly reduces the storage required for the database, but also
improves throughput by reducing the i/o workload, at a modest cost in processing overhead. The storage cost savings can
be important, but the reduction in i/o costs can be even more valuable.
3.2. Specifying Compression
The usual size of InnoDB data pages is 16K. Beginning with the InnoDB Plugin, you can use the attributes
ROW_FORMAT=COMPRESSED or KEY_BLOCK_SIZE in the CREATE TABLE and ALTER TABLE commands to re-
quest InnoDB to compress each page to 1K, 2K, 4K, 8K, or 16K bytes.
(The term KEY_BLOCK_SIZE may be confusing. It does not refer to a “key” at all, but simply specifies the size of
compressed pages that will be used for the table. Likewise, in the InnoDB Plugin, compression is applicable to tables, not to
individual rows, so the option ROW_FORMAT really should be TABLE_FORMAT. Unfortunately, MySQL does not permit
a storage engine to add syntax to SQL statements, so the InnoDB Plugin simply re-uses the clauses originally defined
for MyISAM).
To create a compressed table, you might use a command like this:
CREATE TABLE name (column1 INT PRIMARY KEY) ENGINE=InnoDB
ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4;
If KEY_BLOCK_SIZE not specified, the default compressed page size of 8K will be used. If KEY_BLOCK_SIZE is
specified, the attribute ROW_FORMAT=COMPRESSED may be omitted.
Setting KEY_BLOCK_SIZE=16 most often will not result in much compression, since the normal InnoDB page size is
16K. However, this setting may be useful for tables with many long BLOB, VARCHAR or TEXT columns, because such data
often do compress well, and might therefore require fewer “overflow” pages as described in Section 3.3.2.2, “ Compressing
BLOB, VARCHAR and TEXT columns ”.
Note that compression is specified on a table-by-table basis. All indexes of a table (including the clustered index) will be
compressed using the same page size, as specified on the CREATE TABLE or ALTER TABLE command. Table attributes
such as ROW_FORMAT and KEY_BLOCK_SIZE are not part of the CREATE INDEX syntax, and are ignored if they are
specified (although you will see them in the output of the SHOW CREATE TABLE command).
3.2.1. Configuration Parameters for Compression
Compressed tables are stored in a format previous versions of InnoDB cannot process. Therefore, to preserve downward
compatibility of database files, compression can be specified only when the “Barracuda” data file format is enabled using
the configuration parameter innodb_file_format.
Furthermore, table compression is not available for the InnoDB system tablespace. The system tablespace (space 0, the
ibdata* files) may contain user data, but it also contains internal InnoDB system information, and therefore is never
compressed. Thus, compression applies only to tables (and indexes) stored in their own tablespaces.