Percona Toolkit Documentation, Release 3.2.0
DESCRIPTION
pt-archiver is the tool I use to archive tables as described in http://tinyurl.com/mysql-archiving. The goal is a
low-impact, forward-only job to nibble old data out of the table without impacting OLTP queries much. You can insert
the data into another table, which need not be on the same server. You can also write it to a file in a format suitable for
LOAD DATA INFILE. Or you can do neither, in which case it’s just an incremental DELETE.
pt-archiver is extensible via a plugin mechanism. You can inject your own code to add advanced archiving logic
that could be useful for archiving dependent data, applying complex business rules, or building a data warehouse
during the archiving process.
You need to choose values carefully for some options. The most important are --limit, --retries, and
--txn-size.
The strategy is to find the first row(s), then scan some index forward-only to find more rows efficiently. Each subse-
quent query should not scan the entire table; it should seek into the index, then scan until it finds more archivable rows.
Specifying the index with the ‘i’ part of the --source argument can be crucial for this; use --dry-run to examine
the generated queries and be sure to EXPLAIN them to see if they are efficient (most of the time you probably want
to scan the PRIMARY key, which is the default). Even better, examine the difference in the Handler status counters
before and after running the query, and make sure it is not scanning the whole table every query.
You can disable the seek-then-scan optimizations partially or wholly with --no-ascend and --ascend-first.
Sometimes this may be more efficient for multi-column keys. Be aware that pt-archiver is built to start at the
beginning of the index it chooses and scan it forward-only. This might result in long table scans if you’re trying to
nibble from the end of the table by an index other than the one it prefers. See --source and read the documentation
on the i part if this applies to you.
Percona XtraDB Cluster
pt-archiver works with Percona XtraDB Cluster (PXC) 5.5.28-23.7 and newer, but there are three limitations you
should consider before archiving on a cluster:
Error on commit
pt-archiver does not check for error when it commits transactions. Commits on PXC can fail, but
the tool does not yet check for or retry the transaction when this happens. If it happens, the tool will die.
MyISAM tables
Archiving MyISAM tables works, but MyISAM support in PXC is still experimental at the time of
this release. There are several known bugs with PXC, MyISAM tables, and AUTO_INCREMENT
columns. Therefore, you must ensure that archiving will not directly or indirectly result in the use of
default AUTO_INCREMENT values for a MyISAM table. For example, this happens with --dest if
--columns is used and the AUTO_INCREMENT column is not included. The tool does not check for
this!
Non-cluster options
Certain options may or may not work. For example, if a cluster node is not also a slave, then
--check-slave-lag does not work. And since PXC tables are usually InnoDB, but InnoDB doesn’t
support INSERT DELAYED, then --delayed-insert does not work. Other options may also not
work, but the tool does not check them, therefore you should test archiving on a test cluster before archiv-
ing on your real cluster.
14 Chapter 3. pt-archiver