Comments on: pt-archiver Misbehaving With Secondary Index
https://www.percona.com/blog/pt-archiver-misbehaving-with-secondary-index/

By: Luke https://www.percona.com/blog/pt-archiver-misbehaving-with-secondary-index/#comment-10974246 Fri, 12 Jan 2024 03:16:49 +0000

I have a similar scenario. My concern here is that the SELECT pt-archiver generates cannot guarantee uniqueness and correctness:

SELECT /*!40001 SQL_NO_CACHE */ id,k,c,pad FROM dbtest.sbtest1 FORCE INDEX(k_1) WHERE (k < 1500000) AND ((k >= ?)) ORDER BY k LIMIT 1000

Indeed, if you archive with deletion of the original rows it will be fine, but if you give --no-delete, I suspect the SELECT will archive duplicate rows. For example, if there are 1,100 rows with k=100, the first run will pick 1,000 rows with the last k still being 100, so the next run becomes:

SELECT /*!40001 SQL_NO_CACHE */ id,k,c,pad FROM dbtest.sbtest1 FORCE INDEX(k_1) WHERE (k < 1500000) AND ((k >= 100)) ORDER BY k LIMIT 1000

There is no guarantee that MySQL will not pick rows already fetched in the first run.

In fact, I think the correct SELECT should be something like:

SELECT /*!40001 SQL_NO_CACHE */ id,k,c,pad FROM dbtest.sbtest1 FORCE INDEX(k_1) WHERE (k < 1500000) AND ((k = 100 AND id > ?) OR (k > 100)) ORDER BY k, id LIMIT 1000

The above query can both use the index k_1 and guarantee that no duplicate rows will be picked.
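Not Luke's code, just to make the idea concrete: a minimal sketch of the duplicate-free pagination loop this query implies, written with mysql-connector. The connection parameters and the archiving step are placeholders, not anything pt-archiver actually does.

# Hypothetical sketch: paginate on (k, id) so rows sharing the same k value
# are never fetched twice, per the keyset query above.
import mysql.connector

# Connection parameters are placeholders.
conn = mysql.connector.connect(host="localhost", user="user",
                               password="password", database="dbtest")
cur = conn.cursor()

last_k, last_id = -1, -1  # assumes non-negative k and id, as in sysbench tables
while True:
    cur.execute(
        "SELECT /*!40001 SQL_NO_CACHE */ id, k, c, pad "
        "FROM dbtest.sbtest1 FORCE INDEX(k_1) "
        "WHERE k < 1500000 AND ((k = %s AND id > %s) OR k > %s) "
        "ORDER BY k, id LIMIT 1000",
        (last_k, last_id, last_k),
    )
    rows = cur.fetchall()
    if not rows:
        break
    # ... write `rows` to the archive destination here (placeholder) ...
    last_id, last_k = rows[-1][0], rows[-1][1]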

By: Jean-François Gagné https://www.percona.com/blog/pt-archiver-misbehaving-with-secondary-index/#comment-10973796 Thu, 27 Oct 2022 16:33:53 +0000

I actually think doing things in Primary Key order is the right thing to do, but not the way pt-archiver is doing it currently.

By deleting using the index, there are probably a lot of random PK lookups done in the table, and if deleting more than one row per InnoDB page, it might end up loading this page many times into the buffer pool, which might waste IOs. Obviously, this will not be the case if you only delete a small percentage of the rows, but when deleting by a secondary index, there is a horrible worst-case scenario of loading the table many times.

By deleting things in PK order, the behavior is predictable (always reading the full table once), which is a better worst case than reading the table N times (let's say 16 times if there are 16 rows to delete in each page and this is done in random order with the table not fitting in RAM -> a 1 TB table might end up being read 16 times!).

But when deleting things in PK order, we need to be sure things are "bounded". The query below can read up to 4,000,000 rows (4M) if all rows from id 0 to 4M exist and have k > 1,500,000 (1.5M). This is very close to unbounded.

  • SELECT /*!40001 SQL_NO_CACHE */ id FROM dbtest.sbtest1 FORCE INDEX(PRIMARY) WHERE (k < 1500000) AND (id < '4000000') ORDER BY id LIMIT 1000

The right way of doing this is to first extract a bounded range (100K rows in this case, which is 40 times less than above and should be done in constant time). We start with :start_of_range being 0; the next one will be the id returned by this query, until it is null, in which case we need a last pass up to max(id):

  • SELECT /*!40001 SQL_NO_CACHE */ id as end_of_range FROM dbtest.sbtest1 FORCE INDEX(PRIMARY) WHERE (id >= :start_of_range) ORDER BY id LIMIT 1 OFFSET 100000

Then, in that range, delete all rows with k < 1.5M by running the query below repeatedly, deleting at most 1K rows each time:

  • DELETE FROM dbtest.sbtest1 WHERE (id >= :start_of_range) AND (id < :end_of_range) AND (k < 1500000) LIMIT 1000

Then grab the next range. The size of the range should be chosen so that it stays in the buffer pool while the deletes are done.
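Not part of the comment, just for concreteness: a minimal sketch of this range/delete loop, assuming a mysql-connector client with placeholder connection parameters. The refinement about delete-marked rows mentioned in the notes below is intentionally left out.

# Hypothetical sketch (not pt-archiver code) of the bounded-range delete loop
# described above. Connection parameters are placeholders.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="user",
                               password="password", database="dbtest")
cur = conn.cursor()

start_of_range = 0
while start_of_range is not None:
    # Find the end of a ~100K-row range by primary key.
    cur.execute(
        "SELECT /*!40001 SQL_NO_CACHE */ id FROM dbtest.sbtest1 "
        "FORCE INDEX(PRIMARY) WHERE id >= %s ORDER BY id "
        "LIMIT 1 OFFSET 100000",
        (start_of_range,),
    )
    row = cur.fetchone()
    end_of_range = row[0] if row else None  # None => last pass, no upper bound

    # Delete matching rows inside the range, at most 1000 per statement.
    while True:
        if end_of_range is not None:
            cur.execute(
                "DELETE FROM dbtest.sbtest1 "
                "WHERE id >= %s AND id < %s AND k < 1500000 LIMIT 1000",
                (start_of_range, end_of_range),
            )
        else:
            # Last pass: open-ended upper bound, equivalent to going up to max(id).
            cur.execute(
                "DELETE FROM dbtest.sbtest1 "
                "WHERE id >= %s AND k < 1500000 LIMIT 1000",
                (start_of_range,),
            )
        conn.commit()
        if cur.rowcount < 1000:
            break

    start_of_range = end_of_range  # grab the next range (None ends the loop)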

Note that the described algorithm is very lean in the number of ids fetched client-side (only the begin and end of each range), compared to what the other queries are doing (fetching large chunks of ids).

Also note that the described algorithm can still be improved, because the delete might scan delete-marked rows many times (when purge is lagging behind). Optimizing this is left as an exercise for the reader (hint: it involves a SELECT before the DELETE and narrowing down the range as the deletes progress).

By: Fernando Mario https://www.percona.com/blog/pt-archiver-misbehaving-with-secondary-index/#comment-10973795 Thu, 27 Oct 2022 12:15:58 +0000

Well done, Mike
