Comments on: 20X Faster Backup Preparation With Percona XtraBackup 8.0.33-28!

By: Satya Bodapati

Satya Bodapati — Thu, 16 Nov 2023 16:31:56 +0000

In reply to Jean-François Gagné. Good question, Jean-François. Server crash recovery doesn't have the same issue because it doesn't rely on SDI to build the dictionary. The server bootstraps the complete Data Dictionary (DD), and rollback asks the DD engine for the table schema of a given table_id. It is complicated for PXB to initialize the DD cache and all the required server dependencies, hence it relies on SDI. Sure, it is possible to query information_schema.innodb_tables for table_id and tablespace_id. But that would increase the backup lock duration, and it would work only with lock-ddl=ON. In the future, we want to reduce the time the backup lock is held, and an extra query under the lock would not fit well into that plan. During the prepare phase, we retrieve the same information via a direct B-tree scan of the dictionary tables. The overhead is minimal, and there is room to make it even faster. Essentially, PXB can do the same work without the lock and without using any resources on the server.
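
For illustration only, the server-side alternative mentioned above could look roughly like the sketch below. The query uses the TABLE_ID and SPACE columns of INFORMATION_SCHEMA.INNODB_TABLES (SPACE holds the tablespace ID); the helper name build_table_to_space and the idea of feeding it rows from a client cursor are assumptions made for the example. This is not what PXB does: as explained above, it reads the dictionary tables directly during prepare.

```python
# Query for the server-side alternative. In INFORMATION_SCHEMA.INNODB_TABLES
# the tablespace ID column is named SPACE.
MAPPING_QUERY = "SELECT TABLE_ID, SPACE FROM INFORMATION_SCHEMA.INNODB_TABLES"


def build_table_to_space(rows):
    """Build a table_id -> space_id dict (hypothetical helper).

    rows is any iterable of (table_id, space_id) pairs, for example the
    result of running MAPPING_QUERY through a MySQL client cursor.
    """
    return {int(table_id): int(space_id) for table_id, space_id in rows}
```

The query would have to run while the backup lock is held so that the mapping matches the copied files, which is the extra lock time the comment refers to.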

By: Jean-François Gagné

Jean-François Gagné — Wed, 15 Nov 2023 15:15:23 +0000

What is done by Percona Server for MySQL on crash recovery? Does it have the same problem (table_id to tablespace mapping)? Does it also have the memory problem, or does it also scan the data dictionary?

An alternative solution could have been to build this mapping at the backup phase: as all ibd files are read, it could be the time to extract the mapping for later use in prepare. Does the new design have upsides over this alternative design?

By: Satya Bodapati

Satya Bodapati — Wed, 02 Aug 2023 15:36:02 +0000

In reply to Frederic Descamps. Hi Frederic, thank you for reading the post!

1) The space_id -> file_name relation is established before the redo apply phase. So yes, before the fix there were two *.ibd scans. a) Scan 1 establishes the space_id -> file_name relation, which is required for redo to be applied. In this scan, we only look for the space_id. b) Scan 2 opens the ibds again, this time to read the complete page 0, page 3, and SDI index pages. We cannot combine the two scans: pages become consistent only after the redo apply phase is over (redo might be applied to the pages we read in Scan 2). After the fix, we avoid Scan 2, which costs more IO, deserialization, CPU, memory allocations, etc. We deserialize SDI only for the tables required for rollback.

2) Good question. With --lock-ddl=ON (the default), there are no changes to the data dictionary tables. We haven't tested the --lock-ddl=OFF scenario (not recommended by xtrabackup). Only DDLs that move tables across tablespaces cannot be handled well. There are other issues with --lock-ddl=OFF.

3) Here, undo looks up a table by "table_id". So the relation we want to establish is table_id -> space_id, not space_id -> file_name. The latter is already established before the redo scan.
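
To make the two relations concrete, here is a minimal Python sketch of the lookups described in this reply. It is a toy model, not PXB code: the class PrepareSketch, its method names, and the sample IDs and file names are all made up for illustration. It only shows the idea that the space_id -> file_name map comes from Scan 1, the table_id -> space_id map comes from the dictionary tables, and SDI is deserialized only when rollback asks for a table.

```python
from dataclasses import dataclass, field


@dataclass
class PrepareSketch:
    """Toy model of the prepare-phase lookups (hypothetical, not PXB code)."""

    # Scan 1: space_id -> file name, needed before redo can be applied.
    space_to_file: dict = field(default_factory=dict)
    # Read from the dictionary tables after redo apply: table_id -> space_id.
    table_to_space: dict = field(default_factory=dict)
    # Table definitions loaded so far, keyed by table_id.
    loaded: dict = field(default_factory=dict)
    # Counts how many SDI deserializations were actually performed.
    sdi_loads: int = 0

    def scan1(self, files):
        """Record the space_id found in each ibd file."""
        for space_id, file_name in files:
            self.space_to_file[space_id] = file_name

    def load_dictionary_map(self, rows):
        """Store (table_id, space_id) pairs taken from the dictionary tables."""
        self.table_to_space.update(rows)

    def table_for_rollback(self, table_id):
        """Load a table definition on demand, only when undo needs it."""
        if table_id not in self.loaded:
            space_id = self.table_to_space[table_id]
            file_name = self.space_to_file[space_id]
            self.sdi_loads += 1  # stands in for reading and parsing SDI
            self.loaded[table_id] = f"definition from SDI in {file_name}"
        return self.loaded[table_id]


if __name__ == "__main__":
    prep = PrepareSketch()
    prep.scan1([(10, "db/t1.ibd"), (11, "db/t2.ibd"), (12, "db/t3.ibd")])
    prep.load_dictionary_map([(1001, 10), (1002, 11), (1003, 12)])
    print(prep.table_for_rollback(1002))
    print("SDI loads:", prep.sdi_loads)
```

With three tablespaces and one table touched by an uncommitted transaction, the sketch performs a single SDI load instead of three, which is the saving described in point 1.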

By: Frederic Descamps

Frederic Descamps — Wed, 02 Aug 2023 15:05:53 +0000

Hello! Nice blog post. I have some comments/questions:

1) Even if you have the table_id -> space_id mapping, you would still need to open all the ibd files to know the space_id. The saving you get is in reading/processing the SDI info, right?
2) "It is done by scanning the B-tree pages of the data dictionary tables mysql.indexes and mysql.index_partitions" - What about in-flight transactions on these tables? IIUC, they aren't rolled back yet, right?
3) "How will XtraBackup know the tablespace (IBD) that contains the evicted table? It must scan every IBD again to find the evicted table." - How about creating a mapping from space_id to file name when you open each file the first time? The next open would then just look into this map to get the file to be opened.

Cheers