The information obtained from the “pt-summaries” discussed in the previous posts of this series (Part 1: The Percona Support Way, Part 2: Knowing the Server, Part 3: What MySQL?) helps us form a first impression of a MySQL server. However, apart from the quick glance we get at two samples of a select group of MySQL status variables, they provide what I call a “static” view of the server, akin to looking at a still picture of an engine. We get the chance to spot major discrepancies between the MySQL configuration and the resources available on the host, and to adjust those following some best practices, but the real tuning starts by observing how the engine performs in motion.
pt-stalk is the final tool in our Percona Toolkit trifecta: we use it to collect data from a running server, from both the OS and MySQL sides, so we can then analyze how the server is operating. For better gains, you should run pt-stalk during a period of high activity (ideally, peak time). It can also be used to collect data while the server is undergoing a performance issue, for troubleshooting purposes.
If you have the Percona Toolkit installed on the server, you can run pt-stalk as follows:
sudo pt-stalk --no-stalk --iterations=2 --sleep=30 -- --user=root --password=<mysql-root-pass>
A few observations:
- The options --no-stalk --iterations=2 --sleep=30 are used to collect two sets of diagnostic data with a 30-second interval between them; since --run-time is not defined, the default value of 30 seconds is employed, thus the command above provides one minute's worth of data.
- By default, pt-stalk will save the captured data under /var/lib/pt-stalk. You can choose a different destination directory with the option --dest.
- If your system is configured in such a way that you don’t need to provide credentials to access MySQL, you can remove the -- --user=root --password=<mysql-root-pass> section.
- Many of the data collection tools used by pt-stalk are part of the sysstat package, so make sure it is installed on the server.
Let’s have a look at the data collected.
A Mix of OS and MySQL Diagnostics Data
Part of the data collected by pt-stalk is purely OS-related, which makes the tool partially useful for inspecting other kinds of Linux servers as well. The other part is MySQL-specific. Here’s the list of files generated by the command above on one of my test servers – note there are two groups of files below, prefixed with timestamps set 30 seconds apart:
$ ls /var/lib/pt-stalk/
2021_05_01_12_55_30-df               2021_05_01_12_56_00-df
2021_05_01_12_55_30-disk-space       2021_05_01_12_56_00-disk-space
2021_05_01_12_55_30-diskstats        2021_05_01_12_56_00-diskstats
2021_05_01_12_55_30-hostname         2021_05_01_12_56_00-hostname
2021_05_01_12_55_30-innodbstatus1    2021_05_01_12_56_00-innodbstatus1
2021_05_01_12_55_30-innodbstatus2    2021_05_01_12_56_00-innodbstatus2
2021_05_01_12_55_30-interrupts       2021_05_01_12_56_00-interrupts
2021_05_01_12_55_30-iostat           2021_05_01_12_56_00-iostat
2021_05_01_12_55_30-iostat-overall   2021_05_01_12_56_00-iostat-overall
2021_05_01_12_55_30-log_error        2021_05_01_12_56_00-log_error
2021_05_01_12_55_30-lsof             2021_05_01_12_56_00-lsof
2021_05_01_12_55_30-meminfo          2021_05_01_12_56_00-meminfo
2021_05_01_12_55_30-mpstat           2021_05_01_12_56_00-mpstat
2021_05_01_12_55_30-mpstat-overall   2021_05_01_12_56_00-mpstat-overall
2021_05_01_12_55_30-mutex-status1    2021_05_01_12_56_00-mutex-status1
2021_05_01_12_55_30-mutex-status2    2021_05_01_12_56_00-mutex-status2
2021_05_01_12_55_30-mysqladmin       2021_05_01_12_56_00-mysqladmin
2021_05_01_12_55_30-netstat          2021_05_01_12_56_00-netstat
2021_05_01_12_55_30-netstat_s        2021_05_01_12_56_00-netstat_s
2021_05_01_12_55_30-opentables1      2021_05_01_12_56_00-opentables1
2021_05_01_12_55_30-opentables2      2021_05_01_12_56_00-opentables2
2021_05_01_12_55_30-output           2021_05_01_12_56_00-output
2021_05_01_12_55_30-pmap             2021_05_01_12_56_00-pmap
2021_05_01_12_55_30-processlist      2021_05_01_12_56_00-processlist
2021_05_01_12_55_30-procstat         2021_05_01_12_56_00-procstat
2021_05_01_12_55_30-procvmstat       2021_05_01_12_56_00-procvmstat
2021_05_01_12_55_30-ps               2021_05_01_12_56_00-ps
2021_05_01_12_55_30-slabinfo         2021_05_01_12_56_00-slabinfo
2021_05_01_12_55_30-slave-status     2021_05_01_12_56_00-slave-status
2021_05_01_12_55_30-sysctl           2021_05_01_12_56_00-sysctl
2021_05_01_12_55_30-top              2021_05_01_12_56_00-top
2021_05_01_12_55_30-trigger          2021_05_01_12_56_00-trigger
2021_05_01_12_55_30-variables        2021_05_01_12_56_00-variables
2021_05_01_12_55_30-vmstat           2021_05_01_12_56_00-vmstat
2021_05_01_12_55_30-vmstat-overall   2021_05_01_12_56_00-vmstat-overall
We won’t be examining all of these in detail here; I’ll focus on the most important ones (in general), following the order (and logic) I usually employ. As was the case in the previous posts, the excerpts of data used in the following sections do not all come from the same pt-stalk collection, nor even from the same server; they were assembled from different sources to better illustrate the case at hand.
Was the Server Under Moderate or High Load When the Data was Captured?
My first stop tends to be the first top sample; there we can see reasonably clearly whether the server was under high load when pt-stalk was run, and for how long:
top - 11:23:16 up 14 days, 23:54,  4 users,  load average: 20.17, 17.82, 14.98
Tasks: 262 total,   1 running, 261 sleeping,   0 stopped,   0 zombie
%Cpu(s): 78.2 us,  6.9 sy,  0.0 ni, 14.2 id,  0.4 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem : 13186303+total,   658972 free, 11101868+used, 20185376 buff/cache
KiB Swap:  8388604 total,  8215548 free,   173056 used. 20041612 avail Mem

  PID USER   PR NI    VIRT    RES    SHR S  %CPU %MEM   TIME+   COMMAND
 3408 mysql  20  0  112.4g  104.9g 141832 S  1169 83.4  3761:10 mysqld
16379 root   20  0  113652    1340    608 S   6.2  0.0  0:00.01 bash
16530 root   20  0  162020    2180   1528 R   6.2  0.0  0:00.01 top
    1 root   20  0   52148    3448   1940 S   0.0  0.0  4:37.76 systemd
In the sample above, we can see the server was undergoing an increase in load, averaging close to 15 over the past 15 minutes, 18 over the past 5 minutes, and 20 in the last minute. Just remember that the load metric must be interpreted relative to the number of cores in the server. This other sample below may look scarier, but it comes from a box with 192 cores; while this is an oversimplification of the metric, we could say the server was operating at (76.90/192 ≈) 40% of its maximum processing capacity in the last minute:
top - 15:30:26 up 258 days, 21 min, 16 users,  load average: 76.90, 78.63, 80.69
Tasks: 1711 total,   1 running, 1710 sleeping,   0 stopped,   0 zombie
%Cpu(s): 23.2 us,  6.5 sy,  0.0 ni, 59.7 id, 10.0 wa,  0.0 hi,  0.7 si,  0.0 st
KiB Mem : 15847333+total,  8529432 free, 78651558+used, 78968825+buff/cache
KiB Swap: 23488102+total, 23441689+free,   464128 used. 79652627+avail Mem

  PID USER   PR NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
45930 mysql  20  0  652.5g  645.5g  10128 S  2163 42.7 160718:24 mysqld
91413 apuser 20  0   34.8g    9.1g 164012 S 368.4  0.6   1037:45 java
11450 root   20  0  173092    4332   1920 R  21.1  0.0   0:00.05 top
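That normalization can be scripted as a quick sanity check. This is a sketch with the 1-minute load average and core count hardcoded from the sample above; on a live server you would read them from /proc/loadavg and nproc instead:

```shell
# Normalize the 1-minute load average by the number of cores.
# 76.90 and 192 come from the top sample above; on a live box use:
#   load=$(cut -d' ' -f1 /proc/loadavg); cores=$(nproc)
load=76.90
cores=192
awk -v l="$load" -v c="$cores" 'BEGIN { printf "%.0f%% of capacity\n", 100 * l / c }'
```

A result well above 100% means more runnable tasks than cores, i.e. work is queuing for CPU.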
It is not a problem if the data was captured under low or moderate load; it just changes the way we should look at and analyze the rest of it.
We can also get a good idea of overall memory usage, and how much of it was allocated to MySQL. From there, I like to take a sneak peek at meminfo as well; sometimes we find a server with a sizable part of its memory converted into transparent huge pages (THP):
TS 1616111316.005848775 2021-03-18 23:48:36
MemTotal:       197910740 kB
MemFree:          3300968 kB
MemAvailable:    15070884 kB
Buffers:           137580 kB
Cached:          11954976 kB
SwapCached:             0 kB
Active:         178836988 kB
Inactive:         7812608 kB
Active(anon):   174796296 kB
Inactive(anon):    162712 kB
Active(file):     4040692 kB
Inactive(file):   7649896 kB
Unevictable:            0 kB
Mlocked:                0 kB
SwapTotal:              0 kB
SwapFree:               0 kB
(...)
AnonHugePages:  108390400 kB
No swap space and 55% of the total memory converted into THP doesn’t look that good.
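That 55% figure can be derived directly from the meminfo sample. Here is a small awk sketch fed with the two relevant lines from above; on a collected data set you would point it at the *-meminfo file instead of the here-document:

```shell
# Share of total memory sitting in transparent huge pages (values in kB),
# using the MemTotal and AnonHugePages lines from the sample above.
awk '/^MemTotal:/ { total = $2 } /^AnonHugePages:/ { thp = $2 }
     END { printf "THP: %.0f%% of RAM\n", 100 * thp / total }' <<'EOF'
MemTotal:       197910740 kB
AnonHugePages:  108390400 kB
EOF
```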
Depicting CPU Usage and I/O
My next stop is the mpstat sample, here’s one from a 16-core server under high load:
11:23:16 AM  CPU   %usr  %nice   %sys %iowait   %irq  %soft %steal %guest %gnice  %idle
11:23:17 AM  all  77.51   0.00   1.50    0.94   0.00   0.25   0.00   0.00   0.00  19.80
11:23:17 AM    0  91.92   0.00   1.01    0.00   0.00   0.00   0.00   0.00   0.00   7.07
11:23:17 AM    1  61.86   0.00   2.06    3.09   0.00   0.00   0.00   0.00   0.00  32.99
11:23:17 AM    2  67.68   0.00   3.03    1.01   0.00   0.00   0.00   0.00   0.00  28.28
11:23:17 AM    3  81.00   0.00   2.00    0.00   0.00   0.00   0.00   0.00   0.00  17.00
11:23:17 AM    4 100.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00   0.00   0.00
11:23:17 AM    5  55.56   0.00   2.02    3.03   0.00   0.00   0.00   0.00   0.00  39.39
11:23:17 AM    6  73.74   0.00   3.03    0.00   0.00   0.00   0.00   0.00   0.00  23.23
11:23:17 AM    7  62.63   0.00   1.01    0.00   0.00   0.00   0.00   0.00   0.00  36.36
11:23:17 AM    8  56.00   0.00   3.00    3.00   0.00   1.00   0.00   0.00   0.00  37.00
11:23:17 AM    9  82.00   0.00   1.00    2.00   0.00   0.00   0.00   0.00   0.00  15.00
11:23:17 AM   10  98.02   0.00   0.00    0.00   0.00   0.99   0.00   0.00   0.00   0.99
11:23:17 AM   11  67.68   0.00   1.01    1.01   0.00   0.00   0.00   0.00   0.00  30.30
11:23:17 AM   12  79.80   0.00   2.02    0.00   0.00   0.00   0.00   0.00   0.00  18.18
11:23:17 AM   13 100.00   0.00   0.00    0.00   0.00   0.00   0.00   0.00   0.00   0.00
11:23:17 AM   14  87.88   0.00   2.02    0.00   0.00   0.00   0.00   0.00   0.00  10.10
11:23:17 AM   15  76.24   0.00   0.99    0.99   0.00   0.99   0.00   0.00   0.00  20.79
MySQL should run as a regular, non-superuser account (named mysql by default), so all database work should be accounted for in the usr column above (which, on a dedicated server, is essentially all MySQL). In the sample above we see two cores pegged at 100% utilization; unless this sample came from an exceptional period of high load, this is a fine example of a server we should probably not be running with the powersave CPU scaling governor (as seen in the 2nd post of this series) but with all cores running at full speed instead.
In the sample above we find a very good balance of load: all cores are working, and working hard. But picture a similar sample where only one of several cores shows 100% usr load: on a replica, this is a sign that replication is operating in single-threaded mode. Back in the days of MySQL 5.5, and even 5.6, this was mostly the norm: we sometimes saw servers with one core running at 100% all the time and replication lag continuously increasing. This was a time when replicas with fewer but faster cores excelled. Multi-threaded replication appeared in MySQL 5.6, but it was limited to one database per (replication) thread; if your whole schema was organized around a single database, or if one database received most of the writes, it didn’t help much. Multi-threaded replication really blossomed in MySQL 5.7, with the introduction of the LOGICAL_CLOCK mode. The variables file contains all MySQL settings, and you can use it to check how replication is configured on a replica:
$ grep "slave_parallel_" *-variables
slave_parallel_type     LOGICAL_CLOCK
slave_parallel_workers  8
Another important column to monitor in mpstat is iowait: if you see values repeatedly above 1.0, particularly under moderate load, it may be a sign that the storage subsystem is a limiting factor. Double-check in the iostat sample whether the average queue size (avgqu-sz) is often greater than 1:
Device:  rrqm/s  wrqm/s   r/s     w/s    rkB/s    wkB/s  avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sdb        0.00    4.00  0.00 1187.00     0.00 58873.33     99.20     4.71   3.97    0.00    3.97   0.04   5.20
sda        0.00    0.00  0.00    0.33     0.00     1.33      8.00     0.00   0.00    0.00    0.00   0.00   0.00
dm-0       0.00    0.00  0.00    0.00     0.00     0.00      0.00     0.00   0.00    0.00    0.00   0.00   0.00
dm-1       0.00    0.00  0.00 1187.33     0.00 58841.33     99.12     4.76   4.01    0.00    4.01   0.04   5.20
In the sample above, we can see that sdb is already saturated at around 1,000 write IOPS. Sometimes the I/O capacity of the system is overestimated and the related settings are set too high, making InnoDB believe it has more IOPS at its disposal than there actually are:
$ grep "innodb_io_capacity" *-variables
innodb_io_capacity      1000
innodb_io_capacity_max  10000
It is good practice to measure the practical IOPS limit your disks can sustain. Overestimating innodb_io_capacity may actually lead to worse performance and stalls in response times. For more information on this, see Tuning MySQL/InnoDB Flushing for a Write-Intensive Workload.
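When skimming many iostat samples, a quick filter helps surface the saturated devices. This is a sketch that assumes the column layout shown above, where avgqu-sz is the 9th field; it is fed with an inline copy of the sample, but on a collected data set you would pipe the *-iostat file through it:

```shell
# Print devices whose average queue size (avgqu-sz, field 9 in this
# iostat -x format) exceeds 1. The "+ 0" coerces the header's text
# field to 0 so the header line is skipped.
awk '$9 + 0 > 1 { printf "%s queue depth %.2f\n", $1, $9 }' <<'EOF'
Device:  rrqm/s  wrqm/s   r/s     w/s    rkB/s    wkB/s  avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
sdb        0.00    4.00  0.00 1187.00     0.00 58873.33     99.20     4.71   3.97    0.00    3.97   0.04   5.20
sda        0.00    0.00  0.00    0.33     0.00     1.33      8.00     0.00   0.00    0.00    0.00   0.00   0.00
dm-1       0.00    0.00  0.00 1187.33     0.00 58841.33     99.12     4.76   4.01    0.00    4.01   0.04   5.20
EOF
```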
The last stop in my usual tour around the OS data collected by pt-stalk is the vmstat sample, which I mostly use to check if there was some swapping taking place when the data was collected:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b    swpd    free   buff    cache   si   so    bi    bo     in    cs us sy id wa st
28  2  172544  676900 160176 20025276    0    0   158   243      1     1 16  1 83  0  0
29  0  172544  672692 160312 20027832  260    0  1064  3588  22249  3959 95  2  3  0  0
17  0  172544  669032 160316 20028860   52    0   348  3436  20754  5070 78  2 20  1  0
(...)
13  0  172544  697792 160536 19998792    0    0   576  3600  19421  5221 73  2 24  1  0
21  0  173056  705116 160532 19991780    0  300  1744  3936 183103  4090 86  3 10  1  0
11  0  173056  696420 160540 19999820    0   88  7360  3548  48524  4451 82  2 16  0  0
21  0  173056  692664 160540 20003536    0    0  3280  3716  20936  4711 88  2  9  1  0
While swapping is detrimental to performance, as discussed in the 2nd post, I’m one to advocate in favor of having swap space configured on a server in most cases – just make sure to set swappiness to a minimum, as explained there. You can check that this is the case by looking at the sysctl sample:
$ grep swappiness *-sysctl
vm.swappiness = 1
MySQL in Motion
This is my favorite part: the mysqladmin files contain samples of SHOW GLOBAL STATUS. While it is difficult to make sense of the raw data, pt-mext, another tool from the Percona Toolkit, can process them and provide a clear view of how the status counters vary over time, using the first sample in the series as the base reference. Here’s how you can use it:
pt-mext -r -- cat 2021_05_01_12_55_30-mysqladmin > ptmext.out
Let the fun begin!
Does My Hot Data Fit in Memory?
This is important information to have: is the InnoDB Buffer Pool big enough to store the data that is accessed most often? If it is, very little data needs to be fetched from the tablespace files on disk (which is slow) because most of it is already found in memory. We get that ratio from two InnoDB status counters:
$ grep "Innodb_buffer_pool_read_requests\|Innodb_buffer_pool_reads" ptmext.out
Innodb_buffer_pool_read_requests 69496091591 4040274 3636485 3475398 3509297 3324067 2790171 3811242 3820111 4472993 4240323 3603441 4636822 4713943 4880787 4032916 4328607 3959668 4187900 3883735 3850137 4166323 3933372 3628560 4101771 2766796 3220822 4093628 3581942
Innodb_buffer_pool_reads 9420027 12 17 7 11 28 21 16 53 5 4 17 10 9 14 9 14 5 30 4 6 7 4 10 1 115 8 9 56
- Innodb_buffer_pool_read_requests indicates the number of (logical) read requests
- Innodb_buffer_pool_reads indicates the number of read requests that could not be satisfied from data already present in the Buffer Pool; the data had to be read from disk and loaded into the Buffer Pool first before being served.
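From the base values of these two counters we can compute an overall miss rate – the fraction of read requests that had to go to disk. A quick awk sketch, using the numbers from the excerpt above:

```shell
# Buffer pool miss rate = reads-from-disk / total logical read requests.
# 9420027 and 69496091591 are the base values from the excerpt above.
awk 'BEGIN {
  reads = 9420027; requests = 69496091591
  printf "miss rate: %.4f%%\n", 100 * reads / requests
}'
```

A miss rate this low suggests the hot data set fits comfortably in the Buffer Pool.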
The “magic” of pt-mext is shown above: the first value presented for each metric comes from the first sample collected, which provides a base reference, while the value for each subsequent sample, taken 1 second apart, is the difference from the previous sample. This way, we get a clear idea of how the metric changes over time.
Here are the values for the first four samples of Innodb_buffer_pool_reads used in the excerpt above, to better illustrate how this works:
Raw value | pt-mext
9420027   | 9420027
9420039   | +12
9420056   | +17
9420063   | +7
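The same delta transformation can be reproduced with a short awk one-liner – a sketch, not a substitute for pt-mext, fed with the four raw values from the table above:

```shell
# Print the first raw value as the base reference, then the delta
# between each consecutive sample, pt-mext style.
echo "9420027 9420039 9420056 9420063" | awk '{
  printf "%s", $1
  for (i = 2; i <= NF; i++) printf " +%d", $i - $(i-1)
  print ""
}'
```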
Looking at Temporary Tables
Another important aspect we discussed in the last post, key to MySQL tuning, is the use of temporary tables. If too many of them are being created every second:
- Your queries may need some optimization;
- They may take up too much of the available memory, and the situation may become critical (as in running out of memory) during peaks of high activity/concurrency.
There’s another element to it: if they grow bigger than the threshold for temporary tables to remain in memory:
$ grep "tmp_table_size\|max_heap_table_size" *-variables
max_heap_table_size  209715200
tmp_table_size       209715200
they will be moved to disk (becoming on-disk temporary tables) and your queries will take longer to complete, which may also create a vicious cycle during moments of high concurrency. Here’s a somewhat extreme example from a server with only 126G of RAM and a Buffer Pool of 60G:
Created_tmp_disk_tables 61327112 73 523 501 448 518 421 533 540 464 618 741 528 332 216 522 400 113 8 7 363 208 575 519 396 400 483 610 236
Created_tmp_tables 134023235 115 546 525 474 542 452 550 557 493 641 763 548 356 250 558 419 129 23 19 406 234 595 542 420 419 511 637 271
- Created_tmp_tables indicates the total number of temporary tables created by the server, be it in-memory or on-disk.
- Created_tmp_disk_tables only accounts for temporary tables created on-disk
In the excerpt above, we see a high number of temporary tables created every second, with most of them being created as, or converted to, on-disk temporary tables. Considering that they are created in memory and only moved to disk once they reach the threshold set by tmp_table_size, the amount of memory allocated to temporary tables during peak time may surpass what remains available on the server. That is what was happening with this server until the buffer pool was downsized to its current value of 60G as a short-term solution; despite having swappiness set to 1 and theoretically available memory in the OS cache, swap space was almost depleted due to memory pressure:
top - 15:54:29 up 126 days,  1:18,  3 users,  load average: 4.17, 4.34, 4.70
Tasks: 445 total,   4 running, 441 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.0 us,  6.3 sy,  0.0 ni, 71.8 id, 13.5 wa,  0.0 hi,  0.4 si,  0.0 st
KiB Mem : 13176490+total,  1973120 free, 10337484+used, 26416932 buff/cache
KiB Swap: 10485756 total,   562384 free,  9923372 used. 26613916 avail Mem

  PID USER   PR NI    VIRT   RES   SHR S  %CPU %MEM    TIME+   COMMAND
11830 mysql  20  0  111.4g 93.8g  8760 S 175.0 74.6 10332:00   mysqld
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b     swpd    free  buff    cache    si   so    bi    bo     in    cs us sy id wa st
18  4  9923884 1988808    40 26417080     1    1    33   584      0     0  2  1 97  0  0
 5  3  9919788 1980508    40 26424644  5556    0 17444 30576  42479 53317  4  2 82 11  0
 2  3  9912876 1970328    40 26427844  7872    0 21856 35881  50203 67962  6  2 82 11  0
 2  3  9903404 1955664    40 26433916 10992    0 24208 41452  58789 76326  6  2 82 10  0
 3  4  9893932 1938188    40 26437384 11008    0 25712 39040  54518 65092  6  2 82 11  0
 5  3  9888044 1926884    40 26442912  7776    0 27152 37513  75462 82150 17  3 71  9  0
 2  4  9881644 1917948    40 26445612  7336    0 23288 32787  44001 53016  7  2 79 12  0
 5  4  9869612 1896296    40 26450052 15308    0 30748 34702  50177 58230  9  2 77 12
The long-term solution for them was to optimize their queries so as to decrease the number and size of the temporary tables they create.
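A back-of-envelope calculation illustrates the risk. In the worst case, each active connection can materialize an in-memory temporary table up to tmp_table_size bytes; the concurrency figure of 64 connections below is a hypothetical example, not taken from the server above:

```shell
# Worst-case memory that in-memory temporary tables could claim:
# tmp_table_size (200 MiB, from the variables excerpt above) times a
# hypothetical 64 connections each materializing one temporary table.
awk 'BEGIN {
  tmp_table_size     = 209715200   # bytes, from the variables file
  active_connections = 64          # hypothetical peak concurrency
  printf "worst case: %.1f GiB\n", tmp_table_size * active_connections / 2^30
}'
```

Note a single query can create more than one temporary table, so the real worst case can be even higher.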
Contention at the Table Cache Level
When a connection runs a query that accesses a given table, the table is opened and the Opened_tables counter is incremented. If ten connections run the same query concurrently, the table will be opened ten times; the reference to a given opened table is not shared between client sessions because each may have a different state set for the target table. Note this impacts the number of file descriptors used by MySQL.
For performance reasons, the reference to each opened table is kept in a cache. When that cache is full and a session tries to open a table that is not in it, MySQL needs to make space, either by finding a table that is no longer in use by any thread and recycling its entry in the table cache, or by temporarily expanding the table cache.
The general advice is that if the value of Opened_tables increases constantly, it’s a sign the table cache might be too small:
Opened_tables 17369503 123 179 187 228 307 296 350 209 208 192 219 308 183 316 260 312 291 123 237 152 146 208 273 284 335 110 238
I'd rather look at the number of cache misses and overflows, which provide a more direct indication of contention at the table cache level:
Table_open_cache_hits 697915937 5362 7645 6461 4789 4815 5912 6569 5989 7260 4128 5892 7822 4265 7562 5032 6892 6451 2865 5740 5475 3744 5568 6137 4688 5687 1940 3743
Table_open_cache_misses 17355555 123 179 187 228 307 296 350 209 208 192 219 308 183 316 260 312 290 123 237 152 146 208 273 284 335 110 238
Table_open_cache_overflows 17344722 123 179 187 228 307 296 350 209 208 192 219 308 183 316 260 312 290 123 237 152 146 208 273 284 335 110 238
As discussed in the previous post, pay attention to the relative size of each table cache instance:
$ grep table_open_cache *-variables
table_open_cache            10708
table_open_cache_instances  16
Ever since the table cache was partitioned to perform better in highly concurrent environments, it is no longer one big cache with table_open_cache entries: that value is now divided by the number of table_open_cache_instances. If the number of entries per instance is low, it won’t scale well; you may need to increase table_open_cache, or decrease the number of instances.
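The per-instance figure for the server above is a one-line division – a trivial sketch, with the two values hardcoded from the variables excerpt:

```shell
# Entries available per table cache instance:
# table_open_cache (10708) / table_open_cache_instances (16).
awk 'BEGIN { printf "%d entries per instance\n", 10708 / 16 }'
```

Whether ~670 entries per instance is enough depends on how many distinct tables the workload touches concurrently.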
There are many other status variables we could cover here, but I hope the examples above give you the general idea of how pt-mext can be used to process the mysqladmin samples. We shall keep going, though: there are two essential pieces of data we cannot leave uncovered.
Looking Inside the (Real) Engine
And that would be InnoDB. Yes, many InnoDB-related metrics are covered by the status variables, but there is precious information about the activity of the storage engine we can only find in the output of SHOW ENGINE INNODB STATUS (“SEIS”), which we find in the innodbstatus files.
There is a lot of information in there, and I reckon it is common to feel a bit lost when trying to make sense of it all. The trick is to break it down into sections; you will usually focus on a subset of them, according to what you are looking for.
Transactions Running
InnoDB is a transactional storage engine and SEIS is one of the best places to look for information about transactions. Open one of the innodbstatus files (two are collected for each of the iterations executed) and have a general look. Before I do that myself, I like to get a general idea of the number of transactions inside InnoDB; I use grep for this:
$ grep "\---TRANSACTION" innodbstatus | wc -l
2503
That’s a lot of transactions, but it doesn’t mean they are all running:
$ grep "\---TRANSACTION" innodbstatus | grep ACTIVE | wc -l
273
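Beyond counting, you can rank the ACTIVE transactions by how long they have been running. This sketch parses the "---TRANSACTION <id>, ACTIVE <n> sec" lines; it is fed with fabricated example lines here, but on a collected data set you would grep the *-innodbstatus1 file and append | head:

```shell
# Rank ACTIVE transactions by runtime, longest first. Field layout:
# $2 is the transaction id (with a trailing comma), $4 the seconds active.
grep "ACTIVE" <<'EOF' | awk '{ gsub(",", "", $2); printf "%s sec  trx %s\n", $4, $2 }' | sort -rn
---TRANSACTION 48793810, ACTIVE 293183 sec
---TRANSACTION 48911201, ACTIVE 12 sec
---TRANSACTION 48911342, ACTIVE 907 sec fetching rows
EOF
```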
If you have a lot of active transactions running inside InnoDB and they are taking too long to complete – each getting just a slice of CPU cycles at a time and then going back to waiting, generating a lot of context switches – you may benefit from limiting the maximum number of transactions allowed inside InnoDB at any one time; the rest will have to wait at the door, a concept not much different from using a connection pooler. The variable that controls this is innodb_thread_concurrency: the default value of zero means no limit is in place.
This is one of the best examples I have in mind when I think about “tuning the engine for the race track”: the workload dictates how MySQL should be tuned. In a high-concurrency system, if there are queries that compete for the same resources and block each other, it may be better to allow only a fraction of them to run at a time so they complete faster and leave. In cases like this, how many threads should be allowed to run concurrently is a matter of experimentation: try values that are multiples of the number of cores in the server.
Long-Running Transactions
You will find the oldest, longest-running transactions at the bottom of the list of transactions: if you are troubleshooting a performance problem, there is a good chance those transactions are involved – either as culprits or as victims.
Long-running transactions can be very detrimental to overall server performance: they drive the InnoDB history list length up and can hold locks needed by other queries to complete. Here’s an example of one:
---TRANSACTION 48793810, ACTIVE 293183 sec
106688 lock struct(s), heap size 10887376, 1642661 row lock(s), undo log entries 3698942
MySQL thread id 349, OS thread handle 140316247336704, query id 64765183 myserver.hostname.com 172.x.x.x myuser
Trx read view will not see trx with id >= 48793810, sees < 48793810
You can investigate these further by looking them up in the respective transactions file:
*************************** 2. row ***************************
                    trx_id: 48793810
                 trx_state: RUNNING
               trx_started: 2020-08-21 00:00:33
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 3805639
       trx_mysql_thread_id: 349
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 11
          trx_lock_structs: 106688
     trx_lock_memory_bytes: 10887376
           trx_rows_locked: 1642663
         trx_rows_modified: 3698951
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
processlist file:
*************************** 1. row ***************************
           Id: 349
         User: myuser
         Host: myserver.hostname.com:60260
           db: mydatabase
      Command: Sleep
         Time: 1
        State:
         Info: NULL
    Rows_sent: 0
Rows_examined: 0
and lock-waits file:
*************************** 1. row ***************************
   who_blocks: thread 349 from myserver.hostname.com:60260
  idle_in_trx: 1
max_wait_time: 49
  num_waiters: 10
I reckon I’m stopping short of providing a complete procedure for investigating locking issues; that is another subject that warrants more attention.
Checkpointing and Redo Log Space
We briefly discussed the importance of tuning the redo log space (innodb_log_file_size x innodb_log_files_in_group) in the previous post, and how it should be big enough to optimize the checkpointing mechanism. There is an old formula that can still be used for estimation, and the data for it can be found in SEIS. Here we use the first innodbstatus file from each of the two samples:
$ grep "Log sequence number" 2020_08_24_15_*-innodbstatus1
2020_08_24_15_30_25-innodbstatus1:Log sequence number 66452632416280
2020_08_24_15_30_55-innodbstatus1:Log sequence number 66453484080767
We can obtain the number of bytes written in this 30-second window and make a rough projection for one hour:
(66453484080767 - 66452632416280) * 2 * 60 ~= 95G/h
Such a projection is very dependent on the moment the SEIS samples were captured; you may want to track the Log sequence number over an extended period, but do note it is important that the redo log be sized for the most demanding write periods, which usually take place at peak time. You certainly don’t need to have a full hour covered; 15-30 minutes should be enough for checkpointing to operate in an optimized way:
20min ~= ( innodb_log_file_size → 16G ) x ( innodb_log_files_in_group → 2 )
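The whole estimation can be scripted in one pass. This sketch hardcodes the two LSNs from the grep output above; the LSN is a byte offset, so the difference between two samples 30 seconds apart, times 120, gives bytes written per hour:

```shell
# Project redo write rate from two LSNs captured 30 seconds apart,
# then size the redo space to cover ~20 minutes (1/3 hour) of writes.
awk 'BEGIN {
  lsn1 = 66452632416280; lsn2 = 66453484080767
  per_hour = (lsn2 - lsn1) * 2 * 60            # bytes written per hour
  printf "redo rate: %.0f GiB/h\n", per_hour / 2^30
  printf "20-min redo space: %.0f GiB\n", per_hour / 3 / 2^30
}'
```

The ~32 GiB result matches the 2 x 16G redo log files suggested above.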
Adaptive Hash Index
InnoDB can make use of a special feature to index values that are accessed frequently, called the Adaptive Hash Index, or AHI for short. As the name suggests, these are hash indexes built in memory, on top of the B-tree table indexes we usually talk about, and they allow for very fast hashed lookups. But the AHI won’t be helpful in workloads where most of the searches are non-hash searches:
$ grep "hash searches" innodbstatus
19000.00 hash searches/s, 1955000.00 non-hash searches/s
In the excerpt above, we find that only 1% of the searches are “hashable”: in cases like this, disabling the AHI (innodb_adaptive_hash_index) can improve performance. The AHI has a bigger impact (be it positive or negative) on read-only workloads, so it may make sense to have it enabled on your replicas but not on the master – or vice versa. This highlights another interesting point about inspecting database servers: you will very likely find different variations of your workload if you split traffic into writes (master) and reads (replicas); you should analyze (and configure) your servers independently.
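That ~1% figure follows directly from the two rates in the excerpt; a quick awk check with those values hardcoded:

```shell
# Share of searches served by the adaptive hash index:
# hash searches / (hash + non-hash searches), rates from the excerpt above.
awk 'BEGIN {
  hash = 19000; nonhash = 1955000
  printf "hashable searches: %.1f%%\n", 100 * hash / (hash + nonhash)
}'
```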
In other cases, the rate of hashable searches can be predominant but the system may still suffer contention from this mechanism. This used to be the case in MySQL 5.6 and earlier versions, when “the adaptive hash index search system was protected by a single latch (btr_search_latch) which could become a point of contention”. This is apparent when we see many threads waiting on an RW-latch created in btr0sea.c in the upper part of the SEIS output, for example. This has been mitigated in MySQL 5.7 with the possibility of partitioning the AHI search mechanism (innodb_adaptive_hash_index_parts): “each index is bound to a specific partition, with each partition protected by a separate latch”.
If you are in doubt about the effectiveness of the AHI for your workload and can test operating with it disabled for comparison, that would be a simple path to pursue.
A Quick Note on Processlist
We mentioned the processlist samples briefly above: they contain basic information about ongoing connections to MySQL, such as where they come from and what they are currently doing, although not always in exact terms. There are moments when we need to inspect this file in search of a specific thread, as we did earlier, but we can also extract higher-level stats from these samples. The following command will count connections by Command:
$ cat processlist | grep Command: | sort | uniq -c | sort -rn
   2750 Command: Sleep
   1226 Command: Query
     36 Command: Connect
     29 Command: Binlog Dump
      2 Command: Execute
Please note you cannot simply take the overall raw numbers above at face value: each processlist file contains 29 samples of SHOW FULL PROCESSLIST; averaging these values does not provide an accurate view of the system at a given moment, but it does provide a pretty good general representation of it. Sorting by State is even better in this sense:
$ cat processlist | grep State: | sort | uniq -c | sort -rn
   2750 State:
    548 State: Opening tables
    369 State: statistics
    103 State: Sending data
     68 State: starting
     31 State: query end
     31 State: Creating sort index
     29 State: Waiting for the slave SQL thread to free enough relay log space
     27 State: update
     18 State: executing
     17 State: Master has sent all binlog to slave; waiting for more updates
     14 State: Sending to client
      7 State: updating
      6 State: Receiving from client
      6 State: preparing for alter table
      5 State: checking permissions
      4 State: Waiting for table metadata lock
      4 State: closing tables
      3 State: System lock
      1 State: Sending binlog event to slave
      1 State: preparing
      1 State: creating table
By looking at the excerpt above, I would be inclined to further examine the efficiency of the table cache and that of the redo log space.
One last point that almost didn’t make this document: the MyISAM storage engine has its days numbered. In MySQL 8.0, even the system tables are no longer MyISAM. Still, in certain situations you may resort to MyISAM for temporary tables (default_tmp_storage_engine). Just keep the following in mind: for obvious reasons, MyISAM tables are not featured in the output of SHOW ENGINE INNODB STATUS, so if you have a performance issue involving a query and you can’t find anything about it in SEIS, chances are the underlying tables are not InnoDB; look for clues about the query in SHOW PROCESSLIST instead.
Daemon Mode
You can have pt-stalk running in the background as a daemon, ready to trigger when a predefined condition is met. By default, a threshold for Threads_running is used, but you can also employ your own trigger conditions through a custom function!
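As a sketch of what that can look like (the threshold and cycle values below are example choices, not recommendations), the daemon form combines --daemonize with the trigger options:

```shell
# Run pt-stalk in the background, collecting data whenever
# Threads_running stays at or above 50 for 5 consecutive checks.
sudo pt-stalk --daemonize \
  --variable=Threads_running --threshold=50 --cycles=5 \
  --dest=/var/lib/pt-stalk \
  -- --user=root --password=<mysql-root-pass>
```

For custom conditions, the --function option lets you supply your own trigger logic; see the pt-stalk documentation for the plugin interface.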
In the Next Post …
The best complement to our beloved Percona Toolkit trifecta is none other than Percona Monitoring and Management (PMM). It continually collects MySQL status variables from your servers and plots the metrics in Grafana panels that are easy to interpret. Plus, it includes a very rich Query Analytics dashboard that can help you identify your top slow queries. In the final post of this series, we'll have a look at how we can obtain much of the same information we got from the Percona Toolkit tools (and sometimes more) from PMM.
Hi Fernando,
Very detailed and excellent explanation. Kudos.
Regarding ‘Contention at the Table Cache level’, you’ve stated that ‘this impacts the number of file descriptors used by MySQL‘ meaning that the same table opened in multiple instances will have different file descriptors for the same file.
I tried to recreate the above scenario by executing a ‘select * from tablename’ on a table which has 18 crore (180 million) records in one session, and another query, ‘select * from tablename order by id desc limit 1000’, on the same table in another session. I monitored the system calls using strace. From the strace output, we found that session 2 used the same file descriptor as session 1 to retrieve data while the query in session 1 was still running. Please note that the value of table_open_cache_instances was 16.
If a table is open, another session accessing the same table should open the table .ibd file using a new file descriptor. But that is not the case here.
When we say increasing table_open_cache_instances reduces contention, what contention do we exactly mean here? Table level? In the source code, it is mentioned that by increasing instances we can make sure to lock ‘m_lock’ instead of having to lock ‘LOCK_OPEN’. From my understanding, this lock is held for a very short span of time, and I don’t think it causes any overhead.
Please share your insights on this. Also, what do you think should be the ideal value for table_open_cache_instances?