Inspecting MySQL Servers PMMIn the previous posts of this series, I presented how the Percona Support team approaches the analysis and troubleshooting of a MySQL server using a tried-and-tested method supported by specific tools found in the Percona Toolkit:

Inspecting MySQL Servers Part 1: The Percona Support Way

Inspecting MySQL Servers Part 2: Knowing the Server

Inspecting MySQL Servers Part 3: What MySQL?

Inspecting MySQL Servers Part 4: An Engine in Motion

A drawback from such an approach is that data collection is done in a “reactive” way and (part of) it needs to be processed before we can interpret it. Enters Percona Monitoring and Management (PMM): PMM continually collects MySQL status variables and plots the metrics in easy-to-interpret Grafana graphs and panels. Plus, it includes a rich Query Analytics dashboard that helps identify the top slow queries and show how they are executing. It makes for an excellent complement to the approach we presented. In fact, many times it takes the central role: we analyze the data available on PMM and, if necessary, look at complementing it with pt-stalk samples. In this post, I will show you how we can obtain much of the same information we got from the Percona Toolkit tools (and sometimes more) from PMM.

* As was the case in the previous posts in this series, data, and graphs used to illustrate this post does not come from a single server and have been captured using different versions of PMM.

Know the Server

Once you are connected to PMM, you can select the target server under the Node Name field in the menu located on the top-left side of the interface, then select PMM dashboards on the left menu, System (Node), and, finally, Node Summary, as shown in the screenshot below:

PMM Dashboard

The header section of the Node Summary page shows the basic hardware specs of the server as well as a few metrics and projections. You will find on the right side of this section the full output of pt-summary, which we have scrutinized extensively in the second post of this series, there, waiting for you:

MySQL Node Summary
Below the header section, there are four panels dedicated to CPU, Memory, Disk, and Network, each containing graphics with specific metrics on each of these areas. It makes it easy, for example, to look at overall CPU utilization:

CPU utilization

Recent spikes in I/O activity:

And memory usage:

Note the graphs cover the last 12 hours of activity by default but you can select a different time range in the top-right menu:

What MySQL?

Taking a slightly different route by selecting MySQL instead of System (Node) and then MySQL Summary, we get to access a dashboard that displays MySQL-specific metrics for the selected instance:

MySQL Summary

Under the Service Summary panel, you will find the full output of pt-mysql-summary, which we reviewed in detail in the third post of this series:

The main goal of the pt-mysql-summary is to provide a sneak-peek into how MySQL is configured, at a single point in time.  With PMM you get instant access to most of the MySQL trends and status variables we only get a glance from in the report. We can go and look straight under the hood to look at the engine characteristics while it is under load, over the last 5 minutes to the last 30 days or more!

An Engine in Motion

There is so much we can look at at this point. If we go and more or less follow the sequence observed in the previous posts we can start by checking if the table cache is big enough. The example below shows it to be just right, if we base ourselves in the limited time frame this particular sample covers, with an average hit ratio close to 100%:

MySQL Table Open Cache Status

Or we can look for a disruption in the pattern, such as a peak in threads connected:

And then investigate the effects it caused on the server (or was it already a consequence of something else that occurred?), for example, a change in the rate of temporary tables created at that time for both in-memory and on-disk tables:

The MySQL Instance Summary is just one of many dashboards available for MySQL:

Under the MySQL InnoDB Details dashboard we find many InnoDB-specific metrics plotted as a multitude of different graphs, providing a visual insight into things such as the number of requests that can be satisfied from data that is already loaded in the Buffer Pool versus those that must be first read from disk (does my hot data fit in memory?):

InnoDB Buffer Pool Requests

Besides MySQL status variables metrics, there is also data filtered directly from SHOW ENGINE INNODB STATUS. For instance, we can find long-running transactions based on increasing values of InnoDB’s history length list:

Another perk of PMM is the ability to easily evaluate whether redo log space is big enough based on the rate of writes versus the size of the log files:

And thus observe checkpoint age, a concept that is explained in detail for PMM in How to Choose the MySQL innodb_log_file_size:

Another evaluation made easy with PMM is whether a server’s workload is benefitting from having InnoDB’s Adaptive Hash Index (AHI) enabled. The example below shows an AHI hit-ratio close to 100% up to a certain point, from which the number of searches increased and the situation inverted:

The evaluation of settings like the size of the redo log space and the efficiency of AHI should be done at a macro level, spanning days: we should be looking for what is the best general configuration for these. However, when we are investigating a particular event, it is important to zoom in on the time frame where it occurred to better analyze the data captured at the time. Once you do this, change the data resolution from the default of auto to 1s or 5s interval/granularity so you can better see spikes and overall variation: 

QAN: Query Analytics

Query analysis is something I only hinted at but didn’t explore in the first articles in this series. The “manual” way requires processing the slow query log with a tool such as pt-query-digest and then going for details about a particular query by connecting to the server to obtain the execution plan and schema details. A really strong feature of PMM is the Query Analytics dashboard, which provides a general overview of query execution and captures all information about it for you. 

The example below comes from a simple sysbench read-write workload on my test server:

PMM Query AnalyticsWe can select an individual query on the list and check the details of its execution:

The query’s  EXPLAIN plan is also available, both in classic and JSON formats:

You can read more about QAN on our website as well as in other posts on our blog platform, such as How to Find Query Slowdowns Using Percona Monitoring and Management.

What PMM Does Not Include

There remains information/data we cannot obtain from PMM, such as the full output of SHOW ENGINE INNODB STATUS. For situations when obtaining this information is important, we resort back to pt-stalk. It is not one or the other, we see them as complementary tools in our job of inspecting MySQL servers. 

If you are curious about PMM and would like to see how it works in practice, check our demo website at https://pmmdemo.percona.com/. To get up and running with PMM quickly, refer to our quickstart guide.

Tuning the Engine for the Race Track

There you have it! It certainly isn’t all there is but we’ve got a lot packed in this series, enough to get you moving in the right direction when it comes to inspecting and troubleshooting MySQL servers. I hope you have enjoyed the journey and learned a few new tricks along the way 🙂

Subscribe
Notify of
guest

0 Comments
Inline Feedbacks
View all comments