Inspecting MySQL Servers

When I joined the Percona Support team in early 2013, MySQL 5.6 was being launched. I remember all the buzz and euphoria surrounding its release: it was a big step up from the 5.5 series, carrying important changes to the MySQL architecture and internals that brought highly anticipated features such as the (still incipient) multi-threaded replication and online DDLs. What was not necessarily evident to the general public at the time was its focus on highly concurrent environments: MySQL 5.6 was targeting the future. In practice, that meant that for many workloads still common at the time, 5.6 could perform worse than 5.5: support tickets started pouring in from unhappy customers who had done the upgrade and were looking for assistance on this front.

It was a steep learning curve for me. Luckily – or as expected – I was surrounded by experts: Percona was already at the forefront of MySQL development and innovation, and my colleagues from the Percona Support team are some of the best in the industry. I was taught how to approach the analysis and troubleshooting of a MySQL server using a tried-and-tested method supported by tools crafted and honed by these same experts to make this task more effective. Fast forward eight years and we continue to operate with an improved version of this same method, which I’m going to present to you in this series of blog posts.

Where Do I Start?

One could argue that “it depends”, but that is not the case when what you are aiming to do is analyze the performance of a MySQL server – or any server, for that matter. You start by getting to know the server on which the service is running and how it behaves under load.

We can, then, break this approach into two distinct parts:

  1. Getting to “know” the server
  2. Understanding how the server behaves under load

The secret to getting off to a good start is obtaining the right data. The first part can be considered more of a “static” analysis: we are looking for information about the underlying server’s (a) hardware specifications, (b) general OS settings, as well as (c) MySQL-specific configuration. It doesn’t matter much when this data is gathered, as it should remain unchanged in the short term.

For the second part, however, the timing matters: data must be gathered either at a precise moment (such as while experiencing a problem) or covering a target time frame. In the case of a general MySQL server assessment, the time window must include peak time, when the load on the server is at its highest. Usually, we want MySQL tuned in such a way that it is able to handle traffic during the busiest period of the day/week. If it can do that, then it should be able to handle the more ordinary workload without breaking a sweat. On the other hand, when we are troubleshooting a specific scenario or circumstance, we need to capture the data at the moment the issue is manifesting itself. We will discuss this further later in this article.

Knowing the Server

The scenario is a common one; I believe all of you have been there before. A colleague or maybe a customer approaches you looking for some help with a server that is not performing well, one that you have never seen before. It might be a common web server running Apache and basic PHP, a more complex application server running Ruby on Rails or a full Java stack, or perhaps a dedicated database server running MySQL, PostgreSQL, or MongoDB.

To start with, what kind of box is the service running on? How much memory is available? How many cores? What about storage: is it fast, is there enough disk space? Is it running on CentOS, Ubuntu, or something else? Is it a VM?! Which kernel version? Is it swapping… is there any swap space configured at all?

You can find the answers to those questions by connecting to the server and looking around, running various specific commands to build the initial picture of the server, akin to building a puzzle from pieces you collect here and there. Do this a few times, though, and you will get bored and probably forget to collect a piece of information (or two). This is the kind of procedure that can be scripted – and indeed it has already been scripted by clever experts in this field, in the form of different tools. The one we use and maintain is pt-summary, which is part of the Percona Toolkit. If you have one of the Percona repositories configured, you can install the toolkit with a simple:
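(A sketch for the two most common package managers; pick the one matching your distribution.)

  sudo yum install percona-toolkit    # RHEL/CentOS-based systems
  sudo apt install percona-toolkit    # Debian/Ubuntu-based systems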

I will show you how to use it in a moment.

What MySQL?

What about the MySQL part? When it comes to inspecting a MySQL server, there is some key information that is helpful to obtain upfront: which distribution and version it runs, whether it is a replica or maybe a PXC/Galera node, the size of the Buffer Pool, the redo log space, the table cache, temporary table settings, …
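A few of these can be checked by hand with quick one-off queries. A minimal sketch (variable and statement names may differ slightly across MySQL versions; for instance, SHOW REPLICA STATUS replaced SHOW SLAVE STATUS in more recent releases):

  # Quick manual checks (purely illustrative; pt-mysql-summary automates all of this)
  mysql -e "SELECT @@version, @@version_comment;"                    # distribution and version
  mysql -e "SHOW REPLICA STATUS\G"                                   # is it a replica?
  mysql -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';"           # PXC/Galera node?
  mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';"   # Buffer Pool
  mysql -e "SHOW GLOBAL VARIABLES LIKE 'innodb_log_file%';"          # redo log space
  mysql -e "SHOW GLOBAL VARIABLES LIKE 'table_open_cache';"          # table cache
  mysql -e "SHOW GLOBAL VARIABLES LIKE 'tmp_table_size';"            # temporary tables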

There is another tool in the Percona Toolkit built specifically for this task, pt-mysql-summary (and if you are wondering: yes, there are also a pt-pg-summary and a pt-mongodb-summary, but those are subjects for another blog series).

First Impressions

With the information provided by pt-summary and pt-mysql-summary, we can now get a clearer picture of what that server is and start to get a sense of how MySQL is configured in view of the host server – of how the database settings fit the available physical resources. It then becomes easier, for example, to contrast the Buffer Pool size with the available memory in the server and ponder how much memory the mysqld process may be using, or to spot a clearly misconfigured or unbalanced setting. Spend a few years doing this and you may start associating a given server version and a couple of specific setting values with a “popular” bug whose ID you remember by heart… let’s just hope you haven’t gotten that far already.
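As a trivial illustration of the kind of cross-check this enables (a sketch only; both pt-summaries already report these figures):

  # Compare total server memory with the configured InnoDB Buffer Pool size
  free -b | awk '/^Mem:/ {print "Total RAM (bytes):          " $2}'
  mysql -NB -e "SELECT CONCAT('InnoDB Buffer Pool (bytes): ', @@innodb_buffer_pool_size);"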

The combination of the information obtained from the “pt-summaries” provides what I call a “static” view of the server: it helps us form our first impressions, but it is akin to looking at a still picture of an engine. How does it perform in motion?

An Engine in Motion

The performance of a server depends on the physical resources available as well as on the workload in play; for a database server, in a simplified view, the workload is a combination of queries and concurrency. Some workloads are CPU-bound while others are more IO-intensive. Some workloads don’t change much over time, while others follow a distinct pattern over the day/week/month/season. It’s very rare to come across a workload that is completely unpredictable, and when that happens it is usually a temporary scenario.

This discussion of workloads is central here for a reason: we tune a server for the workload it runs, not merely for the specs it has. But how do we do this? We start by observing the server in operation.

This observation can be divided into two large categories:

  • at the operating system level, we can observe how the physical resources (CPU, memory, storage, network) are being used with tools such as mpstat, iostat, vmstat, and netstat;
  • at the database level, we can observe how MySQL is performing through a collection of queries such as the popular SHOW ENGINE INNODB STATUS and SHOW PROCESSLIST but also others that will cover, for instance, status variables and transaction and mutex-specific information.
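
To give a sense of what gathering this by hand would look like, here is a minimal sketch (the intervals, sample counts, and file names are arbitrary illustrations):

  # OS-level samples: one report every 10 seconds, 6 samples, collected in the background
  mpstat -P ALL 10 6 > mpstat.out &
  iostat -dmx 10 6   > iostat.out &
  vmstat 10 6        > vmstat.out &
  # MySQL-level snapshots
  mysql -e "SHOW ENGINE INNODB STATUS\G" > innodb-status.out
  mysql -e "SHOW FULL PROCESSLIST\G"     > processlist.out
  mysql -e "SHOW GLOBAL STATUS"          > global-status.out
  wait   # wait for the background samplers to finish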

Gathering all this data manually would be impractical and very difficult to coordinate; luckily, there is pt-stalk, another tool from the Percona Toolkit, which automates this whole process.

Percona Support Favorite Triad

There you have it: the combination of data collected by pt-summary, pt-mysql-summary, and pt-stalk provides good insight into a MySQL server. In the following articles, I will walk you through how we interpret the data provided by each of these tools as well as how to correlate them to understand how MySQL is performing. Once you understand how this works, the last post in this series will show you how you can obtain much of the same information (and sometimes more) from PMM. For now, I’ll leave you with a simple procedure to run them, which should be a handy reference for you to have:
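(The commands below are a sketch of such a procedure; the destination path, the root-password placeholder, and the pt-stalk sampling options are illustrative and should be adjusted to your environment.)

  # Collect everything under a directory named after the host
  PTDEST=/tmp/$(hostname)
  mkdir -p $PTDEST

  # Static view of the host and of MySQL
  sudo pt-summary > $PTDEST/pt-summary.out
  sudo pt-mysql-summary -- --user=root --password=<mysql-root-password> > $PTDEST/pt-mysql-summary.out

  # Dynamic view: collect two rounds of OS and MySQL samples, 30 seconds apart
  sudo pt-stalk --no-stalk --iterations=2 --sleep=30 --dest=$PTDEST -- --user=root --password=<mysql-root-password>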

NOTES:

1) By default, pt-stalk will save the captured data under /var/lib/pt-stalk. We prefer to collect everything in a directory named after the server’s hostname, so we use the variable PTDEST to define one in the procedure above; adjust it to your liking.

2) If your system is configured in such a way that you don’t need to provide credentials to access MySQL, then you can remove the credentials section from the pt-mysql-summary and pt-stalk commands above.

3) You can compress the target directory PTDEST and keep a copy of the data archived somewhere, share it with a colleague, or even with your favorite Support team 😉
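For example (a sketch, assuming PTDEST was set as in the procedure above):

  tar czvf "$(basename $PTDEST).tar.gz" -C "$(dirname $PTDEST)" "$(basename $PTDEST)"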

Capturing Data on a Server Without the Percona Toolkit

We still need the triad of tools for this to work; the difference here is that, if the Percona Toolkit is not installed on the target database server, we can still download the tools separately and execute a modified version of the above procedure, as follows:
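(Again a sketch: the percona.com/get/<tool> download shortcuts and the root-password placeholder are illustrative; adjust them to your environment.)

  # Download the individual tools and make them executable
  wget percona.com/get/pt-summary percona.com/get/pt-mysql-summary percona.com/get/pt-stalk
  chmod +x pt-summary pt-mysql-summary pt-stalk

  # Then run them from the current directory, as in the procedure above
  PTDEST=/tmp/$(hostname)
  mkdir -p $PTDEST
  sudo ./pt-summary > $PTDEST/pt-summary.out
  sudo ./pt-mysql-summary -- --user=root --password=<mysql-root-password> > $PTDEST/pt-mysql-summary.out
  sudo ./pt-stalk --no-stalk --iterations=2 --sleep=30 --dest=$PTDEST -- --user=root --password=<mysql-root-password>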

In the Next Post …

What is the most relevant information about your server you can find in a pt-summary report, and how does it relate to database performance? Stay tuned!
