Comments on: Capture database traffic using the Performance Schema
https://www.percona.com/blog/capture-database-traffic-using-performance-schema/

By: ku https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10968119 Tue, 30 May 2017 04:56:51 +0000
Hello, I am using Workbench 6.3.9 and I can't see prepared statements under Performance Reports -> High Cost SQL Statements -> Statements Analysis. Is this a missing feature, or is it on purpose?
Jakub

By: williamxlr https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965486 Wed, 28 Oct 2015 14:13:06 +0000
Daniel, thank you very much for this resource.

By: Baron Schwartz https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965442 Tue, 20 Oct 2015 00:52:59 +0000
Percona benchmarked VividCortex's overhead versus the Performance Schema a few weeks ago. Using libpcap was not a "lot" more overhead (unless perhaps you do it blindly instead of pushing a packet filter into the kernel to capture only the packets needed, which VividCortex does). Crucially, however, they benchmarked with the Performance Schema _idle_, and that is not realistic – in reality you are going to be querying it frequently, as shown in this blog post, and that will add overhead. In general, VividCortex is not higher overhead than P_S; there is just no free lunch.

Your example of finding queries that create large numbers of in-memory temporary tables is good, but we can do the same thing with VividCortex. In fact, we are not limited to the things that are instrumented in the Performance Schema; we can do it for arbitrary metrics, which need not even come from within MySQL! See this blog post that explains the principle at work (although obliquely): https://www.vividcortex.com/blog/2015/10/15/end-to-end-traces/

By: SuperQ https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965433 Wed, 14 Oct 2015 17:11:25 +0000
The downside of VividCortex is that it doesn't know anything about what's going on inside MySQL.

The Performance Schema gives you a lot more information. For example, with performance_schema.table_io_waits_summary_by_table and events_statements_summary_by_digest I was easily able to discover a performance problem where a query pattern was causing large numbers of in-memory temporary tables to be created.
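
Roughly speaking, a query like this against the digest table surfaces that kind of problem (a sketch only; the column names are from the MySQL manual and the LIMIT is arbitrary):

SELECT digest_text,
       count_star,
       sum_created_tmp_tables,
       sum_created_tmp_disk_tables
  FROM performance_schema.events_statements_summary_by_digest
 ORDER BY sum_created_tmp_tables DESC
 LIMIT 10;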

The Performance Schema also has a lot less system overhead, since you don't need to try to pcap everything the server is doing.

By: Baron Schwartz https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965423 Sun, 11 Oct 2015 11:27:24 +0000
An easier alternative for capturing queries off the wire (without a man-in-the-middle such as a proxy) is VividCortex's traffic analyzer: https://www.vividcortex.com/resources/network-analyzer-for-mysql/

By: Ben Kochie https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965404 Sun, 04 Oct 2015 14:48:24 +0000
Third time's a charm; my posts seem to end up in /dev/null.

There is a much better way to see what’s going on inside MySQL with the performance schema.

Prometheus[0] mysqld_exporter[1] can collect metrics from events_statements_summary_by_digest and let you analyze the timeseries data.

We are monitoring about 150 Percona MySQL servers set up into about 25 different service clusters. One Prometheus server is able to monitor over 700k timeseries and allows you to query, graph, and alert on this data in real time.

Here's an example of what we used to graph in Ganglia, and what we can now get from Prometheus and the Performance Schema.

https://twitter.com/matthiasr/status/647369742714576896

[0]: http://prometheus.io/
[1]: https://github.com/prometheus/mysqld_exporter

By: SuperQ https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965401 Sat, 03 Oct 2015 22:24:51 +0000
Hrm, somehow my post got eaten after I submitted it.

There is a much better way to understand what's going on inside your server: collect events_statements_summary_by_digest and store the results in a timeseries database.

I recently finished adding this functionality to the Prometheus[0] mysqld_exporter[1].

Using this configuration I am collecting stats for every unique query digest across our fleet of 150+ Percona servers, covering 25 different clusters with 50 different application services.

Here's a great example of how we were able to upgrade from Ganglia MySQL stats to Prometheus metrics:

https://twitter.com/matthiasr/status/647369742714576896

The first graph comes from basic query-count data in SHOW GLOBAL STATUS; the second one shows detailed per-query stats. We can generate more detail on the number of queries, the query latency, the rows examined per query, the rows sent per query, and so on.

[0]: http://prometheus.io/
[1]: https://github.com/prometheus/mysqld_exporter

By: SuperQ https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965400 Sat, 03 Oct 2015 16:36:40 +0000
You can also capture traffic using events_statements_summary_by_digest, but you will need a little help.

This is where Prometheus[0] and its mysqld_exporter[1] come in.

With Prometheus, we can collect metrics over time and store them outside of MySQL for easy analysis. The events statements collector stores separate timeseries for the number of queries, the time used by the queries, the rows examined, the rows sent, and so on.

See this tweet for an example of the detail you get. https://twitter.com/matthiasr/status/647369742714576896

You can quickly answer questions like "which queries are the slowest?" and "which queries examine the most rows?"
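
You can of course also ask those questions of the underlying table directly; a minimal sketch (columns per the MySQL manual, timer values are in picoseconds, the LIMIT is arbitrary):

SELECT digest_text,
       count_star,
       sum_timer_wait / 1e12 AS total_latency_seconds,
       sum_rows_examined
  FROM performance_schema.events_statements_summary_by_digest
 ORDER BY sum_timer_wait DESC
 LIMIT 10;

The point of pushing the same data into Prometheus is that you keep the history and can watch these values change over time.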

There are a bunch of advantages to using timeseries data for monitoring:
* We can look back through history.
* We just collect data; we don't need to know all the questions ahead of time.
* Prometheus doesn't down-sample, so you have full-resolution metrics for as long as you keep history.
* You can write alerting rules against the timeseries.

Here’s a simple example of an alerting rule we use:
ALERT ThreadCacheMissRateTooHigh
IF rate(mysql_global_status_threads_created[5m]) / (rate(mysql_global_status_connections[5m]) > 0) > 0.5 FOR 60m

We used to try to do this with a Nagios plugin, but the plugin only looked at the current value of the counters. By examining the timeseries rate over a few minutes, we get a much more accurate view of the hit rate.

I am currently monitoring about 150 MySQL servers grouped into about 20 different clusters of masters/slaves/xtradb-clusters. This generates around 700k distinct metric timeseries at 15s resolution, which is no problem for a single server running Prometheus. We use this to drive all of our status dashboards.

[0]: http://prometheus.io/
[1]: https://github.com/prometheus/mysqld_exporter

By: Daniel Guzmán Burgos https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965397 Fri, 02 Oct 2015 22:23:35 +0000
Hi Peter,

Thanks for sharing! I'm always happy to see different alternatives for solving a common problem. I guess this solution falls in the same category as tcpdump packet capture, since mysql-proxy also hooks into the TCP stream and forwards packets using the MySQL network protocol. Great project, and very well documented as far as I can see on the GitHub repo. Now I wonder: how does mysql-proxy behave under high concurrency? Does latency increase at an acceptable ratio as threads_running increases? How do you monitor the proxy itself? Can you get exactly the same info as from P_S?

By: Daniel Guzmán Burgos https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965396 Fri, 02 Oct 2015 21:33:10 +0000
Mark,

My intention when choosing pt-query-digest was to show how close to reality (and by reality I mean "the traffic captured by the slow log file") the traffic collected through the Performance Schema is. This post only shows an alternative that could be useful in scenarios where you don't have access to the server, only a user with grants to read P_S, to name one scenario.

And indeed! For summary purposes, events_statements_summary_by_digest is perfect, and (as long as there are enough rows in the events_statements_history_long table) you can probably get more than just the digest with the placeholders. But you also might not, which will make the query analysis harder, as pointed out some time ago in https://www.percona.com/blog/2014/02/11/performance_schema-vs-slow-query-log/ However, still very useful!
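
As a rough illustration of what I mean (table and column names per the MySQL manual; the digest value is just a placeholder), once you have an interesting digest you can try to fish a literal statement out of the history table:

SELECT sql_text
  FROM performance_schema.events_statements_history_long
 WHERE digest = '<digest_of_interest>'
 LIMIT 5;

If the statements have already aged out of events_statements_history_long, all you get back is the normalized digest text, which is exactly the case that makes the analysis harder.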

P.S: Thanks for Sys Schema!

By: Daniel Guzmán Burgos https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965395 Fri, 02 Oct 2015 21:31:03 +0000
Hi Fadi El-Eter,

Absolutely! The slow log is one of the best options for capturing traffic, but as described in the blog post, under certain circumstances it can hurt overall performance. Think about a slave whose buffer pool is kept warm by replaying the read traffic from the master, something you can do with Percona Playback https://www.percona.com/blog/2012/10/25/replaying-database-load-with-percona-playback/. Instead of running the slow_query_log with long_query_time=0 all the time (a potential bottleneck under high concurrency, with a bunch of transactions in the "cleaning up" state), you can use this alternative. It's a scenario where you don't need 100% exactly the same traffic.

Another, less complicated example is tracking write traffic to a single table. Instead of using the slow log or the binlog files with mysqlbinlog plus some filtering, you can get that data directly from this table.
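
For instance, something along these lines pulls the write statements against one table out of the statement history (a sketch only; the table name and the LIKE patterns are purely illustrative):

SELECT sql_text, timer_wait
  FROM performance_schema.events_statements_history_long
 WHERE sql_text LIKE 'INSERT INTO mytable%'
    OR sql_text LIKE 'UPDATE mytable%'
    OR sql_text LIKE 'DELETE FROM mytable%';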

Hope it helps!

By: Peter Colclough https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965392 Fri, 02 Oct 2015 15:12:29 +0000
Hi Daniel,

Interesting post, and always informative. I thought I would share this with you, as I got around a lot of the issues you mention by using mysql-proxy as part of a solution I have detailed here (this is all open source… not personal promotion, just trying to help out): http://woodygsd.blogspot.co.uk/2014/07/how-do-you-log-problem-like-mysql.html

I realise mysql-proxy is not officially 'released'… but it works, even if you run one proxy per server just to enable this logging. Essentially I wrote some custom Lua code that attaches to the proxy. The query data gets written out to a file, then Logstash pushes it to Elasticsearch, allowing Kibana to graph it. Sounds like a huge stack… it isn't.

Currently the whole thing is in production at Avaaz (www.avaaz.org), tracking all queries/connections through 9 servers, amounting to around 120m data items per day.

Using this I have been able to save 60m queries per day, move 40m connections off a master onto the slaves, and find out why the DB was dragging the site down under high load, and stop it from happening.
Essentially this system logs every single query without stalling MySQL one bit (ok… maybe a millisecond or two, but nothing more, and it is worth the extra just to get the information out). We are also able to get actual slow queries, and queries by the hour/day/month… all beautifully aggregated.
I'm just in the process of adding web-server logs, so when things go awry we can actually see which web calls are causing it… even though I say so myself, this is way cool.

I invite you to take a look… it's using pure open-source code, so it's free to all… just want to help the 'struggling' 🙂

Peter Colclough

By: Mark Leith https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965390 Fri, 02 Oct 2015 08:18:48 +0000
This seems like a really convoluted and more lossy method for:

SELECT .. FROM performance_schema.events_statements_summary_by_digest ORDER BY sum_timer_wait desc;

I can't see why you would want to do the above if you are not ultimately interested in the raw literal values on a per-statement basis and only want aggregate data.

By: Fadi El-Eter (itoctopus) https://www.percona.com/blog/capture-database-traffic-using-performance-schema/#comment-10965389 Fri, 02 Oct 2015 06:41:03 +0000
Hi Daniel,

I wish you had provided an example other than retrieving something similar to the slow query log. This query will obviously add some overhead and may not run if the server is on its way to crashing. I think that for getting slow queries, the slow query log is by far the best option. If you need the slow queries in your application, you can just read that file.

Also, phpMyAdmin offers excellent performance statistics (along with practical recommendations) – I'm not sure why anyone would need to do all this work themselves.

Please don't take this the wrong way; I think these are excellent tips, but I'm just wondering whether it's worth using them in the real world!
