Continuous integration of new features and bug fixes is great – but what if a small change to seemingly insignificant code causes a major regression in overall server performance?

We need to ensure this does not happen.

That said, performance regressions can be hard to detect. They may hide for some time (or be masked), and, once discovered, it may be hard to locate the offending commit. Asking developers to execute a local Sysbench run against each bit of modified code is not really a solution either: even small changes in hardware, OS or server settings will often directly affect the test outcome. There is usually no reliable “yesterday’s run” to compare against.

So, what we need is a stable baseline to compare against. How do we achieve this? The answer is, first and foremost, dedicated hardware, combined with a fully automated system using fixed settings. In other words, a stable system (in both hardware and automation) allows for a stable, and therefore reliable, baseline.

Thanks to internal investments within Percona, we now have exactly such automated performance regression monitoring in place:

Live Percona Server Performance Regression Checks Using Sysbench

While systems are being optimized, the above is a live sneak preview of the automatic Percona Server performance regression setup on Percona’s public Jenkins instance. If you want even more previews, look here and here.

As you may have noticed, we now have nightly binaries available for Percona Server. As soon as the nightly binaries are built by Jenkins, a downstream Jenkins project starts verifying the performance of the current build.

Technical details: two Sysbench runs (OLTP read-only & OLTP read/write) are executed for 15 minutes each. We then extract the total number of queries executed during each run and plot them (using the Jenkins Plot Plugin) into a line chart. There is also a stacked area chart available (scroll down here) which combines both numbers to give a better “overall” idea.
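As a rough illustration, the benchmark step could look something like the shell sketch below. This is not the actual Percona Jenkins job: the sysbench flags, thread count, MySQL credentials and file names are assumptions; only the two OLTP modes, the 15-minute duration and the extraction of the total query count (which the Plot Plugin can then read, for example from a properties file) come from the description above.

```bash
#!/bin/bash
# Hypothetical sketch of the nightly benchmark step (not the actual Percona
# Jenkins job). Flags follow classic sysbench 0.4-style OLTP options; the
# thread count, credentials and file names are assumptions. The test table
# is assumed to have been prepared already ("sysbench --test=oltp ... prepare").

RUNTIME=900     # 15 minutes per run, as described above
THREADS=16      # assumed concurrency

for MODE in ro rw; do
  [ "$MODE" = "ro" ] && READONLY=on || READONLY=off

  sysbench --test=oltp \
           --oltp-read-only=${READONLY} \
           --max-time=${RUNTIME} \
           --max-requests=0 \
           --num-threads=${THREADS} \
           --mysql-user=root \
           run > "oltp_${MODE}.log"

  # Extract the "total:" line from the "queries performed:" summary block.
  TOTAL=$(awk '/queries performed:/ {f=1}
               f && /total:/ {print $2; exit}' "oltp_${MODE}.log")

  # Write a small properties file that the Jenkins Plot Plugin can pick up
  # to draw the line chart of total queries per nightly build.
  echo "YVALUE=${TOTAL}" > "plot_${MODE}.properties"
done
```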

Scripting then compares the results from the current run against the previous run, and warns Percona’s QA department when there is a significant increase or decrease in performance. Whenever we see a significant decrease, we check exactly which commit caused the issue and eliminate the problem.
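The post does not show the actual comparison script, but the idea can be sketched as follows. The 5% threshold, the argument order and the output messages are assumptions made purely for illustration.

```bash
#!/bin/bash
# Hypothetical comparison step (not Percona's actual script): compare the
# total query count of the current run against the previous run and warn
# when the difference exceeds an assumed significance threshold.

CURRENT=$1          # total queries from the current nightly run
PREVIOUS=$2         # total queries from the previous run
THRESHOLD_PCT=5     # assumed threshold for a "significant" change

# Percentage change, computed with awk to avoid integer-only shell arithmetic.
DELTA=$(awk -v c="$CURRENT" -v p="$PREVIOUS" \
            'BEGIN { printf "%.2f", (c - p) / p * 100 }')

if awk -v d="$DELTA" -v t="$THRESHOLD_PCT" 'BEGIN { exit !(d >= t || d <= -t) }'
then
    # A decrease points at a possible regression; an increase may point at an
    # unexpected gain that is worth investigating too.
    echo "WARNING: throughput changed by ${DELTA}% vs the previous run" >&2
    exit 1
fi

echo "Throughput within ${THRESHOLD_PCT}% of the previous run (${DELTA}%)"
```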

If you’ve been reading attentively, you will have noticed that we don’t only check for performance decreases, but also for increases. Why?

Sometimes a particular code change (in, for instance, complex optimizer or XtraDB code) may inadvertently lead to interesting performance gains. Analyzing such gains may reveal further optimization opportunities.

Further planned improvements: we are looking at expanding the current setup so that performance checks run for each commit instead of once a day. Comparing against a GA or RC baseline is another option. We are also planning to check Percona XtraDB Cluster for performance regressions in a 3-node VM setup.

Stay Tuned!