sysbench-lua for benchmark toolingSysbench has long been established as the de facto standard when it comes to benchmarking MySQL performance. Percona relies on it daily, and even Oracle uses it when blogging about new features in MySQL 8. Sysbench comes with several pre-defined benchmarking tests. These tests are written in an easy-to-understand scripting language called Lua. Some of these tests are called: oltp_read_write, oltp_point_select, tpcc, oltp_insert. There are over ten such scripts to emulate various behaviors found in standard OLTP applications.

But what if your application does not fit the pattern of traditional OLTP? How can you continue to utilize the power of load-testing, benchmarking, and results analysis with sysbench? Just write your own Lua script!

For those that want to jump ahead and see the full source, here you go.

Sysbench API

To start off, each Lua script you create must implement three core sysbench-Lua API functions. These are thread_init, thread_done, and event. You can read the comments in the code below for the meaning of each function and what is happening inside.

That’s all pretty simple and should function as a good template in your scripts. Now let’s take a look at the rest of the script.

Sanity checks and options

Now let’s get into the core code. At the top you’ll find the following sections:

The first section is a sanity check to make sure the user actually wants to run this test. Other test scripts, mentioned above, support commands like prepare, run, and cleanup. Our script only supports run as the data we are using is pre-populated by our core application.

The second section allows us, the script writer, to let the user pass some options specific to our test script. In the code above, we can see an option for the number of SELECT statements that will be ran on each thread/iteration (default is 5) and the other option allows the user to disable BEGIN/COMMIT if they so desire (default is false). If you want more customization in your script, simply add more options. You’ll see how to reference these parameters later on.

The queries

Now it is time to define the custom queries we want to execute in our script.

The above code defines several arrays/lists of different queries. Why is this necessary? Later on in the code, we will have to parse each SQL statement and populate/replace the various parameters with randomly generated values. It would not do us any good to repeat the same SELECT * FROM fooTable WHERE id = 44 every time, now would it? Certainly not. We want to generate random numbers and have our queries select from the entire dataset.

Some queries have no parameters, some have integer-based, and some string-based. We will handle these differently below, which is why they are in different arrays above. This method also allows for future expansion. When you want to run additional queries within the script, just add another line to each array; no need to change any other code.

Parse and execute

The function below, execute_selects, will be called from the parent function, event, which we discussed earlier in the post. You can see for-loops for each of the three SELECT categories we created above. The comments inline should help explain what is happening. Note the use of the user-provided option –point-selects in the second loop below, which we created previously in the ‘Sanity and Options’ section.

Two more things to mention for this code. First, you will notice the use of sysbench.rand.pareto to generate a random number between 1 and 3,000,000. For our dataset, we know that each table referenced in all queries relating to WHERE id = ? has that many number of rows, at minimum. This is specific to our data. Your values will certainly be different. Second, notice the use of sysbench.rand.string, and string.rep. The string.rep segment will generate a string comprised of ‘@’ symbols, between 2 and 15 characters long. That string of ‘@’ symbols will then be passed to sysbench.rand.string, which will swap out each ‘@’ for a random alphanumeric value. For example, ‘@@@@@@’ could be changed to ‘Hk9EdC’ which will then replace the ‘%s’ inside the query string (string.format) and be executed.

Handle inserts

Our INSERT statements require values. Again, sysbench calls the function execute_inserts from event on each iteration. Inside execute_inserts, we generate some fake string data using built-in functions described above.

Those strings are then formatted into the SQL and executed.

Example run

And there we have it! Custom queries specific to our application and dataset. Most of the sysbench parameters are self-explanatory, but let me talk about –report-interval=1 which shows statistics every 1 second. Normally sysbench does not output stats until the end of the run, however, the example execution will run forever (–events=0 –time=0) so we need stats to show all the time. You can adjust the parameters to your liking. For instance, if you only want to run a test for 5 minutes, set –events=0 –run-time=300.

Conclusion

Sysbench is a very well designed application that allows you to load-test your MySQL instances using pre-defined and custom queries. Using the Lua scripting language, you can create just about any scenario to fit your needs. The above is just one example that we use within Percona’s Training and Education department. It is by no means an exhaustive example of all of the capabilities of sysbench-Lua.


Photo by Lachlan Donald on Unsplash