Let me start by saying a big “thank you” to the staff at Oracle for deciding to open source reducer.sh. It’s a tool I developed whilst I was working for them several years ago. Its sole purpose is to do one thing – but do it good: test-case simplification.
So, let’s say some customer just sent you 120,000 lines of SQL code and affirms that “it definitely causes a crash.” Or maybe you ran RQG (the Random Query Generator) for awhile (with the general query log turned on) and now you have a nice SQL trace which may just lead to that crash the run resulted in. Or you’re a DBA testing the company’s usual queries with Valgrind, and noticed that 2 in 1000 queries give a Valgrind warning in the mysqld error log – you’re just not sure which one. Or maybe you’re a developer, and during testing you saw that a SELECT query output did not look the way it should – the output was “7” where it should have been “5” – the only problem – you have 1000 lines of INSERT statements and are not sure which one caused it. In all of these cases reducer can help.
Here are some of it’s benefits/features:
- It can reduce large amount of SQL fast. 40K lines to just a few can usually be done in around 1 hour.
- Larger files scale even better – the chunking elimination method automatically adapt to file size.
- It can reduce crashes/asserts, Valgrind testcases, mysqld error log messages, and mysql CLI output testcases
- Also working (but with complex setup atm) is multi-threaded SQL test-case simplification (ALPHA)
- It can reduce sporadic testcases for all of the above (i.e. testcases where the issue does not reproduce every time)
- It can reduce sporadic testcases using multiple threads which significantly improves reduction time
- It is aware of how to establish a testcase is sporadic (and will report the same) – or not – and will change it’s behavior
- It is capable (turned on by default) of reducing actual DML/DDL query code after completing line-based reduction
- It is capable (turned on by default) of reducing testcases by eliminating columns from tables and INSERT queries
- By default reducer.sh uses tmpfs (highly recommended) to ensure testcases are “as reproducible as possible” (disk I/O)
- Additional options for mysqld (necessary to reproduce an issue) can easily be listed/added
- Regex syntax can be used in search strings (where applicable)
So, without further ado, let’s have a look at how to get it to do your simplification job
To get reducer.sh today, use these commands (yum example used, but this can easily be adapted to apt-get):
1 2 3 4 5 | $ sudo yum install bzr $ cd ~ $ bzr branch lp:randgen $ cd randgen/util/reducer/ $ ls * |
(You may also want to checkout ./status.sh in this directory which is a handy tool for seeing what reducer.sh is upto when it is doing it’s first/original attempt to reproduce a given issue.)
And you can get percona-qa (for parse_general_log.pl [and the prepare_reducer.sh code bit if you need it] as shown in the video):
1 2 3 4 | $ cd ~ $ bzr branch lp:percona-qa $ cd perconq-qa $ ls * |
Enjoy!
I upgraded reducer.sh. It now produces fully self-contained test cases with a near-100% match with the original reducer environment.
An example: after (and during) reduction of a testcase, reducer now automatically stores these files in the same directory as the original (and now reduced) testcase;
1422244738_cl # client script, if needed to access the server after starting it with _start
1422244738_init # init script, sets up data dir etc. in /dev/shm
1422244738_mybase # contains the path to the mysql base dir to be used (the only file that needs editing)
1422244738_run # run the testcase (.sql file) using the mysql CLI
1422244738_run_pquery # run the testcase using the pquery binary, included with the bundle (only if pquery was used)
1422244738.sql # the final reduced sql file
1422244738_start # start mysqld in exactly the same way as reducer was able to reduce the testcase (same mysqld options)
1422244738_stop # stop mysqld server (if necessary)
1422244738_pquery[-ms] # an included copy of the pquery used for the run (only if pquery was used)
1422244738_bug_bundle.tar.gz # A bundle of all files above
So, for example, to start this particular testcase for replay, one would execute;
$ vi 1422244738_mybase # update base directory
$ ./1422244738_init
$ ./1422244738_start
$ ./1422244738_run
This allows logging bug reports with universal _init/_start/_run scripts, which make reproducing bugs much more stable/easy.
The only change required is to alter the directory where mysqld is installed. You can do this by editing the {epochnr}_mybase file and simply copying/pasting in the mysqld base directory.
Then, you can do {epochnr}_init > _start > _run and the bug should be reproduced, with the data dir/error log/core being stored in /dev/shm/{same epochnr}/ etc.
Use _stop to stop the server and _cl to access the client. Note which reproducer (CLI; _run or pquery: _run_pquery) to use, or simply try both (5 seconds work).
We also added some things like _gdb (enters gdb prompt after using _run with the core produced by such run), _parse_core (creates nice stack trace files), and we keep improving the other scripts.
Reducer.sh is now covered in much more depth across 3 episodes in the new MySQL QA Series! See https://www.percona.com/blog/2015/07/21/mysql-qa-episode-7-single-threaded-reducer-sh-reducing-testcases-for-beginners and https://www.percona.com/blog/2015/03/17/free-mysql-qa-and-bash-linux-training-series/
All, please note we have moved percona-qa to GitHub:
https://github.com/Percona-QA/percona-qa
To clone it, use:
$ sudo yum install git
$ cd ~
$ git clone https://github.com/Percona-QA/percona-qa.git
reducer.sh was also put directly into this repository (and it is maintained there), so *no* need anymore to separately fetch lp:randgen.