In this short blogpost, we’ll show you how to use the mysql_random_data_load tool to insert random data into tables. This is a great aide in testing when you have empty tables and need them to be populated with data. We’ve all done it manually (the INSERT INTO … VALUES … way), but that is surely a waste of time. It can add up to a lot if you need to test tables with many columns, or even worse, with foreign keys.
Before saying anything else, we would like to mention that this tool is still under development, and that it will insert random data into tables. Be aware if running it in a production environment!
mysql_random_data_load is a tool created by PerconLabs.
Note: PerconaLabs and Percona-QA are open source GitHub repositories for unofficial scripts and tools created by Percona staff. While not covered by Percona support or services agreements, these handy utilities can help you save time and effort.
Now, back to the disclaimer-free zone, the tool in question is named mysql_random_data_load, and can be currently found in the Percona Lab GitHub repository:
https://github.com/Percona-Lab/mysql_random_data_load
The README.md file has a lot of information on it already, so feel free to pause this reading and come back after you’ve skimmed through it. If you are eager to start using it, though, we give you the crash course below.
We offer two binaries in the releases tab, one for Darwin and another one for Linux. We will show examples of the linux_amd64 build here, but they are the same for the darwin_amd64 one.
First, get the latest build (0.1.6 as of this writing):
1 2 3 4 | shell> cd ~/bin/ shell> wget https://github.com/Percona-Lab/mysql_random_data_load/releases/download/0.1.6/mysql_random_data_loader_linux_amd64.tar.gz shell> tar xzf mysql_random_data_loader_linux_amd64.tar.gz shell> chmod +x mysql_random_data_loader |
We will use the sakila database schema for the tests, so the second and final step is to download and install it.
1 2 3 | shell> wget http://downloads.mysql.com/docs/sakila-db.tar.gz shell> tar xzf sakila-db.tar.gz && rm -f sakila-db.tar.gz shell> mysql < sakila-db/sakila-schema.sql |
Ok, we are now ready to test inserting some rows with random data! Let’s try populating the actor table first. Since its primary key is a SMALLINT, we have a maximum of 65535 rows.
1 2 3 4 | shell> mysql_random_data_load sakila actor 65535 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password 2017/12/22 00:17:32 Starting 4s [===============================================] 100% Total rows inserted: 65535 |
Let’s see what the data looks like:
1 2 3 4 5 6 7 8 9 10 11 12 | mysql> SELECT * FROM sakila.actor LIMIT 2G *************************** 1. row *************************** actor_id: 1 first_name: dolores nulla sunt velit placeat minima adipi last_name: quo non similique enim inventore cupiditate. last_update: 2017-01-08 13:06:44 *************************** 2. row *************************** actor_id: 2 first_name: at commodi tenetur est maiores nobis accusamu last_name: quod suscipit provident est voluptatem quis t last_update: 2017-03-07 00:05:01 2 rows in set (0.00 sec) |
Adding data to tables with Foreign Keys
Now that we have the actor table filled with data, let’s try the foreign key functionality, which is one of the tool’s biggest strengths (since coming up with data that makes sense in this context is even more laborious).
If we look at the sakila structure, a good candidate for trying this is film_actor. Apart from the actor table, it also needs the film table to have data (due to FK constraints), which in turn needs the language table to have data. Easy enough!
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | shell> mysql_random_data_load sakila language 15 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password 2017/12/22 00:48:30 Starting 0s [===============================================] 100% Total rows inserted: 15 shell> mysql_random_data_load sakila film 10000 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password 2017/12/22 00:51:18 There are triggers on the film table that might affect this process: 2017/12/22 00:51:18 Trigger "ins_film", AFTER INSERT [...output trimmed...] 2017/12/22 00:51:18 Trigger "upd_film", AFTER UPDATE [...output trimmed...] 2017/12/22 00:51:18 Trigger "del_film", AFTER DELETE [...output trimmed...] 2017/12/22 00:51:18 Starting 1s [===============================================] 100% Total rows inserted: 10000 |
In this last example, we already used the foreign key functionality without even modifying the command we used for “regular” tables. You can disregard the notes about the triggers in this case.
We are now ready to fill the film_actor table. Let’s see two new variables: –bulk-size and –max-fk-samples. The former can help us with creating INSERTs with as many rows as we want, which helps expedite the process if many are needed. The latter chooses the number of samples from the film and actor tables, to use for the FK constraints. Note that these two variables already have defaults, so we can choose not to include them and the tool will still work.
1 2 3 4 | shell> mysql_random_data_load sakila film_actor 20000 --bulk-size=5000 --max-fk-samples=8000 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password 2017/12/22 01:05:28 Starting 1s [===============================================] 100% Total rows inserted: 20000 |
A quick test shows that things went well:
1 2 3 4 5 6 7 8 9 10 11 12 13 | mysql> SELECT a.last_name, f.title, f.special_features, f.rental_rate FROM film_actor fa JOIN actor a USING (actor_id) JOIN film f USING (film_id) LIMIT 1G *************************** 1. row *************************** last_name: enim sequi recusandae dolores voluptatem! title: quia provident nemo labore architecto totam. special_features: Behind the Scenes rental_rate: 3.15 1 row in set (0.00 sec) |
Specifying queries per second
Starting from version 0.1.6, there is a new experimental feature to insert rows at a specific rate: –qps <queries per second>. This option could be used when you need not only data but also at a specific rate (for example, to test a monitoring system like PMM). Since the total number of rows to be inserted is still in effect, this option makes more sense when –bulk-size=1 is used.
Summary
As we have seen, it is really easy to insert random data into tables so you can later use them in your tests. There is still more work to do on the tool, but we have found that it is already of great help. We would love to hear back from you on it either here or, even better, via the issues GitHub page directly. If there are any use-cases or functionality you can think of, let us know!
I got the error message in below when I tried.
2018/01/12 15:42:19 Cannot set time zone to UTC: Error 1045: Access denied for user ”@’127.0.0.1′ (using password: NO)
Hi Hasan,
What exact command are you using? From the message, I believe you are not setting the password correctly (or at all?). You can check credentials by using the mysql client, and then using the same in the mysql_random_data_load tool:
shell> mysql –host= –user= –password=
A better platform for these questions is our forums: https://www.percona.com/forums/ or directly over the tool’s issues page if you think this could be a bug: https://github.com/Percona-Lab/mysql_random_data_load/issues. Let me know if you create a new thread, and we can continue over there.
Hi Agustin,
I ask the question in the forum, later. I think it is not a bug. Only just something going wrong, maybe db configuration.
Regards,
Hasan
A true feature would be to eliminate the need for this program totally. In other words, let the program output the most compact SQL to populate any given table with X rows, without needing itself.. Perfect for standalone testcases….. 😉
Thanks for the idea! I have filed a feature request for this, to add –print 🙂
https://github.com/Percona-Lab/mysql_random_data_load/issues/8
Hi all expert, I am also hit the same error message in below when I tried.
OS: Oracle Linux7.6
Mysql:mysql Ver 8.0.15-6 for Linux on x86_64 (Percona Server (GPL), Release 6, Revision 63abd08)
[root@mydb1 bin]# ./mysql_random_data_loader sakila actor 65535 –host=192.168.56.89 –port=3306 –user=abo –password=
INFO[0000] Cannot set time zone to UTC: this authentication plugin is not supported
Hello, I am running the command but it just hangs …. pls see below for the exact command am executing. Also please note that I could make a successful connection to the exact same mysql database on same host
[ec2-user@ip-10-0-0-104 bin]$ mysql –host= –port=3306 –user=root –password=root –database=sakila -e “show tables;”
+——————+
| Tables_in_sakila |
+——————+
| actor |
| category |
| country |
| language |
| t1 |
+——————+
[ec2-user@ip-10-0-0-104 bin]$ ./mysql_random_data_loader sakila actor 100 –host= –port=3306 –user=root –password=root
INFO[2019-11-07T16:28:21-05:00] Starting
— [——————————————————————–] 0%
**********************
Greatly appreciate your response and help on how to troubleshoot this hanging situation. Can we put the program in debug mode or any other way to get the log of what the error is?
Please let me know, if you would like me to provide additional information.
Could you specify the MySQL flavor and version and the program version?
I ran it and I couldn’t reproduce the issue:
./mysql_random_data_load –port 12345 sakila actor 100
INFO[2019-11-21T16:04:06-03:00] Starting
0s [====================================================================] 100%
INFO[2019-11-21T16:04:07-03:00] 100 rows inserted