A few weeks back, we faced an issue in a replication environment for a Managed Services client:
LAST_ERROR_MESSAGE: Worker 2 failed executing transaction ‘UUID:GTID’ at master binlog.0012345, end_log_pos 98765; Error ‘Operation CREATE USER failed for ‘test_user’@’10.10.10.10” on query. Default database: ‘mysql’. Query: ‘CREATE USER ‘test_user’@’10.10.10.10’ IDENTIFIED WITH ‘mysql_native_password’ AS ‘************”
After some initial investigation, we noticed that the user in the replica didn’t exist! Was MySQL going crazy? But then the customer mentioned they had the following error in the primary before being able to execute the query successfully:
1 2 | root@localhost [mysql]> CREATE USER 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '************' WITH MAX_USER_CONNECTIONS 10; ERROR 1396 (HY000): Operation CREATE USER failed for 'test_user'@'10.10.10.10 |
But we knew that the first command couldn’t be replicated (since failed commands are never logged to the binlog), so the question was, what happened, and why was MySQL having trouble creating a nonexistent user?
If you do quick research, you will find a few posts offering some options:
- Execute FLUSH PRIVILEGES.
- DROP USER and then FLUSH PRIVILEGES.
- Many mentions that this is a reported bug.
And fair enough, if you try the options mentioned above, you’ll fix the issue eventually. Still, I wanted to take a deeper look at why this happened and what should be the correct approach to fix it (and understand WHY something will fix it); so I decided to do some tests and create a post for whoever faces an issue like this in the future, in the hope that you’ll too understand it.
Let’s start with a reminder: You shouldn’t manipulate the mysql grant tables manually, and if you do it, you should be sure that you know what you’re doing. So to start, here’s a summary of how MySQL’s privileges work.
A summary of MySQL privileges
At startup, MySQL reads the grant tables and loads them into memory, so whenever it needs to check if a user is allowed to read this or to connect there, MySQL can look for the privileges faster (reading from memory).
MySQL keeps track of the privileges using the grant tables, so when we issue a CREATE USER, internally, it translates to
- Insert into mysql.user
- Insert into mysql.db
If we GRANT or REVOKE, it will translate into a
- Insert into mysql.table_priv
- Update in mysql.db
- Delete in mysql.table_priv.
And so on. Please note this isn’t an exact step-by-step on what happens; it’s just an example of a few internal things MySQL needs to execute.
Finally, after each Account Management Statement, MySQL reads the new privileges into memory, and the changes are applied.
With this, MySQL makes it easy for us to manipulate privileges; should you need to delete a user, instead of manually deleting every row in each grant table, you can execute a DROP USER, and that’s it.
Even though it’s simpler that way, sometimes users decide to go the “hard way” and manipulate the privileges manually: instead of DROP USER, execute a DELETE from grant tables. But, as mentioned before, MySQL won’t know about these changes because the grant tables were read already, and the privileges are in memory.
Let’s go through the issue mentioned above
First step: The user was created in the primary server:
1 2 | root@localhost [(none)]> CREATE USER 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected (0.00 sec) |
Second step: After some tests, they wanted to drop the user and create it again, but they went with manually deleting the user:
1 2 3 4 5 6 7 8 | root@localhost [(none)]> use mysql Database changed root@localhost [mysql]> delete from user where user='test_user'; Query OK, 1 row affected (0.00 sec) root@localhost [mysql]> delete from db where user='test_user'; Query OK, 1 row affected (0.00 sec) |
So far, as commented above, MySQL isn’t aware that the user test_user was deleted (MySQL hasn’t re-read the grant tables into memory because they did not execute a FLUSH PRIVILEGES); that’s why when they tried to create the user again, it failed.
1 2 | root@localhost [mysql]> CREATE USER 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>' WITH MAX_USER_CONNECTIONS 10; ERROR 1396 (HY000): Operation CREATE USER failed for 'test_user'@'10.10.10.10' |
And since they couldn’t find the user, they did what we would all do: Retry again and hope that the command would magically work this time.
1 2 | root@localhost [mysql]> CREATE USER 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected (0.01 sec) |
And it worked, the user got created, and everything was ok in the world again. Why did it work? Because with the first CREATE USER (even though the command failed), MySQL did a reload of the grant privileges.
Here’s what happened with the first CREATE:
- MySQL checks its in-memory table and finds that the user already exists.
- Rejects the new user because it can’t duplicate the user.
- MySQL reloads the in-memory tables (This happens regardless of whether it was successful).
So, even though the command failed, MySQL reloads the in-memory tables and the user test_user doesn’t exist (since the user was deleted manually before); that’s why the second CREATE was successful.
The second CREATE USER executes successfully, it goes to the binlog and to the replica.
Now, from the replica’s point of view so far it has replicated the DELETE statements and ONE CREATE USER, and back to where it all began, the error in the replica:
LAST_ERROR_MESSAGE: Worker 2 failed executing transaction ‘UUID:GTID’ at master binlog.0012345, end_log_pos 98765; Error ‘Operation CREATE USER failed for ‘test_user’@’10.10.10.10” on query. Default database: ‘mysql’. Query: ‘CREATE USER ‘test_user’@’10.10.10.10’ IDENTIFIED WITH ‘mysql_native_password’ AS ‘************”
With the things explained above, I think it’s easier to understand what happened here: This CREATE statement is the “first” since the first one in the primary failed, so that one didn’t reach the replica, so to “fix” this we only started the replication, and that did the trick, this time the CREATE worked because it already reloaded the in-memory tables.
Before finishing, I wanted to test a few more things to understand why sometimes a simple FLUSH PRIVILEGES works and why sometimes it’s necessary to DROP USER entirely; short answer: It depends on how “well” you manually manipulated the grant tables.
When FLUSH PRIVILEGES does the trick
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | mysql> CREATE USER test123@10.10.10.10; Query OK, 0 rows affected (0.02 sec) mysql> SELECT user,host FROM mysql.user WHERE user='test123'; +------------------+-------------+ | user | host | +------------------+-------------+ | test123 | 10.10.10.10 | +------------------+-------------+ 1 row in set (0.00 sec) mysql> DELETE FROM mysql.user WHERE user='test123'; Query OK, 1 row affected (0.01 sec) mysql> FLUSH PRIVILEGES; Query OK, 0 rows affected (0.01 sec) mysql> CREATE USER test123@10.10.10.10; Query OK, 0 rows affected (0.01 sec) |
Since the create user only inserted in mysql.user, it was the only table where the user exists, so the flush privileges command was successful.
When FLUSH PRIVILEGES isn’t enough
There are many grant tables, so when manually deleting a user, it’s easy to miss one of them and lead to errors that FLUSH PRIVILEGES won’t solve.
1 2 3 4 5 | mysql> CREATE USER test123@10.10.10.10; Query OK, 0 rows affected (0.01 sec) mysql> GRANT ALL PRIVILEGES ON test_db.* TO test123@10.10.10.10; Query OK, 0 rows affected (0.00 sec) |
The GRANT command inserts a record in mysql.db table as well
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | mysql> SELECT user,host FROM mysql.user WHERE user='test123'; +---------+-------------+ | user | host | +---------+-------------+ | test123 | 10.10.10.10 | +---------+-------------+ 1 row in set (0.00 sec) mysql> SELECT user,host,db FROM mysql.db WHERE user='test123'; +---------+-------------+---------+ | user | host | db | +---------+-------------+---------+ | test123 | 10.10.10.10 | test_db | +---------+-------------+---------+ 1 row in set (0.00 sec) |
Then if I manually delete, but only from mysql.user table, the record in mysql.db will be there until it’s cleaned.
1 2 3 4 5 6 7 8 9 10 11 12 13 | mysql> DELETE FROM mysql.user WHERE user='test123'; Query OK, 1 row affected (0.01 sec) mysql> SELECT user,host FROM mysql.user WHERE user='test123'; Empty set (0.00 sec) mysql> SELECT user,host,db FROM mysql.db WHERE user='test123'; +---------+-------------+---------+ | user | host | db | +---------+-------------+---------+ | test123 | 10.10.10.10 | test_db | +---------+-------------+---------+ 1 row in set (0.00 sec) |
So even if I try with the FLUSH command, I won’t be able to create the user (due to the user existing in mysql.db)
1 2 3 4 5 6 7 8 9 10 11 | mysql> FLUSH PRIVILEGES; Query OK, 0 rows affected (0.01 sec) mysql> CREATE USER test123@10.10.10.10; ERROR 1396 (HY000): Operation CREATE USER failed for 'test123'@'10.10.10.10' mysql> CREATE USER test123@10.10.10.10; ERROR 1396 (HY000): Operation CREATE USER failed for 'test123'@'10.10.10.10' mysql> CREATE USER test123@10.10.10.10; ERROR 1396 (HY000): Operation CREATE USER failed for 'test123'@'10.10.10.10' |
The proper way to do this is by executing the DROP command and letting MySQL clean whatever tables are required (You can see below that the mysql.db table was cleaned as well).
1 2 3 4 5 | mysql> DROP USER test123@10.10.10.10; Query OK, 0 rows affected (0.01 sec) mysql> SELECT user,host,db FROM mysql.db WHERE user='test123'; Empty set (0.00 sec) |
And now the CREATE works:
1 2 | mysql> CREATE USER test123@10.10.10.10; Query OK, 0 rows affected (0.00 sec) |
What about “IF NOT EXISTS” clause?
Using the same scenario with replication; first, create the user in the primary.
1 2 3 4 5 6 7 8 9 10 | mysql> CREATE USER 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected (0.00 sec) mysql> SELECT user,host FROM mysql.user WHERE user='test_user'; +-----------+-------------+ | user | host | +-----------+-------------+ | test_user | 10.10.10.10 | +-----------+-------------+ 1 row in set (0.00 sec) |
After this, manually delete the user from the mysql.user table.
1 2 3 4 5 | mysql> DELETE FROM mysql.user WHERE user='test_user'; Query OK, 1 row affected (0.00 sec) mysql> SELECT user,host FROM mysql.user WHERE user='test_user'; Empty set (0.00 sec) |
We know that a CREATE USER command would fail and reload the in-memory tables; let’s see what happens if we add the IF NOT EXISTS clause:
1 2 | mysql> CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected, 1 warning (0.00 sec) |
No errors, but one warning:
1 2 3 4 5 6 7 | mysql> SHOW WARNINGS; +-------+------+------------------------------------------------------------+ | Level | Code | Message | +-------+------+------------------------------------------------------------+ | Note | 3163 | Authorization ID 'test_user'@'10.10.10.10' already exists. | +-------+------+------------------------------------------------------------+ 1 row in set (0.00 sec) |
And the user doesn’t exist, as expected (It was manually deleted, but without FLUSH PRIVILEGES so MySQL is not aware).
1 2 | mysql> SELECT user,host FROM mysql.user WHERE user='test_user'; Empty set (0.00 sec) |
But, this time the command was logged in the binlog:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | mysql> SHOW BINLOG EVENTS IN 'mysql-bin.000001' FROM 2395 LIMIT 2\G *************************** 1. row *************************** Log_name: mysql-bin.000001 Pos: 2395 Event_type: Gtid Server_id: 1 End_log_pos: 2474 Info: SET @@SESSION.GTID_NEXT= '83988545-3051-11ed-b2af-0a3d309b4fdf:8' *************************** 2. row *************************** Log_name: mysql-bin.000001 Pos: 2474 Event_type: Query Server_id: 1 End_log_pos: 2712 Info: use `mysql`; CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' IDENTIFIED WITH 'mysql_native_password' AS '<redacted>' /* xid=68 */ 2 rows in set (0.00 sec) |
And was replicated without issues as well, below are the relay log events:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | mysql> SHOW RELAYLOG EVENTS IN 'relay-log-server.000002' FROM 2611\G *************************** 1. row *************************** Log_name: relay-log-server.000002 Pos: 2611 Event_type: Gtid Server_id: 1 End_log_pos: 2474 Info: SET @@SESSION.GTID_NEXT= '83988545-3051-11ed-b2af-0a3d309b4fdf:8' *************************** 2. row *************************** Log_name: relay-log-server.000002 Pos: 2690 Event_type: Query Server_id: 1 End_log_pos: 2712 Info: use `mysql`; CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' IDENTIFIED WITH 'mysql_native_password' AS '<redacted>' /* xid=68 */ |
Last thing to keep in mind, this time the Account Management Statement didn’t reload the grant tables after execution (We can know this because the second execution didn’t create the user, as happens when the CREATE USER is executed without the clause, see the warnings below):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 | mysql> CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> SELECT user,host FROM mysql.user WHERE user='test_user'; Empty set (0.00 sec) mysql> FLUSH PRIVILEGES; Query OK, 0 rows affected (0.00 sec) mysql> CREATE USER IF NOT EXISTS 'test_user'@'10.10.10.10' identified WITH 'mysql_native_password' BY '<redacted>'; Query OK, 0 rows affected (0.01 sec) mysql> SELECT user,host FROM mysql.user WHERE user='test_user'; +-----------+-------------+ | user | host | +-----------+-------------+ | test_user | 10.10.10.10 | +-----------+-------------+ 1 row in set (0.00 sec) |
So, the IF NOT EXISTS clause can save you from the error (and no issues in a replication environment as well), but as always, be sure to check the warnings to understand what’s happening, otherwise, you’ll still don’t have the user required.
Final thoughts
You should definitely use the Account Management Statements when you want to modify privileges/users in MySQL, so you don’t have to worry about all the details.
I wanted to share this particular scenario with you in the hopes that this will make clearer how MySQL manages internal privileges and that the in-memory tables are a critical thing to have in mind should you modify anything manually.
And finally, remember that the docs are your friend:
If you modify the grant tables directly using statements such as INSERT, UPDATE, or DELETE (which is not recommended), the changes have no effect on privilege checking until you either tell the server to reload the tables or restart it. Thus, if you change the grant tables directly but forget to reload them, the changes have no effect until you restart the server. This may leave you wondering why your changes seem to make no difference!
Great post!
Small question,
Return DELETE statement on mysql.user and also DROP USER:
Why?
Mauricio, very useful article!
Added to my favorites
Great article, Mauricio!
i got same error in mysql 8.0 after I run ALTER root@localhost, i saw error: Error ‘Operation ALTER USER failed for ‘root’@’localhost” on query. Default database: ‘mysql’. Query: ‘ALTER USER ‘root’@’localhost’ IDENTIFIED WITH ‘mysql_native_password’ AS
is anyway can fix this issue?