Keep Sensitive Data Secure in a Replication Setup

This blog post describes how to keep sensitive data secure on slave servers in a MySQL async replication setup.

Almost every web application has a sensitive data: passwords, SNN, credit cards, emails, etc. Splitting the database to secure and “public” parts allows for restricting user and application parts access to sensitive data.

Field encryption

This is based on MySQL encryption functions or on client-side encryption when the authorized user knows a secret, but encrypted data is distributed to all slaves.

If possible, use hashes with a big enough salt, and do not store real sensitive data in the database. A good example is passwords. An end-user sends the login and password, application/SQL code calculates the hash with a salt value unique for each end-user and compares the hash with the value stored in the database. Even if the attacker gets the hashes, it’s still hard or even impossible to extract real passwords for all users. Make sure that you are using a good random number generator for the salt, application-side secret, and a good hash function (not MD5).
Encryption is not suitable if you are going to provide public access to your database (via slave dumps in sql/csv/xml/json format).
Encryption is a complex topic. Check here for a good blog post explaining hashing usage, and try to find a security consultant if you are inventing some “new” method of storing and encrypting data.

Field encryption example

I’m using a single server setup, because the most important part of data separation should be done on the application side. The secure part of the application has a secret passphrase. For example, you can place the code working with authentication, full profile and payments on a separate server and use a dedicated MySQL account.

create database encrypted;
use encrypted;
create table t(c1 int, c2 varchar(255), rnd_pad varbinary(16), primary key(c1));
SET block_encryption_mode = 'aes-256-cbc';
SET @key_str = SHA2('My secret passphrase',512);
SET @init_vector = RANDOM_BYTES(16);
insert into t (c1,c2, rnd_pad) values (1, AES_ENCRYPT('Secret', @key_str, @init_vector), @init_vector);
-- decrypt data
select c1, AES_DECRYPT(c2,@key_str, rnd_pad) from t;

create database encrypted;

use encrypted;

create table t(c1 int, c2 varchar(255), rnd_pad varbinary(16), primary key(c1));

SET block_encryption_mode = 'aes-256-cbc';

SET @key_str = SHA2('My secret passphrase',512);

SET @init_vector = RANDOM_BYTES(16);

insert into t (c1,c2, rnd_pad) values (1, AES_ENCRYPT('Secret', @key_str, @init_vector), @init_vector);

-- decrypt data

select c1, AES_DECRYPT(c2,@key_str, rnd_pad) from t;

Summary

GOOD: Master and slave servers have exactly the same data and no problems with replication.
GOOD: Even if two different end-users have exactly the same password, the stored values are different due to random bytes in the init vector for AES encryption.
GOOD: Both the encryption and random number generation uses an external library (openssl).
CONF: It’s important to have binlog_format=ROW to avoid sending the secret to slave servers.
CONF: Do not allow end-users to change data without changing the init_vector, especially for small strings without random padding. Each update should cause init_vector re-generation.
BAD: Encrypted data is still sent to slave servers. If the encryption algorithm or protocol is broken, it is possible to get access to data from an insecure part of the application.
BAD: The described protocol still could be insecure.

Replication filters

There are two types of replication filters: a master-side with binlog-*db and a slave-side with replicate-*.

Both could cause replication breakage. Replication filters were created for STATEMENT-based replication and are problematic with modern binlog_format=ROW + gtid_mode=on setup. You can find several cases related to database-level slave-side filters in this blog post. If you still need slave-side filtering, use per-table replicate-wild-*-table options.

Master-side

Even if binary logging is disabled for a specific database, the statement still could be stored in the binary log if it’s a DDL statement, or if the binlog_format is STATEMENT or MIXED and default database is not used by the statement. For details, see the reference manual for the binlog-do-db option. In order to avoid replication issues, you should use ROW-based replication and run SET SESSION sql_log_bin=0; before each DDL statement is executed against the ignored database. It’s not a good idea to use binlog-do-db, because you are losing control of what should be replicated.

Why is binary log filtering useful? Changing the sql_log_bin variable is prohibited inside transactions. The sql_log_bin is DANGEROUS, please do not use it instead of binlog-ignore-db in production on the application side. If you need it for database administration, make sure that you are always typing the “session” word before sql_log_bin. This makes problematic consistent updates of multiple entities inside database.

We still should have the ability to hide just one column from the table. But if we are ignoring the database, we should provide a method of reading non-secure data on slaves / by restricted MySQL accounts. This is possible with triggers and views:

create database test;
set session sql_log_bin=0;
create table test.t(c1 int, c2 int, primary key(c1));
alter table test.t add primary key(c1);
set session sql_log_bin=1;
create database test_insecure;
create table test_insecure.t(c1 int, c2 int default NULL, primary key(c1));

use test
delimiter //
create trigger t_aft_ins
after insert
 on test.t FOR EACH ROW
BEGIN
  INSERT test_insecure.t (c1) values (NEW.c1);
END //
create trigger t_aft_upd
after update
 on test.t FOR EACH ROW
BEGIN
  UPDATE test_insecure.t SET c1 = NEW.c1 WHERE c1 = OLD.c1;
END //
create trigger t_aft_del
after delete
 on test.t FOR EACH ROW
BEGIN
  DELETE FROM test_insecure.t WHERE c1 = OLD.c1;
END //
delimiter ;
-- just on slave:
create database test;
create view test.t as select * from test_insecure.t;

-- typical usage
INSERT INTO test.t values(1,1234);
SELECT * from test.t; -- works on both master and slave, c2 field will have NULL value on slave.

create database test;

set session sql_log_bin=0;

create table test.t(c1 int, c2 int, primary key(c1));

alter table test.t add primary key(c1);

set session sql_log_bin=1;

create database test_insecure;

create table test_insecure.t(c1 int, c2 int default NULL, primary key(c1));

use test

delimiter //

create trigger t_aft_ins

after insert

on test.t FOR EACH ROW

BEGIN

INSERT test_insecure.t (c1) values (NEW.c1);

END //

create trigger t_aft_upd

after update

on test.t FOR EACH ROW

BEGIN

UPDATE test_insecure.t SET c1 = NEW.c1 WHERE c1 = OLD.c1;

END //

create trigger t_aft_del

after delete

on test.t FOR EACH ROW

BEGIN

DELETE FROM test_insecure.t WHERE c1 = OLD.c1;

END //

delimiter ;

-- just on slave:

create database test;

create view test.t as select * from test_insecure.t;

-- typical usage

INSERT INTO test.t values(1,1234);

SELECT * from test.t; -- works on both master and slave, c2 field will have NULL value on slave.

Summary

BAD: The data is not the same on the master and slaves. It potentially breaks replication. It’s not possible to use a slave’s backup to restore the master or promote the slave as a new master.
BAD: Triggers could reduce DML statement performance.
GOOD: The sensitive data is not sent to slaves at all (and not written to binary log).
GOOD: It works with GTID
GOOD: It requires no application changes (or almost no application changes).
GOOD: binlog-ignore-db allows us to not use the dangerous sql_log_bin variable after initial table creation.

3 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Alex Skripov

6 years ago

Hi Nikolay,
Thank you for your article.
In the “field encryption example” you have a line insert into t (c1,c2, rnd_pad) values (1, AES_ENCRYPT(‘Secret’, @key_str, @init_vector), @init_vector);
AES_ENCRYPT returns a binary string containing the encrypted output. But c2 is varchar(255). I think c2 should have a different data type.

Nickolay Ihalainen

Author

Reply to Alex Skripov

6 years ago

Hi Alex,

Good point, the best storage for binary data with unknown size is varbinary(N) column type.
For example in mysql 8.0, insert will cause an error even if both client and server encodings are the same (utf8mb4):
mysql> create table t1(c varbinary(255));
Query OK, 0 rows affected (0.74 sec)

mysql> insert into t1 values(AES_ENCRYPT(‘Hi, Привет, 你好’,’secret’));
Query OK, 1 row affected (0.39 sec)

mysql> select AES_DECRYPT(‘secret’) from t1;
ERROR 1582 (42000): Incorrect parameter count in the call to native function ‘AES_DECRYPT’
mysql> select AES_DECRYPT(c,’secret’) from t1;
+————————–+
| AES_DECRYPT(c,’secret’) |
+————————–+
| Hi, Привет, 你好 |
+————————–+
1 row in set (0.00 sec)

mysql> insert into t values(AES_ENCRYPT(‘Hi, Привет, 你好’,’secret’));
ERROR 1366 (HY000): Incorrect string value: ‘\xA5\xF5\x8D\xB5o\x0F…’ for column ‘c’ at row 1

In heterogeneous character set environment the problem will be also with result set, because AES_DECRYPT returns binary and you need additional conversion step:

select cast(AES_DECRYPT(c,’secret’) AS CHAR CHARACTER SET utf8mb4) as c, @@character_set_client, @@character_set_server from t1;
+—————-+————————+————————+
| c | @@character_set_client | @@character_set_server |
+—————-+————————+————————+
| Hi, Привет, ?? | cp1251 | utf8mb4 |
+—————-+————————+————————+

Best regards,
Nickolay

Roel Van de Paar

6 years ago

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Keep Sensitive Data Secure in a Replication Setup

Field encryption

Field encryption example

Summary

Replication filters

Master-side

Summary

Related

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

Valkey/Redis: Not-So-Good Practices

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

Keep Sensitive Data Secure in a Replication Setup

Field encryption

Field encryption example

Summary

Replication filters

Master-side

Summary

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

New Valkey Packages by Percona

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

Valkey/Redis: Not-So-Good Practices

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation