Exploring How MySQL 5.7 Multi-Source Replication Works

Recently Oracle announced several new features for the latest available development version of MySQL that is 5.7.2 at the time of writing this article. Most of them are performance and replication related that show us how incredible the new release will be.

In this post I’m going to try to explain in some easy steps how the new multi-source replication works and how we can configure it for our own tests. It is important to mention that this is a development release, so it is not production ready. Therefore this post is intend to people that want to test the new feature and see how it works with their application, always in a staging environment.

What is multi-source replication?

First, we need to have clear that multi-master and multi-source replication are not the same. Multi-Master replication is the usual circular replication where you can write on any server and data gets replicated to all others.

Multi-source is different. MySQL replication had a limitation, fixed with this new release, that said that one slave could have only one master. That is a limiting factor when we are designing our replication environment. There were some “hacks” to make it work, but now there is an official way. So, in a nutshell, Multi-Source means that a slave can have more than one master. Now, replication environments like these one are possible:

This will help us create some replication hierarchies that were impossible in the past. For example, you can have a slave in your office replicating the data from all servers that you have in the offices spread around the world.

How does it work?

Now we have the concept of communication channels. Each communication channel is a connection from slave to master to get the binary log events. That means we will have one IO_THREAD for each communication channel. We will need to run different “CHANGE MASTER” commands, one for each master, with the “FOR CHANNEL” argument that we will use to give a name to that channel.

CHANGE MASTER MASTER_HOST='something', MASTER_USER=... FOR CHANNEL="name_of_channel";

1	CHANGE MASTER MASTER_HOST='something', MASTER_USER=... FOR CHANNEL="name_of_channel";

Pretty easy. There is one single pre-requisite. The slave should have been configured first with the crash-safe feature of MySQL 5.6. That means that info usually included in master.info or relay-log.info should be on a table. Let’s start with the configuration.

Show me an example!

First you need to download the lab version of mysql from this link.

We have a sandbox environment with 2 masters and 1 slave servers. I won’t go over the details of how to configure the server_id, binary logs or replication users. I assume they are well configured. If you need a howto, you can follow this one.

First, we have to enable the crash safe feature on the slave:

master_info_repository=TABLE;
relay_log_info_repository=TABLE;

1 2	master_info_repository=TABLE; relay_log_info_repository=TABLE;

After a restart of the slave we can start creating the channels with the names “master1” and “master2”:

slave > change master to master_host="127.0.0.1", master_port=12047, master_user="msandbox",master_password="msandbox" for channel="master1";
slave > change master to master_host="127.0.0.1", master_port=12048, master_user="msandbox",master_password="msandbox" for channel="master2";

1 2	slave > change master to master_host="127.0.0.1", master_port=12047, master_user="msandbox",master_password="msandbox" for channel="master1"; slave > change master to master_host="127.0.0.1", master_port=12048, master_user="msandbox",master_password="msandbox" for channel="master2";

To start the slave processes you need to specify what channel are you referring to:

slave > start slave for channel="master1";
slave > start slave for channel="master2";

1 2	slave > start slave for channel="master1"; slave > start slave for channel="master2";

Now, we want to check the status of the slave:

slave > show slave statusG
Empty set (0.00 sec)

1 2	slave > show slave statusG Empty set (0.00 sec)

Oh, it is empty. We have to specify again which channel we want to check:

slave > SHOW SLAVE STATUS FOR CHANNEL="master1"G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: msandbox
                  Master_Port: 12047
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 232
               Relay_Log_File: squeeze-relay-bin-master1.000003
                Relay_Log_Pos: 395
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
[...]

slave > SHOW SLAVE STATUS FOR CHANNEL="master1"G

*************************** 1. row ***************************

Slave_IO_State: Waiting for master to send event

Master_Host: 127.0.0.1

Master_User: msandbox

Master_Port: 12047

Connect_Retry: 60

Master_Log_File: mysql-bin.000002

Read_Master_Log_Pos: 232

Relay_Log_File: squeeze-relay-bin-master1.000003

Relay_Log_Pos: 395

Relay_Master_Log_File: mysql-bin.000002

Slave_IO_Running: Yes

Slave_SQL_Running: Yes

[...]

and we can also check that the IO_THREAD and SQL_THREADS are running:

slave > SHOW PROCESSLIST;
+----+-------------+-----------------------------------------------------------------------------+
| Id | User        | State                                                                       |
+----+-------------+-----------------------------------------------------------------------------+
|  2 | system user | Waiting for master to send event                                            |
|  3 | system user | Slave has read all relay log; waiting for the slave I/O thread to update it |
|  4 | system user | Waiting for master to send event                                            |
|  5 | system user | Slave has read all relay log; waiting for the slave I/O thread to update it |
+----+-------------+-----------------------------------------------------------------------------+

slave > SHOW PROCESSLIST;

+----+-------------+-----------------------------------------------------------------------------+

| Id | User | State |

+----+-------------+-----------------------------------------------------------------------------+

| 2 | system user | Waiting for master to send event |

| 3 | system user | Slave has read all relay log; waiting for the slave I/O thread to update it |

| 4 | system user | Waiting for master to send event |

| 5 | system user | Slave has read all relay log; waiting for the slave I/O thread to update it |

+----+-------------+-----------------------------------------------------------------------------+

Let’s test it:

master1 > create database master1;
master2 > create database master2;
slave > show databases like 'master%';
+--------------------+
| Database (master%) |
+--------------------+
| master1            |
| master2            |
+--------------------+

master1 > create database master1;

master2 > create database master2;

slave > show databases like 'master%';

+--------------------+

| Database (master%) |

+--------------------+

| master1 |

| master2 |

+--------------------+

It works, that easy!

Conclusion

The new multi-source feature allow us to create new replication environments that were impossible in the past without some complex “hacks”. Of course, your application should be designed and developed with this new architecture in mind. Like with multi-master, multi-source needs special care to not end up with your data messed up.

MySQL replication is getting better on every release giving us more configuration, performance and design possibilities. And all those new features can be combined. Your replication environment can be even better if you mix some of the new (and old) features added recently to the replication. For example, you can configure GTID or enable multi-threaded slave per schema.

54 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Marc Castrovinci

10 years ago

Great feature! It looks helpful for when you want to funnel sharded data into a single location so you can run OLAP type operations over all the data.

Nick

Reply to Marc Castrovinci

8 years ago

Marc? As in SOLN, Marc Castrovinci?

Jonatas

10 years ago

Very nice feature. Keeping eye on 5.7…

shiv

10 years ago

Hi Miguel,
You can use the new replication performance schema tables to look at all the channels or choose which channel you want…

Miguel Angel Nieto

Author

10 years ago

shiv, yes, it is possible to get that info from P_S tables:

replication_connection_configuration
replication_execute_configuration

but… in my personal opinion that should be in I_S. That info is not performance related, it just show us configuration information so it should be available via SHOW (for example SHOW SLAVE STATUS CHANNEL ALL) or via I_S tables. Maybe I’m wrong, but I’ve opened a feature request anyway:

http://bugs.mysql.com/bug.php?id=70434

Tariq

10 years ago

I trying to imagine the practical implementation of this new ‘multi source replication’ and how does it really help having a slave getting replicated from multiple sources when the data_set_from_every_master is the same ?

Consider the M-M replication master1 and master2 having the dataset X and I could easily run a slave1 either from Master1 or Master2 and possibly have many other slaves from slave1 or master[1-2] and so on, why would one need to replicate from multiple masters if data is all the same unless for some arcane reason i want to replicate a particular db from master 1 and another db2 from another master which is not necessarily in the M-M replication chain .

Miguel Angel Nieto

Author

10 years ago

Tariq, the data from masters don’t need to be the same. The masters on the second graph they are not in M-M replication. There are lot of examples where multi-source can be applied:

– maybe you have hundred of apache servers, logging accesses to a mysql. in that case, having all those mysql replicating to a single slave could help you to run queries there.

– or you have radar-controlled speed cameras all around the country and you need to have one single point to query the data.

– or just the example made by Marc on the first comment.

So, masters don’t need to have the same data set. Now practical implementation depends on what you need not on MySQL limitations.

James Attard

10 years ago

I will definitely test and blog this setup. Will make downtime-less migrations easier, without any doubt.

Hugo

10 years ago

Nice Nice.

This will drastically reduce the complexity on our OLAP servers.

Does this works nice if the salve uses BlackHole Engine for all the replicated databases?

Miguel Angel Nieto

Author

10 years ago

Hugo, this still works like regular replication, just with more IO and SQL threads. No limitation on what storage engine you use.

Terence Monteiro

10 years ago

Hi Miguel,

For the multi-source replication, I would like to know if it is sufficient to upgrade the slave to 5.7.2, or do the masters also need to run the same mysql version?

Miguel Angel Nieto

Author

10 years ago

Terence, didn’t have time to check it. But the theory says “it should”. It just opens more “io threads”, so for the master server is like having two different slaves. But as I said, didn’t check it. If I get some spare time I will try it.

Terence Monteiro

10 years ago

Hi Miguel,

Thanks for your reply. I just got multi-source replication working with MySQL 5.7.2 on my laptop with 2 vms running 5..

Terence Monteiro

10 years ago

I got multi-source replication running with 2 VMs with different MySQL versions (5.1.49 and 5.1.63). Thanks.

Marko

10 years ago

Hi all,
I’ve just slave it in m-m with two percona’s 5.6 and it works as expected. Everything is replicated everywhere. I noticed. after 5.7.2 is restarted the replications on it can’t be started again and have to be setup again.

rucypli

10 years ago

is percona plans to add feathure Parallel Slave like MariaDB？ thanks

Mike Siekkinen

10 years ago

During my (still labs-version 5.7.2) testing of multi source replication, it seems there is a bug where replication does not automatically pick back up after a server restart.

Information remains in the slave_master_info table, how ever attempt to start slave for channel or show slave status for channel returns an error as if it had never been setup.

Rerunning a full change master to command from the log positions left in slave_master_info table seemed to allow replication to resume.

Aside from that one gotcha I ran into was trying to replicate a table with the now-reserved word “channel” for a column name. The application wasn’t encasing the column names in backticks to replication would break on a syntax error for those inserts.

Mike Siekkinen

10 years ago

Further testing off a real replication stream broke with an error about not being able to parse log events. This was trying to read binlogs from a percona 5.5.34 build. Between mixing percona builds with mysql labs builds and a big “TESTING ONLY” warning on the download I’ll wait for it to become more stable before pursuing further.

Miguel Angel Nieto

Author

10 years ago

Mike, thanks for your comments. That is what Oracle wants with Lab version, to have people breaking it like you are doing. It would be good if you could report those bugs on http://bugs.mysql.com/ That will help to make the final release even better 🙂

Wayne Gaylard

10 years ago

I am trying to do some testing but I keep getting syntax errors when trying to do CHANGE MASTER TO …. FOR CHANNEL = ‘channel_name’. Any ideas. Using 5.7.3, tried in Workbench and on command line – same result.

s capkan

10 years ago

I’m wondering how mysql does 5.7 handle auto_increment feature. What if i have two master without any communication and both of them has auto increment row, they can easily get same ids. Should i also use auto_increment_increment and auto_increment_offset parameters like master-master structure?.

Mike Siekkinen

10 years ago

@wayne, you have to download the laps specific version from http://labs.mysql.com/ Select multi source replication from the drop down.

Mike Siekkinen

10 years ago

*labs; I wish I could edit comments here

Mike Siekkinen

10 years ago

@s capkan, I believe the point of multi source replication is to get the databases from server A and server B into the same instance on the slave C instance. If both A and B hold the same database/table schemas I imagine you’re going to have a bad time.

The auto_increment_offest is meant for multi master replication so they don’t step on each other since mysql replication is not synchronous.

Alex

10 years ago

We have many MySQL instances that are set up with the same set of databases for each instance. I was planning to use multi-source to create a single backup point and reporting point. The problem is that all the source instances have the same databases. Are there any considerations for a prefix on the database name, or something that creates a unique database name on the replica in such a scenario?

Moody

10 years ago

I have the same concern as the post above me, Alex. If Master1 and Master2 both have a schema called “test”, how does this conflict with multi-source replication? I would imagine there would be 2 problems, the first one being can’t have 2 schemas with the same name. Second, how does the slave know which schema to apply the slave event?

alex

9 years ago

have you checked if mysql 5.7.4 support this feature? when i issue this statement “change master to master_host=”127.0.0.1″, master_port=12047, master_user=”msandbox”,master_password=”msandbox” for channel=”master1″; , mysql reports an syntax error near ‘ for channel’. i havent found any document for ‘for channel’ clause. could you please give me some help ? thanks .

Alex

9 years ago

@alex: It is not in the release builds yet, you have to download the multi-source replication version from http://labs.mysql.com/

Lilian

9 years ago

Hello,

Could you help me with the installation of MySQL 5.7.2, because I am unable to install on my Slave

Tom Diederich

9 years ago

Hi Lilian,

I suggest asking this on our MySQL discussion forums: https://www.percona.com/forums/

Kiran

9 years ago

Hi All,

Does this gives HA facilit. For e.g., If we assume M1,M2,M3 are three multisource masters and S1 is the Slave server.if one among the Master server fails. Let us say M2 failed. Can we failover to Slave S1 and is it possible to continuously write to new Master?

Any real time example or Use case for this to get a clear idea ?

Best Regards,
Kiran.M.K.

Shruti

Reply to Kiran

7 years ago

Did you get an answer for this?

Kiran

9 years ago

Is the GTID mandatory in Multi source Replication ?

Best Regards,
Kiran.M.K.

Andrew Griffin

9 years ago

Is this a reasonable solution for a fallback/backup server for disparate master instances? We have two different instances running completely unrelated data in unique DBs. I’m looking to put together a single fallback source for maintenance/failures and I’m still not perfectly sure this is a good option, though it’s really attractive. should I just go with 1:1 instances instead?

Vasu

9 years ago

It has been more than a year since the original announcement, and multi-source replication is still available only in the lab version not intended for production use. Is there any way to find out if and when this feature will be available in a production release of MySQL?

dirham

9 years ago

can you show me step by step create 2 master for one slave ?

server A = Master
server B = Master
Server C = slave

cen you help me ?

Rabindra Thapa

Reply to dirham

7 years ago

contact me i will teach

Rabindra Thapa

Reply to Rabindra Thapa

7 years ago

[email protected]

breddy

9 years ago

How does this multi-source replication configuration works, if all masters (masters only) are in circular replication setup and slave try to pull data from all masters?

Adi webster

8 years ago

Hi,

In mysql 5.7 multiple source replication, do masters also need to run on same mysql version?
For clarification. I have 3 master having mysql version 5.6 and slave with 5.7 will multiple master and single slave replication will work?

Nick

Reply to Adi webster

8 years ago

did you find out about this? I have the same setup.

Gabriel Wang

8 years ago

Hi,
I’m building a multi-source replication on mysql 5.7.7-rc. The scenario in my side is there are multiple mysql instances, all of them has the same table structure, but with different data, all I need to do is to merge all the data into 1 replica mysql instance.
Therefore I planned to add 1 column in replica database tables named master-name, next step is to write a “before update/insert” trigger to update the replica table when every time insert/update come. BUT, I found it’s not easy to identify the source of the insert/update.
Some findings from my side:
1. I can find replication channel information from table “performance_schema.replication_connection_configuration”.
2. Also there is process information on information_schema.PROCESSLIST.

Which mysql master instance does the insert/update comes from?
Thanks in advance!

Jürgen Friedrich

8 years ago

@Dirham follow https://www.digitalocean.com/community/tutorials/how-to-set-up-master-slave-replication-in-mysql and http://dev.mysql.com/doc/refman/5.7/en/replication-multi-source-tutorials.html helped me a lot

actually, just add “FOR CHANNEL ‘master-XX’ for each Master on slave and it will work!

Marieke Janssen

8 years ago

it this feature still only in the labs version? I’ve installed 5.7.7-rc and get syntax error

Nick

8 years ago

Before setting up a test environment, figured I’d ask before wasting the time. Anyone know if having a mysql master running 5.6 and having slaves at version 5.7 would work? Or would we need to upgrade the masters to 5.7 also?

Nick

Reply to Nick

8 years ago

Guess I can answer my own question. Yes it works using 5.6 master and 5.7 slaves making use of the CHANNELs

Dhanaekaran

8 years ago

Hi Guys,

Have tried MySQL5.7.10, syntax changed little bit.

It’s works for me.

>change master to master_host=”172.16.60.8″,master_port=3306, master_log_file = ‘bin.000011’, master_log_pos= 403, master_user=”replicator”, master_password=”pass” for channel “master1”;

Agastya

8 years ago

nice article.. I am using Xampp and I successfully replicated only for one master and one slave…I want to replicate multiple master and one slave..can you explain how to do this in Xampp?.Thanks

Jason Becker

8 years ago

I don’t know if this changed from the time this article was written or what but with Percona-Server-server-57-5.7.11-4.1.el6.x86_64 it is not:

… for channel=’master1′;

It is:

… for channel ‘master1’;

See the MySQL docs: https://dev.mysql.com/doc/refman/5.7/en/change-master-to.html

But even the MySQL docs omit the quotes around the channel name.

Hope this helps someone.

Ruichao Lin

7 years ago

how to skip one transaction in one channel ?

Dominik Klein

Reply to Ruichao Lin

7 years ago

# stop all slaves
stop slave;

# set skip counter
set global sql_slave_skip_counter=1;

# start slave that shall skip one entry
start slave for channel “$channel”;

# start all other slaves
start slave;

Mahesh Patil

7 years ago

Hello Miguel,

I am not sure whether multi source replication in Percona server v5.7.13 supports replicate-rewrite-db option for multiple channels. Similarly as explained here – https://mariadb.com/blog/multisource-replication-how-resolve-schema-name-conflicts

Can you please share your thoughts on this. Any blog which can give more insights on this would be helpful.

Nick

6 years ago

Fix or delete the link at the end of the article. intra-database it links to malware.

Dave Avery

Reply to Nick

6 years ago

Nick, thanks for your comment. We have deleted the link.

MySQL 5.7
End of Life

Compare Percona to Leading Database Solutions

Software
Downloads

Product
Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

MySQL 5.7 multi-source replication

What is multi-source replication?

How does it work?

Show me an example!

Conclusion

Related

Related Blog Articles

RECOMMENDED ARTICLES

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

Seamless Table Modifications: Leveraging pt-online-schema-change for Online Alterations

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7 End of Life

Compare Percona to Leading Database Solutions

Software Downloads

Product Documentation

Resource Hub

Financial Services

Driving Database Success

Percona Blog

Percona Community Hub

Percona Events Hub

About Percona

Percona in the News

Our Customers

Our Partners

Careers

Contact Us

MySQL 5.7 multi-source replication

What is multi-source replication?

How does it work?

Show me an example!

Conclusion

Related

Share This Post!

Want to get weekly updates listing the latest blog posts?

Related Blog Articles

RECOMMENDED ARTICLES

Can We Set up a Replicate Filter Within the Percona XtraDB Cluster?

Choosing the Right Database: Comparing MariaDB vs. MySQL, PostgreSQL, and MongoDB

Seamless Table Modifications: Leveraging pt-online-schema-change for Online Alterations

MOST POPULAR ARTICLES

Auditing login attempts in MySQL

Deploy Django on Kubernetes With Percona Operator for PostgreSQL

MySQL “Got an error reading communication packet”

MySQL 5.7
End of Life

Software
Downloads

Product
Documentation