Comments on: Orchestrator: MySQL Replication Topology Manager https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/

By: miczard https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10969806 Fri, 12 Oct 2018 07:00:45 +0000
@shlomi and Team

I’m just wondering: can we switch from the master (not an intermediate master) to a slave and vice versa?
Thanks.

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10968452 Thu, 14 Sep 2017 08:39:00 +0000
Given that this blog post is popular, and that people ask me questions based on its contents, I wish to list some updates here:

– orchestrator supports planned failovers, via:

orchestrator -c graceful-master-takeover -alias mycluster # Gracefully discard master and promote another (direct child) instance instead, even if everything is running well

This makes the replica catch up with the master, promotes the replica, and attempts to set up the old master below the new master (reversing replication).

– orchestrator supports user-initiated non-graceful (panic) failovers. So a user can kick off a failover even if orchestrator is confused, doesn’t see the problem, is configured not to fail over a specific cluster, etc. For master failover:

orchestrator -c force-master-failover -alias mycluster

This promotes a replica without waiting for it to catch up; it just kicks off the failover process.
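Both of these CLI commands are also exposed over orchestrator’s HTTP API, which can be convenient for automation. The following is a minimal sketch, assuming an orchestrator service listening on localhost:3000; the endpoint names match recent orchestrator versions, but verify them against the API of the version you run.

```python
# Minimal sketch: trigger orchestrator failovers via its HTTP API.
# Assumption: an orchestrator service listens on localhost:3000; verify the
# endpoint names against the API of your installed orchestrator version.
import json
import urllib.request

BASE = "http://localhost:3000"

def api_url(command: str, cluster_alias: str, base: str = BASE) -> str:
    """Build the URL for a cluster-wide orchestrator API command."""
    return f"{base}/api/{command}/{cluster_alias}"

def graceful_master_takeover(cluster_alias: str) -> dict:
    """Planned failover: the API counterpart of `-c graceful-master-takeover`."""
    with urllib.request.urlopen(api_url("graceful-master-takeover", cluster_alias)) as resp:
        return json.load(resp)

def force_master_failover(cluster_alias: str) -> dict:
    """Panic failover: the API counterpart of `-c force-master-failover`."""
    with urllib.request.urlopen(api_url("force-master-failover", cluster_alias)) as resp:
        return json.load(resp)
```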

orchestrator/raft is now released, and provides HA without requiring HA of the MySQL backend. In fact, you can run orchestrator/raft on SQLite if you wish. The raft consensus protocol ensures leadership via quorum.

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10968349 Tue, 08 Aug 2017 07:20:45 +0000
@Matthew SQLite is now supported in the orchestrator/raft setup: http://code.openark.org/blog/mysql/orchestratorraft-pre-release-3-0

@Marko unfortunately not at this stage. Bringing up VIPs is too tightly tied to your specific infrastructure. For changing proxy configuration I suggest Consul with consul-template (and in the future I’ll provide working examples of integrating orchestrator & Consul).
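To illustrate the Consul suggestion, here is a minimal sketch of a failover hook that publishes the newly promoted master to Consul’s KV store, where consul-template watching the key can regenerate a proxy configuration. The KV path mysql/master/&lt;cluster&gt; is an illustrative convention of this sketch, not anything orchestrator or Consul mandates; the Consul address is assumed to be the default localhost:8500.

```python
# Minimal sketch of a failover hook that writes the newly promoted master into
# Consul's KV store (Consul HTTP API v1). consul-template watching this key can
# then rewrite the proxy configuration. The key layout is our own convention.
import urllib.request

CONSUL = "http://localhost:8500"

def kv_path(cluster_alias: str) -> str:
    """KV key under which the current master of a cluster is stored (illustrative)."""
    return f"mysql/master/{cluster_alias}"

def publish_master(cluster_alias: str, master_host: str, consul: str = CONSUL) -> bool:
    """PUT the new master hostname into Consul KV; Consul answers `true` on success."""
    req = urllib.request.Request(
        f"{consul}/v1/kv/{kv_path(cluster_alias)}",
        data=master_host.encode(),
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().strip() == b"true"
```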

By: Marko https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10968348 Tue, 08 Aug 2017 06:32:17 +0000
Are there any working examples of scripts which should bring up VIPs or change proxy configuration?

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10967458 Thu, 15 Dec 2016 11:21:34 +0000
@zdt I’m uncertain what you mean; to clarify, we run multiple orchestrator services in an HA setup, coordinating operations and managing many databases (the largest setup known to me runs thousands of databases).

Past contributions included a Vagrant config (https://github.com/github/orchestrator/blob/master/Vagrantfile); recent contributions include a Docker config.

I’d be happy if you could further clarify what you wish to achieve; preferably, please open an issue at https://github.com/github/orchestrator/issues

You are of course free to do anything you like under the terms of the license (Apache 2.0).

By: zdt https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10967457 Thu, 15 Dec 2016 08:50:03 +0000
I think Orchestrator is kind of limited, since it’s a Go application that runs against one database on one physical machine.

For DBaaS services, a drag & drop UI is user friendly; may I try to extract your front-end code to integrate it with OpenStack?

By: Yuval Menchik https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10966707 Tue, 12 Jul 2016 12:17:35 +0000
Thanks for your replies, Daniel and Shlomi. We will probably first migrate to 5.7 without using the “multi-source replication” feature. When we decide to use the feature, we might make some changes in order to keep using Orchestrator.
Either way, we will let you know.

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10966701 Mon, 11 Jul 2016 07:12:47 +0000
@Yuval,

^ what Daniel said; also, multi-source is not on my roadmap. I’ve discussed with Daniel & friends the implications of supporting multi-source. Some aspects would be easy wins; others would require quite a change to the codebase.

By: Daniël van Eeden https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10966700 Mon, 11 Jul 2016 06:42:17 +0000
We do use Orchestrator with MySQL 5.7. I had a brief look into what’s needed to support multi-source in Orchestrator; it needs quite a bit of work, but should be doable.

By: Yuval Menchik https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10966699 Mon, 11 Jul 2016 06:13:44 +0000
@shlomi – will Orchestrator work with MySQL 5.7 if we don’t use the “multi-source replication” feature?
Is there a plan to support the “multi-source replication” feature?

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10966026 Fri, 25 Mar 2016 06:01:34 +0000
@manjotsinghpercona, I do not dispute that this could be useful & valuable. However, it implies an agent-based solution as opposed to an agent-less one. This is of course doable, and there is, in fact, an orchestrator-agent. It’s a different level of complexity.

BTW, another solution for you is to put only servers with log_slave_updates on the first replication tier, and have all the rest below them. This ensures you do not lose replicas, because orchestrator can then rearrange the promoted servers any way it likes.

By: manjotsinghpercona https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10966025 Thu, 24 Mar 2016 21:25:21 +0000
@Shlomi

> Does the Orchestrator get the old master’s Binlog and merge slave’s Binlog.

I think this would be valuable, especially in environments where there is not the ability or will to go semi-sync.

By: Daniël van Eeden https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10965989 Sat, 12 Mar 2016 14:58:06 +0000
Shlomi, there are some upsert options in SQLite: https://stackoverflow.com/questions/418898/sqlite-upsert-not-insert-or-replace

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10965981 Thu, 10 Mar 2016 11:35:44 +0000
@Matthew:
– Remove the SPOF how? Via ActorDB? rqlite? I’m not familiar enough with these. Are they stable?
Is there a way for multiple orchestrator services to communicate with a single SQLite server, or otherwise to reliably read/write to a consensus-based, synchronous group of SQLite databases?

– Otherwise there’s mostly the small-ish-and-yet-blocking limitation of no INSERT ... ON DUPLICATE KEY UPDATE in SQLite. At least, my limited knowledge of SQLite suggests so. Possibly there’s an easy way around this.
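As it happens, SQLite later gained a native upsert: version 3.24.0 (2018) added INSERT ... ON CONFLICT ... DO UPDATE, which covers the same use case as MySQL’s INSERT ... ON DUPLICATE KEY UPDATE. A minimal sketch using Python’s bundled sqlite3 module; the `instance` table here is illustrative, not orchestrator’s actual schema:

```python
# SQLite 3.24.0+ supports a native upsert via INSERT ... ON CONFLICT ... DO UPDATE,
# the moral equivalent of MySQL's INSERT ... ON DUPLICATE KEY UPDATE.
# The `instance` table is illustrative, not orchestrator's actual schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instance (host TEXT PRIMARY KEY, last_seen INTEGER)")

def upsert(host: str, ts: int) -> None:
    conn.execute(
        "INSERT INTO instance (host, last_seen) VALUES (?, ?) "
        "ON CONFLICT(host) DO UPDATE SET last_seen = excluded.last_seen",
        (host, ts),
    )

upsert("db1:3306", 100)
upsert("db1:3306", 200)  # conflicts on `host`, so last_seen is updated in place
rows = conn.execute("SELECT host, last_seen FROM instance").fetchall()
# rows == [("db1:3306", 200)] -- one row, updated, not duplicated
```

On older SQLite versions, the usual emulation was INSERT OR IGNORE followed by UPDATE, or INSERT OR REPLACE (which, unlike a true upsert, deletes and re-inserts the row).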

By: Matthew Boehm https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10965977 Wed, 09 Mar 2016 15:26:41 +0000
@shlomi Couldn’t you use SQLite to handle the backend? That could reduce setup complexity and remove the SPOF, no?

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10965974 Wed, 09 Mar 2016 09:41:58 +0000
@nnn

> Does the Orchestrator get the old master’s Binlog and merge slave’s Binlog.

It does not. It runs agent-free and thus does not merge master binlog entries that were not fetched by replicas. My inclination is not to pursue this, as semi-sync is an available mechanism that fills this need. A nice observation is that the original author of MHA uses (a variant of) semi-sync to ensure lossless failover at Facebook. Orchestrator does not aim in particular at lossless failover, but with 5.7 semi-sync you should get just that.
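For reference, a semi-sync setup oriented at lossless failover on MySQL 5.7 looks roughly like the following. This is a hedged sketch, not orchestrator configuration: plugin file names vary by platform (.so shown here), and in production these settings usually go into my.cnf rather than being set at runtime.

```sql
-- Hedged sketch: MySQL 5.7 semi-sync setup oriented at lossless failover.

-- On the master:
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = ON;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms before degrading to async
-- AFTER_SYNC (the 5.7 default) is what makes the failover lossless:
SET GLOBAL rpl_semi_sync_master_wait_point = AFTER_SYNC;

-- On each replica:
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = ON;
```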

Work on semi-sync support in orchestrator is expected; a contributor is working on it at this time.

By: Shlomi Noach https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10965973 Wed, 09 Mar 2016 09:35:15 +0000
Hi! orchestrator dev and maintainer here. Thank you for researching orchestrator and for this writeup! A few comments:

> At this moment, it’s required to have a MySQL backend and there is no clear/tested support for having this in high availability (HA) as well. This might change in the future.

You have a few options. To begin with, please note that the orchestrator backend database is likely to be very small, less than 1 GB in size for most installations. Binary logs are going to consume more space than the data.
– Set up Galera / XtraDB Cluster as the backend
– Use NDB Cluster as the backend (right now orchestrator strictly creates tables as InnoDB)
I have not experimented with either of the above options.
– I’m working to support (yet to be published & documented) active-active master-master statement-based replication. At this time orchestrator is known not to cause collisions, and to gracefully overwrite and converge data.
Right now I’m already using an active-active (writable-writable) setup under HAProxy with the first load-balancing algorithm. The remaining concern is the case of a backend database failover occurring while a crash recovery is being performed on another cluster, and I’m working to make that failover graceful as well. Semi-sync is likely to play a role here.
Anything failing while not performing a failover is good to go.
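The HAProxy side of that active-active setup can be sketched as follows; hostnames are placeholders, and health checks should be adapted to your environment. The key piece is balance first, which sends all connections to the first available server in the list:

```
# Hedged sketch: HAProxy fronting an active-active orchestrator backend pair.
# "balance first" directs all traffic to the first usable server in the list,
# so both backends stay writable but only one receives traffic at a time.
# Hostnames are placeholders; adapt the health checks to your environment.
backend orchestrator_mysql
    mode tcp
    balance first
    server db1 db1.example.com:3306 check
    server db2 db2.example.com:3306 check
```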

> One example of what Orchestrator can do is promote a slave if a master is down. It will choose the most up to date slave to be promoted.

It is more than that, actually. You can mark designated servers as “candidates”, and orchestrator will prefer to promote them on a best-effort basis.
To that effect, even if such a candidate server is not the most up-to-date one, orchestrator will first promote a fresher replica, converge your candidate instance onto it, and then promote the candidate on top.
There is further ongoing work on additional considerations, such as conflicting configurations and versions. The most up-to-date server is not always the one you actually want to promote.
Again, semi-sync can play a major role in this.

AgentAutoDiscover – this actually has to do with orchestrator-agent, which is an altogether different beast, unrelated to auto-discovering your topology. You are likely to keep this variable as false.

> If semi-synchronous replication needs to be used to avoid data loss in case of master failure, this has to be manually added to the hooks as well

There is a contributor working on that right now.

> In order to integrate this in your HA architecture or include in your fail-over processes you still need to manage many aspects manually, which can all be done by using the different hooks available in Orchestrator

Orchestrator tries to be a general-purpose tool which you should be comfortable deploying regardless of your network/configuration/topology setup. However, the pain of “the final steps” of master promotion is shared by everyone. We’re working to suggest a complete failover mechanism, based on orchestrator and other common open source software, that would be generally applicable to many.

> The way we understand it right now, one active Orchestrator node will make the decision if a node is down or not. It does check a broken node’s slaves replication state to determine if Orchestrator isn’t the only one losing connectivity (in which it should just do nothing with the production servers).

Correct. Orchestrator analyses the topology as a whole and expects agreement from all servers involved, to make sure it is not the only one whose view is wrong. Read more at: http://code.openark.org/blog/mysql/what-makes-a-mysql-server-failurerecovery-case
The case where an orchestrator node is completely network-partitioned is being examined, thanks to feedback from @gryp.

Thank you for crediting Outbrain, Booking.com and GitHub, who all deserve deep respect for supporting open source in general and the open development of this tool in particular. The people at those companies (DBAs, sysadmins, devs) have all contributed important feedback, and I would like to recognize them for that.
Lastly, the project is open to contributions and there is already a group of awesome contributors, so if you feel like contributing, the door is open!

By: to nnn https://www.percona.com/blog/orchestrator-mysql-replication-topology-manager/#comment-10965971 Wed, 09 Mar 2016 03:59:23 +0000
Does Orchestrator get the old master’s binlog and merge it with the slave’s binlog?
Is it like MHA?
