MySQL Cloning: more thoughts

I posted a few days ago some initial thoughts on the the MySQL native cloning functionality. Overall this looks good and I need to spend time to test further.

I’m here in San Francisco ahead of Oracle Open World which starts today. As is usual with trips like this jet lag wakes you up rather early. So I thought I would add some further thoughts on the topic which I think may also be of interest.

Version compatibility: The current clone process requires the source and destination to be identical. That is quite a restriction especially as since early 8.0 there’s a way for MySQL to auto-upgrade from 5.7 on startup. This new upgrade process works very well and makes instance management easier. Thanks Oracle! 8.0.17 is the current GA release as I write and MySQL 8.0.18 will come out soon. When 8.0.18 is released I would really like to be able to install 8.0.18 binaries and clone from 8.0.17, letting 8.0.18 do any upgrade tasks that are needed. Clearly there may be some extra restrictions, but if the source knows how to clone why shouldn’t I be able to do this? Reason: simplifies server management. No need to upgrade the source box first. I can then safely test the new version much more easily and quickly.

Support for other engines: Unless I’m mistaken the current process works for a system running InnoDB only. Consider making it possible to clone when other engines are involved. Again I expect there may be some restrictions (transactional engine? requirement to take snapshots? etc) but the idea I have here is this should be a generic mechanism for cloning and it should not be tied to InnoDB only.

Cloning of multiple instances at once: Is this possible? The documentation is not clear. I suspect that right now only a single instance can be copied at once. That may be the case but be explicit. Why? Initial testing of the cloning process make it look quite fast. I need to do further testing but have seen a 3x speed improvement vs traditional other methods I’ve been using up until now. This looks to be related to dong parallel copies of data between the servers but the end result is faster copying. That can lead me to potentially seeing a clone service, that looks like a MySQL server but behaves the same, and is able to service many clone requests in parallel. Clearly this is much to early to consider too seriously, but I know in systems I manage often multiple copies are being cloned at once, to setup a new cluster or to increase cluster capacity. Doing a single clone is good. Having a way to do multiple clones may be interesting or I might want to build something which provides such a service.

Configuration: 8.0 provides persistent configuration. I have not used this much as most of the systems I manage still run 5.7 or at least the master is 5.7 even if slaves run 8.0. There’s some confusion about the best way to configure MySQL in 8.0: persistent settings are encouraged and in many ways I see us moving to having most settings stored here with a few initial settings staying in the my.cnf file. One thing that I considered if you use persistent settings is that they will also be copied to the clone target server[*]. Some settings which perhaps are “environment specific” such as the buffer pool size, max_connections settings, io threads to use often depend on the specific location of the server. So copying these values to a new server if there were stored as persistent settings will lead to the server to pick them up and might prevent instance startup as they are “wrong” for the new destination host. Might make sense to have a filter list of settings you do not want to copy over to prevent this? This is something I have not actually checked, but potentially could be an issue especially if cloning a larger instance to a smaller test server with less RAM. We will see.

Update 08/Oct/2019: Oracle confirms that persistent settings are not copied to the target when cloning, meaning they need to be configured again on the new instance. I think that an option to copy them over, an option to filter out certain settings, and an option to provide replacement values should all be part of the clone process to make cloning smoother.

clone_buffer_size: this new setting seems fine and the default setting is quite small (4MB). We also have max_allowed_packet. I suspect that a higher value of clone_buffer_size than max_allowed_packet might not work. If that’s the case it should be documented. Indeed a quick test shows I get the following error message: ERROR 1153 (08S01) at line 1: Got a packet bigger than ‘max_allowed_packet’ bytes, so this needs fixing, in documentation but also adding a check to prevent me from setting clone_buffer_size to a value close to or greater than max_allowed_packet.

Clone protocol specification: This new process gets triggered by a simple command CLONE INSTANCE FROM which initiates an interaction between the source server and the server which needs to be cloned. There is logging on both sides of the connection but exactly what happens? To some extent this does not matter too much, but the exact behaviour between the servers is not documented in any way which ensures compatibility with later changes or potentially enhancements from others.

I would like to see a clone instance protocol specification that describes the interaction over the wire between both servers:

  • That would state the steps that are needed to perform the action
  • That would state any specific restrictions on what can be done now which needs to be taken into account to ensure the process works properly.
  • Would provide a way to enhance the process should that be necessary in the future in a backwards compatible way.

Why is this needed? Enhancements to the process may make incompatible changes in behaviour and in the past I have seen this can make it very hard to figure out what they are as the information is in the code and implicit rather than explicit. From a coder’s point of view having a specification to follow is much better than trying to reverse engineer existing code. So while I failed to convince Oracle to document more formally the MySQL X protocol perhaps something which is new and has a much tighter scope might be possible? Who knows.

I look forward to getting a chance to talk with the engineers about these topics and several more and to see the presentations in the MySQL track here at Open World. Maybe see you there.

MySQL 8.0.17+: Cloning is now much easier

If you use replication with MySQL, if you need a backup, if you need a spare copy of a system for testing and for many other reasons you need a way to make a copy of your MySQL system. In the past you could make a copy in various ways:

  • using a cold file system copy with MySQL first shutdown
  • using MySQL Enterprise Backup, Xtrabackup from Percona or MariaBackup (an Xtrabackup fork). These work against an online running server. However, they required special binaries to run on the source in addition to the traditional mysqld.
  • copying the LUN to a new server

There are many ways to do this and they all work and we have been doing this for so long that it feels normal, but clearly a mysqld to mysqld copy using the normal mysql port and protocol just simplifies things considerably. That is what is available now in 8.0.17 and later with the new clone plugin.

So I wanted to see how easy it was to set this up and how it worked. One easy way for on your laptop testing is to use dbdeployer as that allows you to spin up one or more instances quickly and is great for all sorts of testing. I’d almost like to see this used for normal MySQL provisioning and management but that’s a topic for a different blog post.

So when I found out about the new cloning method I wrote a quick script to see what effort was involved in setting up and cloning between servers. Take a look at the script to see what’s needed to set things up, and at the sample output when this happens. Thanks go to Giuseppe Maxia for showing how to easily achieve such a setup with dbdeployer.

Summary: once configured you can clone a server with a single SQL command. That is certainly much simpler than the methods that have been available until now.

So are there any gotchas? I found a few things:

  • bug 96281 – Can’t initialize function ‘clone’; Plugin initialization function failed (broken if sql_require_primary_key = 1), appears to be resolved in 8.0.18
  • bug 96302 – support CIDR notation for clone-valid-donor-list (though I’m not convinced this setting is actually useful)
  • bug 96305 – make the clone port optional and default to 3306 (default mysql port)
  • bug 96837 – documentation is a bit confusing

None of these are critical. The main concern I see is the backup_lock that gets taken on the source and which prevents certain operations from continuing normally. If this is to be used to copy from a primary master I see that as problematic.

As for how fast this works compared to other methods I’ll leave that for a later post.