MySQL Cloning: more thoughts

I posted a few days ago some initial thoughts on the the MySQL native cloning functionality. Overall this looks good and I need to spend time to test further.

I’m here in San Francisco ahead of Oracle Open World which starts today. As is usual with trips like this jet lag wakes you up rather early. So I thought I would add some further thoughts on the topic which I think may also be of interest.

Version compatibility: The current clone process requires the source and destination to be identical. That is quite a restriction especially as since early 8.0 there’s a way for MySQL to auto-upgrade from 5.7 on startup. This new upgrade process works very well and makes instance management easier. Thanks Oracle! 8.0.17 is the current GA release as I write and MySQL 8.0.18 will come out soon. When 8.0.18 is released I would really like to be able to install 8.0.18 binaries and clone from 8.0.17, letting 8.0.18 do any upgrade tasks that are needed. Clearly there may be some extra restrictions, but if the source knows how to clone why shouldn’t I be able to do this? Reason: simplifies server management. No need to upgrade the source box first. I can then safely test the new version much more easily and quickly.

Support for other engines: Unless I’m mistaken the current process works for a system running InnoDB only. Consider making it possible to clone when other engines are involved. Again I expect there may be some restrictions (transactional engine? requirement to take snapshots? etc) but the idea I have here is this should be a generic mechanism for cloning and it should not be tied to InnoDB only.

Cloning of multiple instances at once: Is this possible? The documentation is not clear. I suspect that right now only a single instance can be copied at once. That may be the case but be explicit. Why? Initial testing of the cloning process make it look quite fast. I need to do further testing but have seen a 3x speed improvement vs traditional other methods I’ve been using up until now. This looks to be related to dong parallel copies of data between the servers but the end result is faster copying. That can lead me to potentially seeing a clone service, that looks like a MySQL server but behaves the same, and is able to service many clone requests in parallel. Clearly this is much to early to consider too seriously, but I know in systems I manage often multiple copies are being cloned at once, to setup a new cluster or to increase cluster capacity. Doing a single clone is good. Having a way to do multiple clones may be interesting or I might want to build something which provides such a service.

Configuration: 8.0 provides persistent configuration. I have not used this much as most of the systems I manage still run 5.7 or at least the master is 5.7 even if slaves run 8.0. There’s some confusion about the best way to configure MySQL in 8.0: persistent settings are encouraged and in many ways I see us moving to having most settings stored here with a few initial settings staying in the my.cnf file. One thing that I considered if you use persistent settings is that they will also be copied to the clone target server[*]. Some settings which perhaps are “environment specific” such as the buffer pool size, max_connections settings, io threads to use often depend on the specific location of the server. So copying these values to a new server if there were stored as persistent settings will lead to the server to pick them up and might prevent instance startup as they are “wrong” for the new destination host. Might make sense to have a filter list of settings you do not want to copy over to prevent this? This is something I have not actually checked, but potentially could be an issue especially if cloning a larger instance to a smaller test server with less RAM. We will see.

Update 08/Oct/2019: Oracle confirms that persistent settings are not copied to the target when cloning, meaning they need to be configured again on the new instance. I think that an option to copy them over, an option to filter out certain settings, and an option to provide replacement values should all be part of the clone process to make cloning smoother.

clone_buffer_size: this new setting seems fine and the default setting is quite small (4MB). We also have max_allowed_packet. I suspect that a higher value of clone_buffer_size than max_allowed_packet might not work. If that’s the case it should be documented. Indeed a quick test shows I get the following error message: ERROR 1153 (08S01) at line 1: Got a packet bigger than ‘max_allowed_packet’ bytes, so this needs fixing, in documentation but also adding a check to prevent me from setting clone_buffer_size to a value close to or greater than max_allowed_packet.

Clone protocol specification: This new process gets triggered by a simple command CLONE INSTANCE FROM which initiates an interaction between the source server and the server which needs to be cloned. There is logging on both sides of the connection but exactly what happens? To some extent this does not matter too much, but the exact behaviour between the servers is not documented in any way which ensures compatibility with later changes or potentially enhancements from others.

I would like to see a clone instance protocol specification that describes the interaction over the wire between both servers:

  • That would state the steps that are needed to perform the action
  • That would state any specific restrictions on what can be done now which needs to be taken into account to ensure the process works properly.
  • Would provide a way to enhance the process should that be necessary in the future in a backwards compatible way.

Why is this needed? Enhancements to the process may make incompatible changes in behaviour and in the past I have seen this can make it very hard to figure out what they are as the information is in the code and implicit rather than explicit. From a coder’s point of view having a specification to follow is much better than trying to reverse engineer existing code. So while I failed to convince Oracle to document more formally the MySQL X protocol perhaps something which is new and has a much tighter scope might be possible? Who knows.

I look forward to getting a chance to talk with the engineers about these topics and several more and to see the presentations in the MySQL track here at Open World. Maybe see you there.

Published by

Simon J Mudd

Born in England, I now live in Spain, but spent a few years living in the Netherlands. I previously worked in banking (financial markets) both in IT and as a broker, but IT has always had a stronger influence. Now working at booking.com as a Senior Database Administrator. Other interests include photography, and travel. Simon is married, with two children and lives in Madrid.

2 thoughts on “MySQL Cloning: more thoughts”

    1. Hi Geir, thanks for responding so quickly.

      To be honest what I was really thinking about was the over the wire protocol, part of which you document, but something which explains the behaviour, under both valid input and input which is not as expected. I’ve looked at the link you pasted in the past and think it does not quite do this. So think of the RFCs for smtp, http or other “internet protocols”. There is no mention of the software that is at each end just the behaviour between both ends and the format of the data sent between them. This is a subtle difference but one that ensures a clear definition of the behaviour. Often there is a prologue that explains intentions, what MUST or MUST not be done and also what MAY be done and this extra context helps a lot for someone intending to implement the protocol. Later extensions or changes to the protocol then have to be made in a way which is usually compatible and also documented which avoids developers having to dive deep into the source code trying to understand why something has changed and often avoiding unexpected breakages.

      For the clone process I’d like to see the same thing. Document the over the wire formats, the behaviour and flow required to complete the clone but also document behaviour on both valid and invalid inputs.

Leave a Reply