The differences between IDEMPOTENT and AUTO-REPAIR mode

I posted recently Lossless RBR for MySQL 8.0 about a concern I have about moving to minimal RBR in MySQL 8.0.  This seems to be the direction that Oracle is considering, but I am not sure it is a good idea as a default setting.

I talked about a hypothetical new replication mode lossless RBR and also about recovery after a crash where perhaps the data on the slave may get out of sync with the master. Under normal circumstances this should not happen but in the real world sometimes it does.

Note: I’m talking about an environment that does not use GTID.  GTID is good but may have its own issues and it’s probably best to leave those discussions to another post.

So let us talk about the difference between IDEMPOTENT mode (slave_exec_mode=IDEMPOTENT) and what I’ll call AUTO-REPAIR mode, mentioned in feature request bug#54250 to Oracle in 2010.  By default the DBA wants to avoid any data corruption, so this should be the default behaviour. Thus I’d prefer auto-repair mode to be off by default, stopping replication if any inconsistencies are found. I could enable it if I see such an issue as it should help me recover the state of the database without adding further “corruption” to the slave.

If I’m confident that this procedure works fine and I’m monitoring the counters mentioned below then it may be fine to leave enabled all the time.

A slave fails, it may crash and it recovers. It’s likely that the replication position it “remembers” is behind the actual state in the database.

If we use full RBR (default setting) in these circumstances then we may get in a set of changes which the SQL thread tries to apply.

They’ll be in the form of:

before row image / after row image

before row image / after row image

where each row image is the set of column values prior to and after the row changes.  Traditionally we use the abbreviations BI and AI for this.

Currently the SQL thread will start up and look for the first row to change and once it has found it change it.  If the exact matching conditions it needs can not be found then an error will be generated and replication stops.

IDEMPOTENT mode attempts to address this and tries to “continue whatever the cost”. To be honest I’m not exactly sure what it does, but it’s clear that it will either do nothing or perhaps it might try to find the row by primary key and update that row. I’d expect it probably does nothing.

See a comment later on.  So I did go and check and the comments in slave_exec_mode say that it suppresses duplicate-key and no-key-found errors. There is no mention of updates where the full AI is unavailable. (e.g. when using minimal RBR)

It also looks like it does not “repair” the issue, but it simply ignores it. The documentation is not 100% clear to me.

I made a comment about different options for AUTO-REPAIR mode and when it can work and when it can not. In FULL RBR mode it should always be able to do something. In MINIMAL RBR mode there will be cases when it can not. Let’s see the case of FULL RBR mode:

  1. For an UPDATE when the requested row can not be found:
    • auto-repair mode would INSERT the row. You have a full AI so you can do this safely.
    • A counter should be updated to record this action.
  2. For a DELETE row operation when the row can not be found:
    • auto-repair mode would ignore the error and given the row does not exist anyway the effect of the DELETE has already been accomplished.
    • A counter should be updated to record this action
  3. For an INSERT row operation when the ROW already exists.
    • Duplicate key insert) This is what generally breaks replication.
    • auto-repair mode would treat this as an UPDATE operation (based on the primary key in the table) and ensure the row is changed to have the same primary key and the columns of the AI.
    • Again a counter should be updated to record this action.

In FULL RBR mode these 3 actions should allow replication to continue. The database is no more corrupt than it was before. In fact it’s in a state that’s somewhat better.

In many cases other row events will proceed as expected without issue:  INSERTS will happen, UPDATES and DELETEs to existing rows will work as the row is found, and things will proceed as normal.

So should we get in a situation like this we can check the 3 counters and this gives us a clue as to the number of “repair actions” which MySQL has had to execute.  It also gives us an idea of how inconsistent the slave seems to be, though those inconsistencies should now have been removed.

As I said I can’t remember exactly what IDEMPOTENT mode does in these 3 circumstances.  It may do something similar to my AUTO-REPAIR mode or it may just skip the errors.

Why don’t I know?  Well I’m currently in a plane and the mysql documentation is not provided with my mysql server software and I’m not online so I can’t check.  I used to find the info file or a pdf of the manual quite helpful in such situations and would love to see it put back again so I don’t need to speculate about what the documentation says.

Yes, I could update this text when I’m back online, but I think I’ll make the point and leave this paragraph here.

So with FULL RBR the situation seems to me to be clear. IDEMPOTENT mode may not do the same thing as the AUTO-REPAIR mode, and whether it does or not there are no counters to see the effect it produces on my server. So I’m blind. I do not like that.

Let’s change the topic slightly and now switch to MINIMAL RBR and do the same thing. In theory now IDEMPOTENT mode and AUTO-REPAIR mode may seem to be the same (assuming IDEMPOTENT mode changes what it can) but that’s also not entirely true.

With minimal RBR mode we get a set of  primary key plus changed columns for each row that changes. For INSERTS we get the full ROW and for DELETES we only need the primary key. That should be enough.

What changes here are the UPDATES: as if we don’t get the full row image we can not know what was in the table before. We only have information on the new data.  So other columns which are not mentioned are unknown to us. If we are UPDATING a row and we can not find it, an INSERT is not possible as we do not have enough information to complete the columns that are unknown to us. So replication MUST stop if we want to avoid corruption.

Additionally, with minimal RBR UPDATES even if you find the ROW to UPDATE you can not be sure you are doing the right thing as you have no reference to the content or state of the before image. My thought here was that the ideal thing would be to send with each row a checksum of the row content on the master.  This would be “small” (so efficient) and could be checked against the row content on the slave prior to making the update.  If the values match we know the RBR UPDATE is working on expected data.  This makes a DBA feel more comfortable.

Table definitions on a master and its slaves are not always identical.  There are several reasons for this such as the fact that different (major) versions of MySQL are being used, or simply due to it being impossible to take downtime on the server some sort of out of band ALTER TABLE may have been run on the slave and that change is still pending on the master. The typical case here is adding new columns, or changing the type, width, character set or collation of an existing column. In these circumstances the binary image on the master and slave may well not be the same so the before row image “checksum” on the master would not be usable.  To detect such a situation it may be necessary to also send a table definition checksum with the row before image checksum, though this could be sent for each set of events on a table not each row. The combination of the two values should be enough to allow us to be ensure that minimal RBR changes can be validated even if we do not push down a full before image into the binlog stream. Again, if the definitions do not match it would seem sensible to update a counter to indicate such a situation.  We probably do not want to stop replication in this situation. Those who do not expect any sort of differences between master and slave may be paranoid enough to want to not continue, but I know for my usage I’d like to monitor changes to the counter but probably just continue.

Even my proposed LOSSLESS RBR would need this checksum to be safe as it would not contain the full before image but only the PK + all columns for an UPDATE operation, so potentially “slave drift” might happen and go undetected.

I can see therefore that optionally being able to add to minimal- and lossless-RBR such checksums would be a good way to ensure that replication works safely and pushes out changes to the slaves which are expected, and catches unexpected inconsistencies. 

The additional counters mentioned would help “catch” the number of inconsistencies that take place and they would be good even with the current replication setup when IDEMPOTENT mode is used. This lack of visibility of errors should make most DBAs rather sleepless, but I suspect there are those that are not aware and those that just have to live without that knowledge. Having these extra counters would help us see when things are not the same and allow us to take any necessary action based on that information should it be necessary.

I hope with this post I have clarify why IDEMPOTENT mode is not the same as my suggested AUTO-REPAIR mode and when it’s safe to continue replicating and when it is not under a variety of different conditions which would normally make RBR stop.

It also seems clear to me that MINIMAL RBR would benefit from some additional checksums to allow the DBA to be more confident that the changes being made on the slave match those made on the master.  This is especially so if using minimal RBR.

As always comments and feedback on this post is most welcome.

Lossless RBR for MySQL 8.0?

Lossless RBR

TL/DR: There’s been talk of moving the next release of MySQL to minimal RBR: I’d like to suggest an alternative: lossless RBR

For MySQL 5.8 there was talk / suggestions about moving to minimal RBR as the default configuration (http://mysqlserverteam.com/planning-the-defaults-for-mysql-5-8/).  I’m not comfortable with this because it means that by default you do not have the full data in the binlog stream of the changes to a table. 

The use of minimal RBR is an optimisation, it is done deliberately in busy environments where the size of written data is large and it is not convenient/possible to keep all data.  Additionally the performance of full RBR can in some cases be an issue especially for “fat/wide” tables etc. It is true that minimal RBR helps considerably here.  There are several issues it resolves:

  • reduces network bandwidth between master and slaves
  • reduces disk i/o reading / writing the binlog files
  • reduces disk storage occupied by said binlogs

There was also a comment about enabling IDEMPOTENT mode by default on a slave.

This is a mode which basically ignores most errors. That does not seem wise.  As a DBA by default you want the server to not lose or munge data. There are times when you may decide to forgo that requirement, but the DBA should decide and the default behaviour should be safe.

Thus the idea of lossless RBR came to mind. What would this involve compared to the current modes of FULL or MINIMAL RBR?

  1. INSERTs are unchanged (as now): you get the full row
  2. DELETEs are as per minimal RBR: The primary key is sent and the matching row is removed.  IFF on a slave the pks differed and more than one row would be deleted this should be treated as an error.
  3. UPDATEs: Send the pk + full new image, thus ensuring that all new data is sent. This reduces the event size by ~ 1/2 so would be especially good for fat tables and tables where large updates go through. If the PK columns do not change then it should be sufficient to send the new row image and pk column names etc

Related to this behaviour it would be most convenient to implement an existing FR (bug#69223) to require that table definitions via CREATE/ALTER TABLE MUST HAVE a PK.  I’ve seen several issues where a developer has not thought a primary key was important (they often forget replication) and this would trigger problems. Inserts would work fine but any updates that happened afterwards would trigger a problem, not on the master but on all slaves.  I think that by default this behaviour should be enabled.  There may be situations where it needs to be disabled but they are likely to be rather limited.

This new mode LOSSLESS RBR is clearly a mix between full and minimal and it ensures that data pushed into a slave will always be complete.  think that is a better target to aim for with MySQL 8.0 than the suggested MINIMAL RBR.

You may know that I do not like IDEMPOTENT mode much. I have created several FRs to add counters to “lost/ignored events” so we can see the impact of using this mode (usually it is used after an outage to keep replication going even if this may mean some data is not being updated correctly.  Usually this is better than having a slave with 100% stale data.)

I would really also like to see you adding a “safe recovery mode” where statements which won’t damage the slave more are accepted.

The examples are in bug#54250 but basically include:

  1. INSERTs with duplicate key: convert to UPDATE
  2. DELETEs with row not found: ignore as the data has gone anyway
  3. UPDATEs with non-matching PK: convert to INSERT
  4. UPDATEs with non-matching columns: update what you can.  (This is likely to happen with full RBR as minimal RBR should never generate this type of error.)

[ For each of these 4 states: add counters to indicate how many times this has happened, so we can see if we’re “correcting” or “fixing” errors or not. ]

You’ll notice that lossless RBR would work perfectly with this even after a crash as you’ll have all the data you need, so you’ll never make the state of the database any worse than it was before.

I would like to see the FRs I’ve made regarding improving RBR being implemented as whether lossless RBR becomes a new replication mode or not they would help DBAs both diagnose and fix problems more easily than now.

It is probably also worth noting that FULL RBR is actually useful for a variety of scenarios, for example for exporting changes to other non-MySQL systems.  We miss for this the definition of tables, and current systems need to extract that out of band which is a major nuisance.  Exporting to external systems may not have happened that frequently in the past, but as larger companies use MySQL this becomes more and more important. For this type of system FULL RBR is probably needed even though it may not be used on the upstream master. I would expect that in most cases LOSSLESS RBR would also serve this purpose pretty well and reduce the replication footprint. The only environment that may need traditional FULL RBR is where auditing of ALL changes in a table is needed and thus both the before and after images are required.

Is it worth adding yet another replication mode to MySQL?  That is a good question and it may not be worth the effort.  However the differences between FULL and LOSSLESS RBR should be minimal: the only difference is the amount of data that’s pushed into the binlog so the scope of changes etc should be more limited.  Improving replication performance seems to be a good goal: we all need that, but over-optimising should be considered more carefully.  I think we are still missing the monitoring metrics which help us diagnose and be better aware of issues in RBR and the “tools” or improvements which would make recovery easier. Unless you live in the real world of systems which break it is hard to understand why these “obscure” edge cases matter that much.  The responses like: “just restart mysqld” may make sense in some environments, but really are not realistic in systems that run 24x7x365. With replication it is similar: stopped replication is worse than replication that is working, but where data may not be complete.  Depending on the situation you may tolerate that “incomplete data” (temporarily) while gaining the changes which your apps need to see.  However, it is vitally important to be able to measure the “damage” and that is why counters like the ones indicated above are so vital. It allows you to distinguish 1 broken row, or 1,000,000 and decide on how to prioritise and deal with that as appropriate.

While I guess the MySQL replication developers are busy I would certainly be interested in hearing their thoughts on this possible new replication mode and would definitely prefer it over the suggested minimal RBR as a default for 8.0.  Both FULL and MINIMAL RBR have their place, but perhaps LOSSLESS would be a better default?  What do you think?