A Couple of MySQL 5.7 gotchas to be aware of

MySQL 5.7 GA was released a couple of months ago now with 5.7.9 and 5.7.10 has been published a few days ago.  So far initial testing of these versions looks pretty good and both versions have proved to be stable.

I have, however, been bitten by a couple of gotchas which if you are not aware of them may be a bit of a surprise. This post is to bring them to your attention.

New MySQL accounts expire by default after 360 days

This is as per documentation, so there is no bug here. MySQL 5.7 provides a new more secure environment. One of the changes is to add password expiry and the default behaviour is for passwords expire after 360 days.  This seems good, but you, perhaps like me, may not be accustomed to managing your passwords, checking for expiration and adjusting the MySQL user settings accordingly.  The default setting of default_password_lifetime is 360 days, so after upgrading a server to MySQL 5.7 from MySQL 5.6 this setting suddenly comes to life. The good thing is nothing happens immediately so you do not see the time bomb ticking away. I had have been testing the DMR versions of MySQL 5.7 earlier to the GA release and consequently using it for longer than 2 months.  Recently a couple of 5.7.9 servers which had been upgraded from 5.6 a year ago decided to block access to all applications at the same time.  The quick fix is simple: change the default setting to 0 (no expiry) and we have a configuration that behaves like MySQL 5.6 even if it less secure than the default MySQL 5.7 setup. We can then look at how to manage the MySQL accounts and take this new setting into account in a more secure manner.  If you are starting to use MySQL 5.7 and are not migrating from 5.6 then perhaps you’ll put in the right checks in place when you start, but those of us migrating from 5.6 can not push down grants with the new ALTER USER syntax until the 5.6 masters are upgraded so we need to pay more attention to this while in the progress of migration.

New range optimizer setting might cause unexpected table scans if not set properly

MySQL 5.7.9 GA added a new configuration variable: range_optimizer_max_mem_size, set by default to 1536000. The documentation does not say much about this new setting and seems quite harmless. “if … the optimizer estimates that the amount of memory needed for this method would exceed the limit, it abandons the plan and considers other plans.”  The range optimiser is used for point selects, primary key lookups and other similar queries.  What this setting does is after parsing a query look at the number of items which may be referenced in a WHERE clause and if the memory usage is too high fall back to a slower method.

Let’s put this into context. A query like SELECT some_columns FROM some_table WHERE id IN ( 1, 2, 3, ... big list of ids ... 99998, 99999 ) will trigger this limit being reached for a large enough range of ids. DELETE FROM some_table WHERE (pk1 = 1 AND pk2 = 11) OR (pk1 = 2 AND pk2 = 12) .. OR .. (pk1 = 111 AND pk2 = 121) /* pk1 and pk2 form a [primary] key */ would also potentially trigger this.

The questions that come out of this are (a) “How to figure out the point at which this change happens?”, and (b) “What happens at this point?”

The answer to (b) is simple: MySQL falls back to doing a table scan (per item). The answer to (a) is not so clear. Bug#78752 is a feature request to make this clearer, and further investigation pointed to MySQL 5.6’s previous behaviour where the limit was defined in terms of a fixed number of hard-coded “items” (16,000), whereas 5.7’s new behaviour is in terms of memory usage.  The relationship between the two settings is not very clear and initial guestimates on systems I saw issues with seems to indicate that maybe 4kB per item is used by MySQL 5.7 at the moment. The point is that what worked quickly as point selects on 5.6 may fall back to table scans per item in 5.7 if the number of entries is too high, and this would require a reconfiguration (it is dynamic) of the configuration setting mentioned. The bad behaviour may also only happen depending on the size of the query.

Many people may wonder why anyone would be mad enough to use a SELECT or DELETE statement with several thousand entries in an IN () clause, but this comes from having split data in a single server into two and making the application find a list of ids from one server using some criteria and then using the ids obtained in a different one. I see that pattern used frequently and it is probably a common pattern on any system where data will no longer fit in a single server.

The problem with this particular change in behaviour is that point selects are very fast and efficient in MySQL. People use them a lot. Table scans are of course really slow, so depending on the query in question performance can change from ms to minutes just because your query is a tiny bit bigger than the new threshold. In practice it looks like the old hard-coded limit and the new dynamic limit are at least an order of magnitude different in size so it is quite easy to trip up on good queries in 5.6 failing miserably in 5.7 without a configuration change. Again while migrating from MySQL 5.6 to 5.7 you may see this change bite you.

You may get caught by either of these issues. I got caught by both of them while testing 5.7 and while the solutions to resolve them are quite simple to fix they do require a configuration change to resolve the issue. I hope this post at least makes you recognise them and know where to poke so you can make your new 5.7 servers behave properly again.

Published by

Simon J Mudd

Born in England, I now live in Spain, but spent a few years living in the Netherlands. I previously worked in banking (financial markets) both in IT and as a broker, but IT has always had a stronger influence. Now working at booking.com as a Senior Database Administrator. Other interests include photography, and travel. Simon is married, with two children and lives in Madrid.

7 thoughts on “A Couple of MySQL 5.7 gotchas to be aware of”

  1. Thank you for pointing out these two gotchas Simon! The first one is about finding the right balance between ease of use (or “no surprises”) and secure by default (and we prioritized security). The second one is about improved resource control and finding the right thresholds. We will take your findings and comments here into account when we continue to improve both on security and on resource control. And thank you for qualifying MySQL 5.7 in a tough production environment!

  2. Thanks for the post. We are working on the second problem and will
    provide a solution soon.

    The reason for seeing the change in plans from range scan to table
    scan for certain queries is because of the new cap on memory
    utilization of range optimizer in 5.7 which you have explained above.

    In 5.6, range optimizer checks for excessive memory consumption
    in a minimalistic way by checking on the MAX_SEL_ARGS which was hard
    coded.
    This leads to two problems,
    1.The above limit was checked only for queries that were written in a
    certain way.
    For ex: In the following case,

    SELECT …
    FROM t1
    WHERE
    (!(f1 BETWEEN val1 and val2 ) )

    range optimizer utilizes excessive memory because the hard coded
    limit was not checked for this case. This meant existing check was
    not sufficient.

    2. Second one being, if a where condition is written in a different
    way like in the following example

    select * from t where c1 in
    (1,2,3,4,….a 100 values) and c2 in
    (1,2,3,4,5, …..a 200 values);

    range optimizer might consume memory, but it would have generated a
    plan which would have been much more efficient than the current plan
    which is a table scan.

    In 5.7, both these problem are addressed with the new configurable
    limit on memory for range optimizer. However, we see that this has
    become more restrictive than we had anticipated.
    Reason being, allocated objects in case of “OR” condition were larger
    than what we had expected. This resulted in allocation for “AND”
    conditions and “OR” conditions differ by more memory which in turn
    makes the current default fail when too many “OR” conditions are
    present.

    However this means that, we have to improve the memory utilization of
    range optimizer and define a more suitable default value. We are
    currently working on improving memory consumption in case of “OR”
    conditions. With this improvement, we should be able to make the
    current default for range_optimizer_max_mem_size work better. Also,
    this can help us in coming up with a more suitable default too.

    Hoping that this will address your problem.

    1. Thanks for clarifying the cause of what I was seeing in detail. I certainly look forward to seeing any improvements especially if they reduce memory usage and also mainly avoid the need for extra 5.7 specific configuration settings.

  3. Small correction in the query used to explain excessive memory usage:

    SELECT …
    FROM t1
    WHERE
    (!(f1 BETWEEN val1 and val2 ) …. repeated over 1000 times)

Leave a Reply