MySQL 5.6 is out, so what is next?

MySQL 5.6 is out now and that is good news. I have already been using pre-GA versions on some production servers with good success and now that the few wrinkles I have seen have been ironed out, I am sure a lot of people will find the new features in 5.6 well worth waiting for.

However, that does lead to the question of: “what next?”

I still have several things that I would like to see in MySQL in no specific order of preference such as:

  • Session transaction state exposed as a variable to allow you to determine if you have started a transaction or not, and thus allowing you to use BEGIN WORK, ROLLBACK or COMMIT as needed.  This information is available via the C API I believe but not via SQL.  Something like @@in_transaction = 1.  Makes modular programming easier.
  • Table space management. The default behaviour now in 5.6 is to move to innodb_file_per_table = 1, but really on a large system I’d prefer to not have to manage each individual file, or partition, but blocks of files containing one or more tables: that is mange table spaces.  This potentially should help reduce the number of file handles that are needed open and allow the DBA to manage the “database” or “instance” in blocks which are of a manageable size to him/her.
  • Better configuration management. If I want to compare the running configuration of the variables in SHOW GLOBAL VARIABLES with the [mysqld] section in /etc/my.cnf then there are quite a lot of inconsistencies in variables/parameter naming, some parameters are not visible in SHOW GLOBAL VARIABLES and vice versa. After seeing that the default performance_schema settings in MySQL 5.6 did not match my needs, I have had to use quite a bit of magic to setup /etc/my.cnf with the correct parameters. If you add the way you have to configure performance_schema then checking for consistency and / or adjusting the running configuration to match /etc/my.cnf is more of a challenge than really necessary. So clean this up please and make life for us easier.
  • More dynamic configuration. Many configuration settings require restarting MySQL.  Now that mysql is no longer a toy and people depend on it 24×7, any downtime is costly and really unhelpful, especially if it’s not necessary.
    • Many replication configuration settings are still static.
    • Many innodb settings are also static (innodb_{read,write}_io_threads, back_log comes to mind), so having to restart mysqld is really not helpful, just to see if a change will improve performance or not.
  • Move the SHOW ENGINE INNODB STATUS output to a tabular form (probably in multiple tables in I_S) so that the information is easy to parse and thus people can better use the information.  Right now if you want to see which threads are busy doing I/O or want information on the actively running transactions, locks or whatever parsing the current output is a real pain. It’s true that much of this information may well be now in P_S but not everything is visible there, and the information that is missing should move to a place where it is easy to extract and use.
  • Better Memory management.  I have some servers with 96 GB of RAM and a certain size innodb buffer pool, and some of these when busy with heavy queries use up too much memory and start to swap, whilst others use the full buffer pool and have plenty of memory left over. The end result is the wrong amount of memory is being used. Workload changes the memory usage completely and the change can make a server swap or run out of memory.  The memory in the system is there to be used by the database yet MySQL does not manage this in any coherent way, or allow you to control it to prevent overuse.    Most other RDBMSes handle memory quite differently, pinning the configured size into RAM and only using that memory and nothing else. This appears to be more efficient and something similar in MySQL would make life a lot simpler.
  • Better binlog/relay log space management.
    • Several servers I manage write out a lot of binlogs in a day, so space is an issue.  Nevertheless, keeping as many binlogs as space allows is also important.  Also many of the applications are configured to large INSERTS, UPDATES including using ODKU (up to our configured max_packet_size of 16M) and whether using RBR or SBR the resulting output is likely to be compressible.  Given that compressing the whole bin log/ relay log files means you can’t jump to any location, it struck me that just compressing the individual events as they are written, or when they are being read might be possible and might be a way to save 2-3 x the binlog space usage. (See: Bug#46435)
    • Related to this point is the binlog retention period.  expire_logs_days, configured as a number of days sounds quite nice until you realise that you may fill the disk with 1 or 2 day’s worth of binlogs, so it would be nice to be able to configure a decimal value here which would make smaller periods possible if needed. That or have a new setting to configure the retention size rather than the period, a new variable such as: binlog_retention_size = 200G.
    • I have filled up many servers disks after cloning a slave as it catches up and pulls over old relay logs from the master, writing them locally before being able to apply them.  There is a configuration setting to limit this size but again it is static.

So these are just a few of the ideas I personally would like to see. I’ll see if I can update the post later providing some of the feature request bug reports as reference.

However, this is just part of the story. I have my ideas of what I would like to see in MySQL 5.7 and you have yours, but there does not appear to be a way currently with MySQL to voice interests to Oracle in a place where different people can be heard and it is possible to see what is most wanted.

  • http://bugs.mysql.com does allow you to add feature requests and indeed I have added many, but finding them or voting on them is not possible. Oracle’s official feature request mechanism makes the request go into a black hole where any progress is completely invisible to the requestor!
  • Oracle obviously has its own agenda and has the inconvenience of competitors who are in the same arena.  That does mean that Oracle, MPAB and Percona therefore have to do their own thing and try to guess what people want (though they’re obviously influenced by their main customers), but we do not see what appears until the new release comes out.  I think that can lead to rough edges as developers do not always realise the real world use cases and problems that the DBA faces on a day to day basis, and thus may focus on stuff which in the end may not be the most important to us.
  • Some Open Source projects have a bounty mechanism where people can offer to pay to get a feature implemented. The idea sounds quite good and if several people want the same thing, they may be able to push towards a common goal more quickly. I am, however, not sure how that might work in practice with something like MySQL, but the idea sounds interesting, if only to help better gauge where there is interest.

My way of using MySQL may vary significantly to yours. My problems may also be quite different, but I would imagine that there is a lot of common ground. So what would you like to see in MySQL 5.7?

In the meantime I guess we will have to wait and see what will come out in 5.7 and where things are going to go. There’s likely I guess to be a 18 month wait for that, so plenty of time to do lots of new and exciting things.  In the meantime MySQL 5.6 needs deploying on new servers once it has been fully tested and any issues resolved and that is quite a bit of work to keep me busy for a few days …

 

Update:

  • 2013-04-08 Adding some more I would likes:
    • A working I_S (mysql data dictionary) to avoid the workarounds needed on servers with hundreds of thousands of tables.
    • A way to compress/uncompress MyISAM tables from the mysql command line. This avoids having to work around locking, blocking and flushing issues if doing this directly from a shell. I’m using compressed MyISAM tables still as they’re much smaller than compressed InnoDB tables (and easier to move around too).

Tags: , , ,

8 Responses to “MySQL 5.6 is out, so what is next?”

  1. Making a strong case for features is step 1. You did a great job with that.

    I would be happy if tablespace management used one file per page size per schema. I want compression for binlogs and relaylog files. I prefer for it to be as online as possible. For my data the per-event compression rate is not good enough.

    Who is doing product management for MySQL now? They should contact you and continue to discuss your feature requests.

    • Simon J Mudd says:

      Hi Mark,

      Thanks for confirming that some of my ideas may be useful to others. While I can not easily test how compressing individual events would work (I still mainly use SBR) a random check of a random binlog does seem to imply that compression would help:

      [root@mymaster log]# time gzip < binlog.003473 > /tmp/binlog.003473.gz

      real 0m15.955s
      user 0m14.945s
      sys 0m0.449s
      [root@mymaster log]# ls -lh /tmp/binlog.003473.gz
      -rw-r–r– 1 root root 74M Apr 3 07:48 /tmp/binlog.003473.gz
      [root@mymaster log]# ls -lh binlog.003473
      -rw-rw—- 1 mysql mysql 1.1G Apr 2 18:12 binlog.003473
      [root@mymaster log]#

      That’s a > 10x compression factor. Even if I only get a 3 x increase that’s a huge win. Yes, this probably should be an optional feature as replication may add an extra delay for i/o to work, but there does look like a huge opportunity for space improvement here. Perhaps RBR compression would not work as well.

      I have talked with Oracle on a couple of occasions, but part of my post here is not about what _I_ want, but figuring out things which may be good for the MySQL community as a whole and allowing us to provide an opportunity to share those needs, and getting the most important needs addressed sooner. You have obviously had to take the time to work alone in getting your needed changes made, but at the cost of having to maintain that “forever”, or until the needed feature makes it into the mainline release. That is costly which is why it has been something that I have tried to avoid.

      The long MySQL release cycles also mean that from having a suggestion to maybe getting it implemented requires waiting perhaps 2 years. That is a long time to have to implement a workaround for a feature which may be useful, and potentially could be implemented quickly. I dislike incompatible changes in a major release so once a version is GA I too expect it to be stable and behaviour to not change in an unexpected way. That leads back to the original point of figuring out which new features make sense and getting as many of these implemented as possible.

  2. Andrew says:

    “Move the SHOW ENGINE INNODB STATUS output to a tabular form (probably in multiple tables in I_S) so that the information is easy to parse and thus people can better use the information.”

    There already is innodb_trx, innodb_lock_waits, innodb_locks, etc. that provide most of the information from show engine innodb status. I use innodb_trx with some frequency when tracking down idle transactions, lock waits, etc. without needing to parse show engine innodb status. These have been there since the InnoDB plugin in MySQL 5.1. MySQL 5.5 added quite a few additional views into the buffer pool, and 5.6 added views into the data dictionary.

    Granted, there’s still information that’s nowhere to be found outside of show engine innodb status, but it’s been some time since I’ve had to parse it out.

    • Simon J Mudd says:

      Hi Andrew.

      Information on i/o threads, and waiting requests for example needs to be parsed. That information shows you when InnoDB is stalled writing or reading and is something that could be much more trivially graphed. If settings like the innodb_{read,write}_io_threads were dynamic (even if the requires a short stall to allow for reconfiguration) and you could trivially see how many threads are busy, or waiting then it would be much easier to “tune” this type of parameter more easily. The output of SHOW ENGINE INNODB STATUS is also currently limited so the data can be unexpectedly truncated if there are a number of large queries. On the servers I manage I often see several heavy inserts, or selects using up to max_packet_size and if data is truncated then you suddenly lose this information. So the I_S.innodb tables are a good start but having the remaining information in other tables is much better.

  3. As for MySQL 5.6, there will be regular Development Milestone Releases of MySQL 5.7. Hence, while waiting for 5.7, you can check out these releases and give us feedback that can help even out some of the rough edges before 5.7 goes GA.

    • Simon J Mudd says:

      That again is good, but I fear that many people will not use 5.7 until it goes GA, so you miss the real-case uses until the feature freeze is in place as that’s the only time you can guarantee doing an upgrade without having to dump+load the whole database. During the DEV phase this guarantee is explicitly revoked which makes sense but when using real data is a handicap.

  4. By not working for my data, I mean that I hacked up code in the slave to compress each event before replay and measured it on data from production. The compression rate for that data wasn’t good. I hoped it would be good because that was an way to compress binlog contents online. I expect some fork to solve this within the next year.

  5. GilG says:

    good post. i would like to see more procedures/functions abilities. right now i can’t even catch a generic exception and know what it was, how trivial is that? compared to plsql for example the language is really missing a lot.

Leave a Reply