MySQL 5.6, GTID and performance_schema

October 9th, 2012

Not much to add really to the bug I’ve filed here: bug#67159.

Again, the GTID functionality looks good, but it seems to prevent changes to the configuration of performance_schema, which I think is inappropriate, especially as P_S now has lots of extra goodies and will surely gain even more after 5.6.

MySQL-5.6, GTID and binlogs on slaves

October 5th, 2012

Not much to add really to the bug I’ve filed here: bug#67099.

I can personally think of some very nasty consequences of applying this on the slaves I manage. The reason I’m posting the bug is that, while I guess this is too late to fix in 5.6 as it’s effectively a new feature, I’m sure many sites will bump into this and be somewhat disappointed if they want to use the new GTID feature and have several slaves. Hence, if the fix/feature has to go into MySQL 5.7, I hope it goes in sooner rather than later. We will see.

Updated: 2013-09-19

I probably should have updated this earlier, but it does seem that Oracle has taken these comments on board. See WL#6559. It looks like they plan to do this for 5.7, which is good news. I’m really hoping that we will not have to wait too long for this version as it has other “goodies” which are going to be interesting. Shortening the release cycle will, I think, allow those of us who appreciate these new features to have access to them sooner.

More GTID troubles in MySQL 5.6

October 4th, 2012

A colleague, Kris, recently commented on a find I made on 5.6.7-rc1 which, while not the end of the world, does seem to severely limit the use of the new GTID features, or at the very least makes them a pain to use.

Today I found another related issue when trying to check the syntax of one of the MySQL commands. That’s now reported as bug#67073.

All of these are edge cases of a sort, but they do make 5.6 look a little rough around the edges when the rest of the product is actually really good and a great improvement over the MySQL 5.5 GA version that everyone uses. That’s really a shame.

I did report the original problem as SR 3-6270525721: MySQL 5.6.7-rc1, grants, replication and GTID cause problems, but of course that’s not visible to the general MySQL community. While this specific issue with the help tables has been reported on bugs.mysql.com, I get the feeling that other users of MySQL may report issues which are not visible to the community as a whole, and that means that solutions, workarounds, or even the possibility of saying “I have that problem too” are lost.

For many tickets I’ve chosen to file two: one on MySQL’s http://bugs.mysql.com and another internal Oracle ticket. This is really a nuisance and a time waster, and in the end I don’t believe it’s helpful to anyone. I’m sure I’m not the only person who suffers from this. I also note that when browsing what’s changed in the latest release, such as http://dev.mysql.com/doc/refman/5.6/en/news-5-6-7.html, you see references to internal bug numbers which many people can not even access. I am able to see most of these tickets but many people are not, so the problem and its resolution become rather opaque. That’s counterproductive for everyone. Improving this reporting of changes would help us all.

Oracle, please make this work in an easier manner, so that issues with MySQL software can be more visible (if the reporter chooses to make them so) and information on changes and bug fixes is more complete. This will save time and frustration for everyone.

Note: Opinions expressed above are my own.

Checking /proc/pid/numa_maps can be dangerous for mysql client connections

September 8th, 2012

I’ve blogged before about using numactl to start up mysqld, and thus to try to better spread the memory usage on larger memory servers. This came from an article by Jeremy Cole and works fine. I recently had some issues with mysqld seeming to run out of memory on a box which appeared to have plenty free, so it seemed like a good idea to adapt a minutely collector script I run to include the numa_maps output, so that I could see if the memory failures were related to this. So far so good.

Many of the clients that connect to the database servers I manage have a very short connect timeout, typically 2 seconds. In a normal network and under normal conditions this is more than enough for successful operation, and if the connection is not made in that time it’s an indication of some underlying issue, whether that be load on the mysqld server or something else.

The change I implemented to collect the numa_maps information wasn’t expected to cause any issues; typically quite a lot of information is cat’d out of /proc/somewhere to collect configuration or status information.
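
For illustration, here is a minimal sketch of the kind of collector fragment involved; the log path and the use of pidof are my assumptions for this post, not the exact script I run:

    #!/bin/sh
    # Hypothetical minutely collector fragment: snapshot the NUMA memory
    # layout of the running mysqld so it can be reviewed later.
    PID=$(pidof mysqld)
    OUT=/var/log/collector/numa_maps.$(date +%Y%m%d)
    {
        date '+%Y-%m-%d %H:%M:%S'
        cat /proc/$PID/numa_maps    # it is this read that stalls mysqld
    } >> "$OUT"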

However, after implementing this change I suddenly noticed more mysql client connection errors than before, in fact a significant increase. The cause wasn’t immediately apparent until I noticed that these errors only occurred a couple of seconds after the start of each minute, at hh:mm:02, hh:mm:03 etc. Then it dawned on me that this was indeed due to the change in the collection script: reading /proc/<mysqld_pid>/numa_maps blocks the process being examined while the kernel walks its memory, and on a large process that stall is long enough to exceed a 2-second connect timeout. Disabling this functionality removed all the issues. This was on servers with 192 GB of RAM.

The information provided by /proc is useful, and it would be even better if it could be collected in a way which doesn’t block the process that’s being “investigated”. As such I have filed a bug report with RedHat, though really this is a kernel “bug”, or at least behaviour which, while it may be well known, is very unsatisfactory in this particular case.

So whilst most people do not configure such a short mysql connection timeout, if you do have a setting like this please be aware of the consequences of running numa-maps-summary.pl, or of looking at /proc/<mysqld_pid>/numa_maps directly. This was not discussed by Jeremy, and I’d guess he was unaware of it, or did not expect people to run this type of script as frequently as I was doing. Database memory sizes keep increasing, so while this may not have been noticeable on smaller servers it can now become an issue.

Initial Reactions to MySQL 5.6

April 28th, 2012

New versions of MySQL are always interesting to try out. Often they have features which I may have asked for myself, so it’s satisfying to see them eventually appear on a system I use. Other new features make life easier for the DBA. Finally, we hope overall performance will improve and managing the server(s) will become easier.

I had a system which needed to handle heavy writes, and performance was a problem even when writing to SSDs. Checkpointing seemed to be the big issue, and the combined ib_logfile size in MySQL 5.5 is limited to 4 GB. That seems a lot, but once MySQL starts to fill these files (and this happens at ~70% of the total, I believe), checkpointing kicks in heavily and slows things down. So the main reason for trying out MySQL 5.6 was to see how things performed with larger ib_logfiles. (Yes, MariaDB 5.5 can do this too.)

Things improved a lot for my specific workload, which was great news, but one thing I noticed was that the initial install/setup time of a system with these large files increased enormously. (I was using 2 x 16000M files.) That was noticeable even on a box with SSDs; on a box with HDs it would be horrible.
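
For reference, the relevant my.cnf fragment looked something like this (the buffer pool figure is illustrative, not my exact setting):

    [mysqld]
    # 2 x 16000M redo logs, far beyond the 4 GB combined limit of 5.5
    innodb_log_files_in_group = 2
    innodb_log_file_size      = 16000M
    innodb_buffer_pool_size   = 128G    # illustrative value for a 192 GB box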

mysql_install_db now uses InnoDB to set up some system things and restarts the mysqld binary a couple of times. When you also have a large innodb_buffer_pool (I was trying this on a box with 192 GB of RAM), each startup of mysqld took quite a while.

So my initial thought is that the bootstrap probably does not need to use the normal innodb_buffer_pool_size setting, but should perhaps override it with the minimal size needed.
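
A workaround that should be possible today is to bootstrap with a separate, minimal defaults file and only use the real my.cnf when starting the server proper. A sketch, with paths and sizes assumed; note that settings which shape the on-disk files must match the real configuration:

    # /tmp/bootstrap.cnf - used only for mysql_install_db
    [mysqld]
    datadir                 = /var/lib/mysql
    innodb_log_file_size    = 16000M   # must match the real my.cnf
    innodb_buffer_pool_size = 128M     # small pool: the bootstrap doesn't need 192 GB

    # then bootstrap with the minimal file:
    scripts/mysql_install_db --defaults-file=/tmp/bootstrap.cnf --user=mysql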

Managing the size of the ib_logfiles is also something that requires restarting mysqld. Given the sort of experimenting I was doing, it would be nice to be able to configure this dynamically. I can see many issues with changing the sizes of the existing files, but what seems easier to implement would be the ability to increase or decrease the number of files used (removing the older, higher-numbered files once they have been “used”/checkpointed), and that would give the DBA a lot more freedom.

I clone servers quite a bit, and the extra 32 GB of redo log files is a pain and time-consuming to copy. Being able to reduce the number of files and checkpoint prior to shutting the server down for cloning, then re-enable the normal value afterwards, would save a lot of time, so it would be a handy feature.

I also tried playing with the new GTID features. They look great, except for one thing: they expect master and slaves to have binlogs enabled, so that the GTID values are kept locally and available on startup. I manage several boxes where, for performance reasons, the slaves do not have binlogs enabled; if a slave dies it gets recloned. Adding back the requirement for binlogs (and log_slave_updates) on a slave to enable this feature seems wrong. I can see the need for the different GTID values to be kept somewhere, but I don’t see the need to keep all binlog information, at least on a slave. Given that the ratio of slaves to masters can be quite high, that can be an issue. If necessary, write this information somewhere else; why not in an InnoDB table, so that on server recovery you have information consistent with the rest of the database, something that might not happen with the binlogs?
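
To be concrete, enabling GTIDs in the 5.6 release candidates requires something like the following even on a slave; the first two lines are exactly the requirement I would prefer to avoid (the option names were still settling down at this point, so check your version):

    [mysqld]
    log_bin                  = binlog   # required even on a slave
    log_slave_updates        = 1        # required too
    gtid_mode                = ON
    enforce_gtid_consistency = 1        # called disable-gtid-unsafe-statements in some early 5.6 builds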

Talking of binlogs, one new feature I’m using and which I think is good is binlog_row_image = minimal. This reduces the size of the binlogs and thus I/O. I’m missing the next step, which would be to compress the binlog events and reduce binlog size further. That may not be an issue on many systems, but several servers I manage write over 100 GB of binlogs a day. Being able to compress the binlogs would be nice, and better control of expire_logs_days (the daily granularity is too coarse in many cases for me) would also help.
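
For reference, the settings in question (a sketch; the expire_logs_days value is illustrative):

    [mysqld]
    binlog_format    = ROW
    binlog_row_image = minimal   # log only changed columns plus a row identifier
    expire_logs_days = 3         # illustrative; only daily granularity is available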

NOTE: In MySQL 5.6.5 with gtid_mode = ON, mysql_install_db crashes! Bug reported to Oracle, and I guess it will get fixed soon.

I have seen a few other crashes in 5.6.4 and 5.6.5; I can’t provide links, unfortunately, as they’re not public, and they are still not resolved. I’d like to make the details of these crashes public as others may come across them, or may have come across them before me, but these issues are not visible on http://bugs.mysql.com. It would be nice if they were, but that requires me to duplicate information, which is a pain. Oracle, please make this easier. It is good for us all.

So with the little testing I’ve done so far MySQL 5.6 looks good. I need to spend more time with it and investigate the new features, many of which will make life easier.

When it finally goes GA I will have to spend a lot more time seeing how it behaves, but I believe there are still a few issues which need resolving before that happens.

Make InnoDB redo log file size configuration more flexible

March 30th, 2012

Over the last few months, on different systems, I have been modifying the ib_logfiles and their sizes, increasing them due to the increase in load the systems have been experiencing. These files contain information which has been committed but not yet written to the ibdata files, or to the individual .ibd files if using innodb_file_per_table. If the server shuts down unexpectedly, they are used during recovery processing to get the system back into a consistent state again.

The checkpointing in MySQL is well documented, but I was surprised just how much the throughput of heavy write applications can change if the sizes of these files are changed. It may seem obvious, but given how disruptive it can be to shut down a system, change these values and start it up again, this process is not done frequently. These files help improve performance because changes are written linearly into them, whereas changes to the data files are written in pages which are probably scattered randomly over the disk, a slower process.
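
For the record, the disruptive procedure referred to is roughly the following (a sketch for 5.5, with paths assumed):

    # 1. Ensure a clean shutdown so the redo logs contain nothing unapplied.
    mysql -e "SET GLOBAL innodb_fast_shutdown = 0"
    mysqladmin shutdown

    # 2. Move the old redo logs aside (keep them until the server restarts cleanly).
    mv /var/lib/mysql/ib_logfile* /tmp/

    # 3. Change innodb_log_file_size in my.cnf, then restart;
    #    InnoDB recreates the files at the new size.
    mysqld_safe --defaults-file=/etc/my.cnf &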

I come from a Sybase background where “a database” consists of “data” and “log” files. The data files correspond to the ibdataX or .ibd files and the log files correspond to the ib_logfileX files, the major difference being that (at least when I looked at this in Sybase 12.5) the log files were considered part of the database, and once added their size could not be reduced. Increasing was easy, but you couldn’t free up the space again.

So MySQL and Sybase are similar, but MySQL’s implementation keeps the ib_logfileX files independent from the database files. It struck me that it would be very useful if we could dynamically tailor the two settings that InnoDB provides: innodb_log_file_size and innodb_log_files_in_group. Since MySQL already creates missing files automatically, typically when you start MySQL for the first time or when you remove the files after changing the innodb_log_file_size parameter, this sounds reasonably easy. Make innodb_log_files_in_group larger, and when you get past the last available file you build a new one. If the number is decreased, you simply go back to the first one again, ignoring the later files. Changing the size of the files could be done too, but is perhaps harder. In both cases you need to be sure this works when recovering from a crash, which is a critical moment.

A change like this would help people see whether changing the size improves performance, and certainly for systems where taking downtime is expensive and difficult this would be a welcome move, as initial sizing guidelines may prove wrong over time and a dynamic change would resolve the problem quickly. I recently took down some old MySQL systems which had been running uninterrupted for just over a year. I would probably have adjusted these settings earlier had I been able to, and that would have been very convenient.

MySQL 5.5.21 now released – looking forward to trying it out

February 21st, 2012

I see that MySQL 5.5.21 has just been released. This sounds interesting. I’m mainly running 5.5.16, which has been broadly stable, but I have been caught by a few important issues which 5.5.21 fixes according to the change list:

These, together with a few earlier fixes after 5.5.16, are certainly interesting to me, so I’m eager to try out 5.5.21 and see how it fares.

Better Controlling MySQL Memory Usage

January 25th, 2012

MySQL, like a lot of other software, has many knobs you can tweak. Most of these knobs affect behaviour, but more importantly most affect the memory usage of the server, so getting these settings right is very important.

Most of MySQL’s memory is really used just as a cache, in one form or another, for information that otherwise lives on disk, so ensuring you have as large a cache as possible is important. However, making these memory sizes too large will cause the server to start swapping and possibly crash, or cause the kernel to kill the process when the box runs out of memory. That’s something we want to avoid.

Certain settings affect memory allocation on a per-connection/thread basis, bounded by thread_cache_size and max_connections. If you configure for the worst case (max_connections) you may end up never actually using all the memory you have set aside, memory which could otherwise be used for other purposes.

Recently a server I managed was misconfigured with a large sort_buffer_size (raised from 4M to 256M) and a larger read_buffer_size (from 4M to 20M). At first glance the change looks quite innocent, and since nobody noticed that these are per-connection settings, it got rolled out. max_connections on this server was set to 1000, while normally there were ~40 connections of which only a few were active. The mysqld memory footprint on startup looked fine, and under normal usage it also worked fine. However, spiky load suddenly changed this nice behaviour: as configured, mysqld ramped up the thread count and hence memory usage, resulting in swapping and finally server death…
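
A back-of-the-envelope check would have caught this. The query below uses the commonly quoted approximation (it ignores some allocations, and not every connection allocates every buffer, but it shows the worst case):

    mysql -e "SELECT ( @@innodb_buffer_pool_size + @@key_buffer_size
                     + @@max_connections
                       * ( @@sort_buffer_size + @@read_buffer_size
                         + @@read_rnd_buffer_size + @@join_buffer_size
                         + @@thread_stack )
                     ) / 1024 / 1024 / 1024 AS worst_case_GB"

With 1000 connections, the 256M sort buffer alone contributes ~250 GB to the worst case, far beyond the RAM fitted.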

The fault of course was mine for not noticing, but this has been an issue with MySQL forever. Most other RDBMSes manage memory slightly differently: you define the maximum amount of memory you want to use, this is optionally locked into memory, and requests for the different buffers are taken from this global pool. That is much safer and avoids unexpected swapping or memory over-allocation, but it does raise the question of what happens when a memory request can not be honoured.

Initially I would expect two different behaviours: either

  1. kill a thread whose memory can not be allocated, or
  2. wait a certain time to allocate that memory, after which it gets killed anyway.

Option (2) is probably saner, and possibly some sort of deadlock detection can kick in if all threads are waiting for memory, perhaps killing the youngest thread, or the thread which has done the least work, first. Possibly there are other, better ways of doing this?

I can imagine that changing MySQL’s current behaviour to do something like this could be quite hard, especially as ideally the engines would also use the same memory management mechanisms, but I see it as a good thing that would make MySQL more robust, especially under load, which is after all what counts. Of course this will not happen in today’s 5.5 GA version, or tomorrow’s 5.6 version which is likely to appear some time this year; that’s a major change. It would be nice if Oracle looked at this for 5.7 as a way of ensuring that when resource usage comes under pressure MySQL does not fall over, but attempts to use the allocated resources as best as possible.

In the meantime what would help would be:

  • better documentation, so we can see clearly how all mysql memory is allocated. There are several web pages suggesting ways to calculate this, but certainly no definitive guide.
  • The InnoDB engine’s documentation talks about memory usage, and most people think that innodb_buffer_pool_size is the main setting, yet read further and there’s talk of an overhead of perhaps 1/8th. I have recently been playing with innodb_buffer_pool_instances settings > 1 (using values in the range of 20-40) and am inclined to think that this increases the overhead somewhat more, yet there’s no documentation on this to confirm whether my guess is right or not. Please, InnoDB developers, improve your documentation, if only to prove me wrong.
  • Ideally, some tools to tell you if your server is possibly misconfigured. Coming from a Sybase environment I’d be tempted to suggest a stored procedure in the mysql database which can tell you the total memory usage and how it breaks down, as doing this with a single SELECT is going to be tricky.

Then, once that is done, consider adding an extra variable to enable total memory usage to be controlled. I made a feature request for this at http://bugs.mysql.com/?id=64108. If you think this feature might interest you, please let Oracle know.

Why online disk storage is not the same as using tapes

January 10th, 2012

A couple of years ago I posted about an unlucky encounter with my Thecus N5200 Pro. On the 1st of January at 06:15 I got a New Year’s present: a disk died on me. The Thecus duly notified me by starting to beep.

I still have my RAID-6 setup, with 2 redundant disks, so there was no real problem. I went and bought a replacement, fitted it, and the Thecus duly started rebuilding the RAID array with an estimated time to finish of about 1 day (5 x 1TB disks, so a ~3TB array).

Disk prices have increased significantly since I originally bought the array, but buying a single disk was not a problem. However, during the rebuild, or shortly afterwards, a second disk failed, just as had happened 2 years ago when I lost my array because the RAID-5 setup had no further redundancy. This time there was no problem: RAID-6 saved me and the photos and backup information I keep on the Thecus, so I was happy, and I ran out and bought yet another disk. Fitting it triggered another array rebuild, which completed successfully.

That led me to think. A lot of people push the move to external disks as a backup alternative. Certainly this gives you online information and generally works well; I use this device at home to keep my photos and other files and also to back up my main machine to a separate location. However, one thing strikes me as clear: replacing disks is expensive. Had I been using tapes for backups, throwing away 2 tapes and buying new ones would have been much cheaper than buying 2 new hard disks for my array. Of course, a disk array does not provide a site backup, nor a convenient way to store multiple backups over time; for that it is not yet cost effective.

So after this experience I am tempted to look and see if there are any small tape devices which I could attach to my PC to give me these extra facilities. The hardware no doubt exists, but from previous searches it is expensive and out of the price range of the average home user. That is a shame. With a lot of people now dismissing tape devices as old fashioned, failure of drives in a small array like my 5-disk setup may turn out to be pretty expensive (in money) or pretty costly (in lost data). RAID-1 or RAID-5 sound great, but can you afford for a second disk to fail while the array is rebuilding after the first disk has failed? If you can not, then arrays of this type may not be for you and may lull you into a false sense of security.

The title of my original post was Unlucky Encounter… So have I just been unlucky (again) or is this sort of problem something which has happened to more people?

The Problems of Managing MySQL’s Configuration

November 11th, 2011

I want to keep a record of the configuration of the MySQL servers I manage. The configuration of some servers differs from others, and over time the configuration may vary, partly as a result of upgrades of the MySQL version or changes in the use of the particular instance, so tracking this is important.

Configuration in MySQL can be thought of as 2 separate parts: the static configuration files which determine the behaviour of the server when it starts up (my.cnf), and the running configuration of the server in question. The latter information is usually obtained by running SHOW GLOBAL VARIABLES, plus SHOW SLAVE STATUS if the server is a slave.

I’d also like to compare the 2 sets of configuration so I can see if a local change has been made to the running server which is not reflected in its configuration file. I might want to correct this, or at least be aware of it.
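
Naively, a sketch like the following ought to be enough (normalising '-' to '_' deals with only one of the inconsistencies listed below):

    # Dump the running configuration in a my.cnf-like form...
    mysql -N -e "SHOW GLOBAL VARIABLES" | sed 's/\t/ = /' | sort > /tmp/running.cnf
    # ...then normalise the configuration file and compare.
    grep -Ev '^(\[|#|$)' /etc/my.cnf | tr '-' '_' | sort > /tmp/file.cnf
    diff /tmp/file.cnf /tmp/running.cnf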

However, collecting and comparing this configuration information completely is not as easy as one would like. There are various inconsistencies which make the task troublesome:

  • Getting the default configuration out of the system is hard: mysqld --help --verbose sort of shows you this information, but you have to filter out the junk (see the sketch after this list).
  • Use of dashes ('-') in variable names in my.cnf or command line parameter names vs '_' in SHOW GLOBAL VARIABLES.
  • 5.5.16 seems to try to ‘partially’ start up when it shouldn’t.  See bug#63187.
  • You can not query the default configuration information from the running server.
  • SHOW SLAVE STATUS uses mixed-case variable names vs SHOW GLOBAL VARIABLES which uses lower-case variable names.
  • SHOW GLOBAL VARIABLES uses lower-case variable names vs SELECT * FROM INFORMATION_SCHEMA.GLOBAL_VARIABLES which shows them in upper-case.
  • Some variables get their multiple values sorted (others don’t).
  • Conversion of sql_mode special values is not consistent or clear.
  • my.cnf may use letter multipliers to indicate size (K, M, G); SHOW GLOBAL VARIABLES uses actual numbers.
  • my.cnf uses 0/1 vs SHOW GLOBAL VARIABLES using OFF/ON for certain switches. In some cases just having the variable defined implies ‘= 1’.
  • Some directory paths provided without a trailing ‘/’ in my.cnf have the trailing ‘/’ added in SHOW GLOBAL VARIABLES.
  • Some variables can take relative paths (relative to datadir) whereas SHOW GLOBAL VARIABLES shows the full path.
  • Some variable names differ in my.cnf compared to the equivalent setting in SHOW GLOBAL VARIABLES.
  • Some variable names have changed with newer versions of MySQL yet in many cases no warning is given that the new (correct?) variable name should be used and the old name is deprecated.
  • Not all settings are visible in both places: in 5.5.16 the default settings output shows 234 values while SHOW GLOBAL VARIABLES shows 312 (update 2013-04-08: 294 vs 420 values in 5.6.10).
  • Some settings are incomplete (log_bin does not provide the full name of the binlog files set in my.cnf).
  • etc…
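
As an example of the first point above, extracting the built-in defaults looks something like this sketch; the exact layout of the output varies between versions, which rather proves the point:

    # The usage text before the "Variables (--variable-name=value)" table
    # is junk for our purposes; the table runs until the first blank line.
    # Note that names here use '-' while SHOW GLOBAL VARIABLES uses '_'.
    mysqld --no-defaults --help --verbose 2>/dev/null \
      | sed -n '/^Variables (--variable-name=value)/,$p' \
      | tail -n +4 | sed '/^$/,$d' > /tmp/defaults.txt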

All of these issues mean that if I want to look at the /etc/my.cnf files and compare them to the running configuration, it’s pretty hard to do well. That may not seem like a big deal until you realise that many mysql servers may have an uptime over a year. In that time the configuration may have evolved, and this may mean you need to bring down mysql to adjust its configuration, or alternatively see the difference and accept that it’s not such a big issue and can be corrected the next time the server is restarted.

I’ve brought this up with Oracle support, but I guess that many in the community are so used to this that they think it’s normal. If you manage several servers then it really is helpful to be able to collect this information and to check that servers run with the expected settings. Not doing so may mean that performance is affected and no-one notices.

So I’d like to ask that Oracle consider addressing this issue, remove some of the aforementioned inconsistencies, and perhaps also provide something like a SHOW DEFAULT SETTINGS or INFORMATION_SCHEMA.DEFAULT_SETTINGS output. If you are in a company which goes through a lot of change, including to the servers being managed, addressing this would make it much easier to ensure the servers are configured properly.

Note that software from other places makes this really easy, and that can be very handy. Postfix‘s postconf shows the current configuration, but when called with -n it will only show the non-default options, and postconf -d shows all default settings. So this certainly can be done.
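
For comparison:

    postconf -n    # only the settings changed from the defaults
    postconf -d    # every default setting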

Update: 2012-02-21

It seemed useful to actually add a list of bug reports and related posts.

Bug reports:

  • Bug#19612: InnoDB files per table my.cnf option is read inconsistently
  • Bug#55288: Use of the underscore or hyphen for variables and options is confusing
  • Bug#55705: configuration tool for MySQL server options
  • Bug#63187: mysqld --help --verbose should not try to startup but only provide help
  • Bug#64403: MySQL does not warn of deprecated variables on startup
  • Bug#64474: No clean way to get compiled-in settings from mysqld for a storage engine

Related posts:

Please advise me of other missing references.