Do we need a MySQL Cookbook?

The blog title says it all: Do we need a MySQL Cookbook? I tend to think so.

This seems to be something that is missing with current MySQL documentation. There is lots of information available but finding the appropriate bit can be quite tedious and it often requires looking in multiple places.

A lot of other software has such books, but for some reason MySQL seems to be missing one.

A recent example comes from a “documentation feature request” I posted today: MySQL 5.6 provides a way to “move InnoDB tables” from one server to another. There are many reasons why you may want to do it, but the documentation is currently rather sparse. A simple “example recipe” for this would be good, as would an equivalent recipe for other engines where you can do this such as MyISAM. This is just an isolated example.

As far as I know there are no “best practices” on how to setup or manage a MySQL server, whether that come from Oracle, or the community. A common community view would probably be better as it makes “all vendors” work to a common goal, even if they compete in other areas. Look at the LSB as an example, adopted to some extent by all Linux distributions.  And again often there may be several ways to solve some problems so a Cookbook would be quite nice as it would suggest how to solve common tasks and let you use those as the basis for real implementations.

So what things could these Best Practices or Cook book contain? Things that come to mind are:

  • How to move tables from one instance to another (the example above)
  • How to clone an instance, whether to make a slave, or simply for copying to a development environment. It’s clear that here there are multiple solutions depending on factors such as database size, whether you are allowed to take the instance down or not, etc.
  • How to run multiple instances on the same server. This is something I used to do, but now try to avoid as often these instances interfere with each other. However, many people still do this so having common ways for setting up each instance and managing them (starting stopping, accessing each one) would be good. Here a common filesystem layout would be helpful too so that people do things a similar way but still have enough flexibility to adapt as needed.
  • How to run multiple versions of MySQL on the same server. Often could be useful in a development environment, or even in production, but at least on some versions of Linux the package managers make the assumption that only a single version of the “binaries” can be installed, thus preventing multiple versions running at once.  Sometimes this is needed and to have to work outside of the package managers is rather a pain. A common agreement on how to resolve that would be nice.
  • Backup and restore procedures, whether simple full database dumps, to using some of the other commercial tools.
    • That includes doing full and incremental backups, where this is possible, and also partial restores, or restores to a certain point in time.
  • Replication. Plenty to talk about here.
    • How to set it up from scratch
    • How to make more slaves (see cloning above)
    • Different replication features such as replication specific databases, the benefits or otherwise of row or statement based replication
    • Differences between the different versions, especially as MySQL 5.6 brings lots of new features such as minimal logging of changes to reduce binlog sizes
    • How to diagnose and recover from common replication issues
  • Monitoring
    • Again there are lots of options here, and many do it yourself ways, but also both free and commercial solutions. Often you first need to determine what you want to check, some checks may be technical ( mysql server running, the number of connections is not too close to max_connections, the number of running connections is not over a certain threshold, the SQL thread’s usage is not too high, replication is working, replication delay is not too high, etc). For many of these problems there are already checks that do the right thing. For many others you have to write your own.
    • All of the above is great but really people care about the application working, so on top of these check you need to add business checks etc.  Are the number of widgets we made in the last hour > 0?, …,
  • Grant management.  If we had a single server then you only set things up once. If you have more than one server then it gets more complicated. MySQL allows you to specify network ranges or hostnames from where a clients can connect, so all of a sudden managing this can be hard. I’m not sure if there’s a best way but it would certainly be interesting knowing how different people solve this problem, especially as the environment grows.  Replication can be used to help here and that works pretty well but you do have to be careful of gotchas with replicated grants if anything gets out of sync.  So this is certainly a topic of interest to most DBAs, whatever the chosen solution.
  • Setting up a new server. Simple things like choice of OS, MySQL version, MySQL Engine to store the data, expected profiles of the queries to the server, and then basic settings for a couple of these common setups.
  • Configuration. A huge topic but in many ways there are general things you should do, and probably that you should not do. A cookbook is probably not the place to go into huge detail about anything, which is where existing books may well be better but often helps get you going and provides indications on how to approach the problem and see that things look right.
  • Upgrade procedures for MySQL, a critical procedure for any DBA. We never stay on the same version, so we are always going to upgrade to a newer version. There are things to do in order to make that upgrade process work smoothly and this can get more complicated in a replicated environment, so talking about this and how it should be best done would be very good.  I have found the MySQL upgrade documentation to be sometimes rather incomplete or at least not written in a cookbook fashion of follow these steps to complete your upgrade, and hence you need to move all over the upgrade documentation trying to figure out if special steps are needed.  A minor version upgrade may not be an issue, but when doing a major version upgrade as many people will be doing shortly to MySQL 5.6 is something if you have important data in your database that you do not want to go wrong, so a cookbook style guide would be good for this.
  • Troubleshooting. I have had problems with the mysql servers I manage and some of these problems come up again and again, recognising them and having solutions to the common problems might be good, though recognising which problems are the most common is something that I guess can only be done by reading mail list archives, though Oracle/SkySQL/Percona support probably have a good idea here.

What other things am I missing for a cook book?

I’m sure this list could include many other things and maybe there’s a need to avoid too many details, referring to existing documentation where appropriate for details, but just provide the recipes and the explication of why, how etc would be good for a large number of people.  I for one have found from time to time that I could not find such a recipe to do a particular task, and I am sure there must be others like me with the same problem. Existing books are very good, but the focus seems to be slightly different and may go into a lot of detail when in some cases that detail may not be needed initially.

So would a cook book like this solve a need for you, and if so what might be missing from the suggestions I have made above?

CentOS 6.3 now installs on Mac Mini (mid-2011)

After lots of pain and trying to get CentOS 6 to work on my Mac Mini mid-2011 and failing, I finally find that CentOS 6.3 does indeed install.  Previous versions did not work for me, but I’m delighted to see that RedHat have done great work and the work they did that made it work on Fedora 17 has obviously been ported to CentOS 6.  The issue in the past had been the Mac would not boot the  install DVDs.

So all that was needed was for me to burn the (new?) CentOS-6.3-x86_64-netinstall-EFI.iso image, press ALT on boot and the DVD booted fine, and the install went along the same way as any normal CentOS/RHEL installation.  In fact I’ve been able to keep my original Fedora setup as the Volume Group I’d previously had setup on my disk was recognised so all that I needed to do was to create a new “root” logical volume and install on to that.

So thanks RedHat. You now make it much easier for me to migrate my “main” but older system on to this Mac.

On MySQL Memory Usage and Configuration

I saw a post on profiling memory usage today and this reminds me of several discussions I have had with different people.

Why would you want to profile the memory usage? Usually to see where memory is allocated and for what purposes, and usually you only care when memory usage is higher than expected. That is a DBA normally wants to use all available memory on a server for mysqld, whatever that size may be.

Configuration parameters may be inappropriate and need adjusting, so having a way to determine the range of memory usage based on those parameters would be most helpful.  However, the configuration parameters as a whole put no limit on memory used, so different workloads can quite easily lead to memory being insufficient on the OS, triggering swapping or perhaps the mysqld process to be killed by the kernel’s OOM. None of this is ideal.

So I would like to configure a cap on the maximum amount of memory that can be used and indeed that is what most RDBMSes do.  That is the allocation is made on startup, ensuring the memory is available, and any memory requests are taken out of this pool. This technique _should_ avoid you swapping (though I still see issues on NUMA architecture boxes). It also means that at some points in time the database server may not be able to honour memory requests and therefore must fail, wait until those needed resources are available, or it must free existing resources to make space for the requested new resource.

Doing this well is probably hard enough. Doing it in MySQL with different storage engines is probably harder still as each engine does things its own way.

Some mechanism like that would certainly help avoid strange behavioural issues when the query profiles change and this causes the server to behave in unexpected ways, such as swapping itself to death. The only solution right now seems to be to configure the server to ensure that these “edge cases” can not happen by using less memory, thus wasting valuable resources on the server.

Currently it is also pretty hard to see how memory is used. If you use InnoDB you know that the buffer pool takes up a sizeable portion of memory, but there is still a significant amount of other memory that’s used by each individual connection and this can vary significantly depending on the query profiles. It would be really nice to dynamically look inside MySQL and see how memory is used, and also see how adjusting different parameters may adjust this.

Of course none of this is in MySQL at the moment. I would certainly like to see first better information on current memory usage, I would guess some more tables in performance_schema, and at a later stage some way to cap memory usage to a desired level, and hope that when those limits get reached mysqld will figure out how to free up some resources in order to continue, wait or fail “gracefully”.

Until then we will have to continue playing the game of “configure mysqld”. See if it works and then ensure that queries to the system do not change enough to break the system and require us to go back and do it all again.

Note: there are several feature requests for this, so I do not think I am the only one in wishing for more visibility and configurability of the memory usage.

Update: 2013-01-17

Some feature requests that seem related to all this are shown below: