Archive for March, 2009

What’s missing in MySQL Replication?

Monday, March 30th, 2009

One of the things that has made MySQL grow so much is the easy setup of replication on Internet sites. This has allowed sites to grow from one database server to many with relatively little effort. Configuration of replication is quite simple and replication itself is extremely fast. That makes it ideal for most people.

However, at some point you begin to notice some of the weaknesses in the current design. Here are a few comments providing ideas on what I would like to see in MySQL replication and why these features are important to anyone whose database requirements grow beyond a single database instance.

My list of current issues is as follows:

  1. MySQL only allows replication from 1 to N servers.
  2. Some important aspects of MySQL Configuration are not dynamic.
  3. Replication is part of the mysqld server, not a separate process.
  4. It’s not easy to configure table replication (config issue).
  5. It’s not easy to filter the replication commands.
  6. Initialisation of the databases when you first set up replication is difficult.
  7. Replication configuration information is not in the database.

MySQL only allows replication from 1 to N servers.

MySQL replicates from 1 to N servers. As mentioned in a previous blog, the current replication design is very simple. Each mysqld server can get replication information from a single master; the binlog is copied over from the master to the slave, which replays part or all of the commands contained therein. In MySQL up to 5.0 these are the SQL statements executed on the master; in 5.1 and later it’s also possible to configure the master to store the rows that changed and replicate these changes on a row-by-row basis.
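As a reminder of how simple that setup is, a minimal slave configuration might look like the sketch below. The hostname, credentials and binlog coordinates are placeholders, not values from any real system:

```sql
-- On the slave: point it at the master and start the replication threads.
CHANGE MASTER TO
    MASTER_HOST = 'master.example.com',
    MASTER_USER = 'repl',
    MASTER_PASSWORD = 'secret',
    MASTER_LOG_FILE = 'mysql-bin.000042',
    MASTER_LOG_POS = 107;
START SLAVE;

-- On a 5.1 master: switch from statement-based to row-based replication.
SET GLOBAL binlog_format = 'ROW';
```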

This design works well, is fast and scales out very well. However, it does have issues. It creates a single point of failure in the master, which ends up being the weakest link: the source of all db changes. If the SELECT load is mainly distributed over the slaves the problem is not too bad, but if the master fails all slaves are broken.

There is, however, a second less obvious problem. Many companies run several different database servers, each used differently and holding different information, yet often there is a set of data which is common to more than one department. Replication is a useful way of distributing this information between the different database instances. What’s not possible is to selectively choose which databases should be replicated from one location to another, and here the MySQL 1:N fanout is not always ideal. You can’t replicate databases from 2 different servers to a common second location, which would be very handy. Nor is it possible to replicate from multiple locations into a single dedicated “summary” database, perhaps used for global reporting. A design like this requires a little care in setting up, but it allows independent groups of servers to run separately while their results appear in a global master system. It also helps with failures: if a branch server goes down the other satellite offices are not affected, nor is the global reporting of the data.

I’ve used this type of replication mechanism in my last job (using Sybase) and found it extremely convenient. The added flexibility of being able to mix and match databases would give the DBA more freedom to set things up for the business in a more intuitive manner. Of course keeping it simple (where possible) is a goal to aim for, but N:1 replication and selective replication of source databases to destination servers would be very useful.

Some important aspects of MySQL Configuration are not dynamic.

With any “enterprise” system you want all reasonable changes to be possible while the system is running. This reduces downtime and thus keeps management happy. MySQL replication has a number of things which cannot be configured dynamically. Having to temporarily stop replication to perform these tasks is in itself not an issue; having to stop the db server where the replication is running is. This has an impact on the business and affects the DBA, and also the system administrators who may need to take action if the database instance goes out of service.

So what can’t be done dynamically?

As far as I’m aware, neither the choice of which databases should be replicated nor the lists of specific tables to include in or exclude from replication can be changed at runtime. While this information does not change frequently, when it does the downtime it provokes is significant. If Sun made these configuration options dynamic then we would not need to restart the mysqld server and business could continue as usual. Most applications don’t notice or mind if there’s a small delay in replication, and they can determine if this is the case and take appropriate action if needed. However, if the database server goes away, that’s much more of an issue.
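For example, the replication filter options live in my.cnf and are read only at startup, so changing any of them means bouncing the server. The database and table names below are invented, only the option names are real:

```ini
# my.cnf on the slave -- none of these can be changed without a restart
[mysqld]
replicate-do-db             = sales
replicate-ignore-db         = scratch
replicate-do-table          = sales.orders
replicate-wild-ignore-table = tmp%.%
```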

Replication is part of the mysqld server, not a separate process.

I mentioned this before in http://blog.wl0.org/?p=7. Having replication inside the same mysqld process makes it impossible, without design changes, to support anything except 1:N replication. Even without changing existing functionality, if replication were built as a separate process you could build N:1 replication, even if that required ‘N’ separate processes.

Also replication changes and new functionality could be added independently of the mysqld version. That would allow the groups of people within Sun who manage the mysqld and those that look at replication specific issues to do their own thing and not be hindered by each other.  It would probably also allow improvements to be made to the replication process more quickly.

It’s not easy to configure table replication (config issue)

This is probably not considered an issue by people using MySQL, but that’s also probably because it’s not easily done there. Sometimes it’s convenient for a replicated database to contain various types of tables: tables which are replicated in the way typical of MySQL, but also many other tables which are local to the specific instance (instance-specific configuration tables, for example). In MySQL’s current design managing a long list (hundreds) of specific tables to be replicated or to be excluded from replication is not very easy. Also this list is not dynamic. Improving the design here would help a lot.
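To give an idea of the problem: each table needs its own line in my.cnf, so with hundreds of tables the configuration becomes unwieldy. The table names here are invented for illustration:

```ini
[mysqld]
# One line per table -- repeat for every table to be replicated...
replicate-do-table = app.customers
replicate-do-table = app.orders
replicate-do-table = app.order_lines
# ...hundreds more lines like these...
# Wildcard patterns help a little, but only if naming conventions allow it:
replicate-wild-do-table = app.shared%
```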

It’s not easy to filter the replication commands

It might be nice to use one server as a historic server containing all data, while on the master you need to delete older data. Having a way of manipulating/filtering certain commands so that they can be eliminated or modified would allow this type of behaviour. For example, filtering out (removing) commands, or the results of commands such as DELETE, UPDATE or TRUNCATE TABLE, could provide this sort of functionality. Again, if the replication process were separate from mysqld this would be much easier to implement.
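A partial workaround exists today: a session with the SUPER privilege can disable binary logging before purging, so the DELETE never reaches the slaves and the historic copy survives. It is manual and error-prone, which is why proper filtering support would be welcome. The table name is hypothetical:

```sql
-- On the master, in a session with the SUPER privilege:
SET sql_log_bin = 0;   -- stop writing this session's statements to the binlog
DELETE FROM events WHERE created < '2008-01-01';
SET sql_log_bin = 1;   -- re-enable binary logging for this session
```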

Initialisation of the databases when you first set up replication is difficult

Creating a new slave can be done in many ways. However, on a busy server it’s quite hard, especially when creating the first slave, to do so without stopping the master. That’s especially true as normal mysql dumps don’t guarantee that a consistent database is generated, nor is it easy to work out exactly which point in replication to use a dump from. What would be nice would be to set up a database as a copy of the master and have the mysql replication process fill the slave’s tables directly from the master with no extra help. At a certain size this may not be practical, but for smaller databases it would avoid a lot of extra complexity.
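With InnoDB-only databases there is at least a reasonable recipe: --single-transaction gives a consistent snapshot without locking the master for long, and --master-data records the binlog coordinates to start replication from. A sketch, with host names as placeholders:

```shell
# Take a consistent snapshot of the master (InnoDB tables only);
# --master-data=2 writes the CHANGE MASTER TO coordinates as a
# comment at the top of the dump.
mysqldump --single-transaction --master-data=2 \
    --all-databases -h master.example.com -u root -p > master.sql

# Load it on the new slave, then run CHANGE MASTER TO with the
# coordinates recorded in the dump and START SLAVE.
mysql -h slave.example.com -u root -p < master.sql
```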

Replication configuration information is not in the database

Various parts of the replication configuration live outside of the database proper: the state is stored in separate text files (master.info, relay-log.info) and not in the database itself. Given MySQL’s mix of storage engines there are open questions about how replication behaves after recovery from a crash, but it would still be nicer if this data were stored in the database so that recovery after a crash would hopefully be consistent.

2012-02-20: Update

Quite a long time has passed and little has changed. One thing I did not add in my original post was a list of MySQL bug reports which request the type of functionality mentioned above. Let me know of others I might have missed.

  • http://bugs.mysql.com/35611 please make all slave replication options dynamic
  • http://bugs.mysql.com/57560 please make all binary log options dynamically settable
  • http://bugs.mysql.com/62825 Option to replicate at the database level instead of the server level

Reflections on MySQL Enterprise Monitor

Tuesday, March 3rd, 2009

Merlin is MySQL’s Enterprise Monitor, a system designed to monitor a number of MySQL database servers and provide useful feedback to the DBAs as to how they are running and what things may require attention.

It provides various things:

  • graphs of key database and server performance indicators: load average, cpu usage and mysql specific information such as number of database connections, replication delays, buffer usage statistics, etc…
  • a set of advisors which are intended to tell you if things are not configured properly and what to do to correct the problem.
  • monitoring of servers which are linked together through replication
  • the latest version of Merlin (v2.0, which has been available since the beginning of the year) additionally adds, via MySQL Proxy, the ability to see all the queries passing through the database and to group similar queries together, so that you can work out which are the most expensive according to different criteria.

This final information is invaluable. MySQL has been lacking what is available in other more expensive databases such as Oracle. Merlin gives a much greater insight into what is going on in the database and also gives the application developers feedback as to which queries may need attention or are not being used the way they originally intended and are thus generating more load on the database than expected.

I’ve been using merlin for about a year now and the agent which runs locally on the server where the mysqld process runs has been deployed to many servers.

I have also been using cacti and nagios to monitor the same servers: nagios provides the up-to-date monitoring and alerting of warning or critical conditions, and cacti provides many graphs showing key performance indicators and how they change over time.

So how does Merlin do compared to these other tools?

Simply put, from the database perspective Merlin provides a lot more detail and MySQL knowledge than is possible with more generic tools. The graphing provides a much higher resolution than cacti, which is very good for short-term analysis of loads, but we still face some issues looking at longer-term graphical information because of the number of servers that Merlin is being asked to monitor.

The advisors try to help the DBA by checking various things and saying what needs adjusting. The idea is good but in my opinion this still needs some work. While certain things like security checks (no passwordless accounts) work well and ensure we do not miss obvious problems, other checks get triggered by thresholds which are too low, at least in a production environment. As there are so many rules, adjusting the thresholds by hand is not really feasible, so some sort of auto-calibration is needed.

I think the other problem Merlin has with its advisors is that it’s not sure whether it’s trying to monitor servers or just point out problems. There is the ability to send out email notifications (like nagios) and indeed if we experience a problem like a server going down we get notified. We don’t, however, get notified when the condition is resolved, and that’s needed.

So the advisors need quite a lot more work to get right but things are going in the right direction.

We have recently been trying out the query analysis functionality and this has been invaluable. Up until now the only way to see what is happening on the server has been to look at the mysql slow logs, but these completely ignore any query which takes less than a second, so for fast queries which run frequently they are no use. The other alternative is to use the mysql general log, but turning this option on or off in 5.0 requires a mysqld restart, which is rather intrusive. We are not ready yet for the switch to 5.1, which solves this problem, but even then you have to write the scripts to analyse these queries, whereas Merlin does this for you out of the box.
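For reference, in 5.1 the general query log finally became switchable at runtime, whereas in 5.0 the log option could only be set in my.cnf before startup. The log file path is a placeholder:

```sql
-- MySQL 5.1: turn the general query log on and off without a restart
SET GLOBAL general_log_file = '/var/log/mysql/general.log';
SET GLOBAL general_log = 'ON';
-- ... capture some traffic, then switch it off again ...
SET GLOBAL general_log = 'OFF';
```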

All these problems are overcome when using the proxy. While it may be necessary to adjust the grants, once this has been done the application transparently connects to the proxy and does its thing. In the meantime the proxy passes information, via the agent running on the database server, back to the Merlin server, where you can see all the queries as they come in and are executed. Queries are normalised so that similar queries with different parameters are grouped together, and Merlin’s web interface lets you look at these queries, combining them if needed from one or more database servers. The query information can be ordered by a variety of criteria such as total execution time, average execution time and execution count, and there are additional filtering possibilities.

That’s extremely informative. Using this functionality on some of our production servers has allowed us to see a few queries which were a lot more expensive than we had anticipated and to adjust things to improve performance. We adjusted indexes and changed application code to behave differently, and the net result was that, because we were better able to focus on the problem, we reduced CPU load on some servers to less than 50% of what it had been before. That is a significant advantage. It helped me as a DBA and the application developers too.

Using Merlin in the development environment also allows you to test things before they are deployed and be more confident of how things will change once in production. That is something we should do more frequently.

So I am happy with merlin and it has enabled me to answer questions I would not have been able to answer without it. Merlin allows me to monitor many things from one single location and that is good. As the number of servers you need to manage increases having a central portal to watch these servers is certainly convenient.

The rough edges where Sun needs to spend some effort are:

  • scalability on larger sites, where perhaps some of the original design decisions are not as appropriate as might have been originally envisaged.
  • the advisors need some work, but have the potential to provide some really nice “real DBA” functionality while also providing the more traditional warning and error notifications seen in systems like nagios.

However, in spite of these rough edges, if you are in the situation to try merlin I do not think you will be disappointed. It is likely to have something for you.