Merlin is MySQL’s Enterprise Monitor, a system designed to monitor a number of MySQL database servers and provide useful feedback to the DBAs as to how they are running and what things may require attention.
It provides various things:
- graphs of key database and server performance indicators: load average, cpu usage and mysql specific information such as number of database connections, replication delays, buffer usage statistics, etc…
- a set of advisors which are intended to tell you if things are not configured properly and what to do to correct the problem.
- monitoring of servers which are linked together through replication
- the latest version of merlin (v2.0, which has been available since the beginning of the year) additionally adds via mysql proxy the ability to see all the queries which are passing through the database and then to group similar queries together so that you can work out which are the most expensive using different criteria.
This final information is invaluable. MySQL has been lacking what is available in other more expensive databases such as Oracle. Merlin gives a much greater insight into what is going on in the database and also gives the application developers feedback as to which queries may need attention or are not being used the way they originally intended and are thus generating more load on the database than expected.
I’ve been using merlin for about a year now and the agent which runs locally on the server where the mysqld process runs has been deployed to many servers.
I have also been using cacti and nagios to monitor the same servers, nagios provides the up to date monitoring and alerting of warning or critical conditions and cacti provides many graphs showing key performance indicators and how they change over time.
So how does Merlin do compared to these other tools?
Simply put from the database perspective merlin provides a lot more detail and MySQL knowledge than is possible with more generic tools. The graphing provides a much higher resolution than cacti which is very good for short-term analysis of loads, but we still face some issues looking at longer term graphical information because of the number of servers that merlin is being asked to monitor.
The advisors try to help the DBA by checking various things and telling him what needs adjusting. The idea is good but in my opinion this still needs some work. While certain things like security checks (no passwordless accounts) work well and ensure we do not miss obvious problems, other checks get triggered by thresholds which are too low at least in a production environment. As there are so many rules doing this by hand is not really feasible, so some sort of auto calibration is needed.
I think the other problem that merlin has with its advisors is that it’s not sure if it’s trying to monitor servers or just point out problems. There is an ability to send out email notifications (like nagios) and indeed if we experience a problem like the server going down we get notified. We don’t get notified however when the condition is resolved and that’s needed.
So the advisors need quite a lot more work to get right but things are going in the right direction.
We have been recently trying out the query analysis functionality and this has been invaluable. Up until now the only way to see what is happening on the server has been to look at the mysql slow logs, but this completely ignores any queries which take less than a second, so for fast queries which are run frequently this is no use. The other alternative is to use the mysql general log but to turn this option on or off in 5.0 requires a mysqld restart so is rather intrusive. We are not ready yet for the switch to 5.1 which solves this problem but even then you have to write the scripts to analyse these queries and merlin does this for you out of the box.
All these problems are overcome when using the proxy. While it may be necessary to adjust the grants, once this has been done the application transparently connects to the proxy and does it’s thing. In the meantime the proxy passes information via the agent which is running on the database server back to the merlin server and there you can see all the queries as they come in and are executed. Queries are normalised so similar queries with different parameters are grouped together and merlin’s web interface allows you to look at these queries combining the queries if needed from one or more database servers. The query information can be ordered by a variety of criteria such as total execution time, average execution time and execution count and there are additional filtering possibilities.
That’s extremely informative. Using this functionality on some of production servers has allowed us to see a few queries which were a lot more expensive than we had anticipated and adjust things to improve the performance. We have adjusted indexes, changed application coding to behave differently, but the net result was because we were better able to focus on the problem we were able to reduce CPU load on some servers to less than 50% of what it had been before. That is a significant advantage. It helped me as a DBA and the application developers too.
Using merlin in the development environment also allows you to test things before they are deployed and be more confident of how things will change when deployed. That is something that needs using on a more frequent basis.
So I am happy with merlin and it has enabled me to answer questions I would not have been able to answer without it. Merlin allows me to monitor many things from one single location and that is good. As the number of servers you need to manage increases having a central portal to watch these servers is certainly convenient.
The rough edges where Sun needs to spend some effort are:
- scalability on larger sites where perhaps some of the original design decisions are not as appropriate as might have been originally envisaged.
- the advisors need some work but have the potential to provide some really nice “real DBA” functionality while also providing the more traditional warning and error notifications seen in systems like nagios
However, in spite of these rough edges, if you are in the situation to try merlin I do not think you will be disappointed. It is likely to have something for you.