Query analysis plugin for 5.5!

A colleague pointed me at More fun with the MySQL Audit Plugin API, which looks very interesting. Analysis of the queries running inside a mysqld has been wanted for some time. Until now it has only been possible with external add-ons such as MySQL Enterprise Monitor, which do a good job; however, the real place for this functionality is inside the database server itself. If 5.5 m3 provides the hooks to do this, that's great news, and while Anders' first implementation may be simple, it can surely be extended in many ways.

The things I would like to see added to this plugin are the following, many of which are safeguards to ensure you can use functionality like this on a production system without bringing the server down.

  1. The size of the normalised query cache should be configurable, so that memory use is under control. Once the cache is full, older queries should be thrown out.
  2. If the cache is not full but a query has not been updated in some time, then that query should also be thrown out.
  3. Include columns such as time_first_seen, total_execution_time, shortest_execution_time, longest_execution_time, total_rows_returned, minimum_rows_returned, maximum_rows_returned, statements_returning_warnings and statements_returning_errors.
  4. A big on/off switch that can perhaps be triggered by excessive load, thus saving the db server from taking a nose dive. Perhaps something as simple as a maximum number of queries to analyze per second; when that is exceeded, the analysis functionality is simply skipped. This should be easy to implement, though it may produce a bottleneck around this counter.
  5. Include an optional filter so large queries can be ignored. I have seen multi-megabyte queries, and if you analyze hundreds of these you need a lot of memory.
  6. Include an optional filter to select queries by the database being changed.
  7. Include an optional filter to select queries by the database user making them.
  8. A command to clear the cache data.
  9. Some extra global counters so that we can see if we are dropping data: time cache last cleared, number of queries dropped for some of the reasons indicated above.
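Items 4, 5 and 9 above could be combined into a small gate in front of the analysis code. The sketch below is a minimal, illustrative C fragment, not part of any MySQL plugin API: the names `MAX_ANALYZED_PER_SECOND`, `MAX_QUERY_LENGTH` and `should_analyze()` are my own inventions. It skips analysis when a query is too large or when a per-second budget is exhausted, and counts everything it drops.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical tunables -- illustrative names, not real server variables. */
#define MAX_ANALYZED_PER_SECOND 500   /* item 4: per-second analysis budget */
#define MAX_QUERY_LENGTH 65536        /* item 5: skip multi-megabyte queries */

static time_t window_start = 0;              /* start of current 1s window */
static unsigned long analyzed_this_second = 0;
static unsigned long queries_dropped = 0;    /* item 9: global drop counter */

/* Return 1 if the query should be analysed, 0 if it should be skipped.
   A real plugin would need a mutex or atomic counter here, which is
   exactly the bottleneck concern mentioned in item 4. */
static int should_analyze(size_t query_length, time_t now)
{
    if (query_length > MAX_QUERY_LENGTH) {   /* too large: skip and count */
        queries_dropped++;
        return 0;
    }
    if (now != window_start) {               /* a new one-second window */
        window_start = now;
        analyzed_this_second = 0;
    }
    if (analyzed_this_second >= MAX_ANALYZED_PER_SECOND) {
        queries_dropped++;                   /* budget exhausted: skip */
        return 0;
    }
    analyzed_this_second++;
    return 1;
}
```

The audit hook would call `should_analyze()` before doing any normalisation work; when it returns 0 the query still executes untouched, so an overloaded server degrades to "no statistics" rather than "no service".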

This sort of information would allow software such as Merlin to collect the data directly from the server and pass it back for longer-term storage, which is good for enterprise users, whilst ad-hoc users would have a quick way to check their servers, see what they are doing, and find the heaviest queries running on their systems.

So thanks MySQL for adding the hooks, thanks Anders for starting this plugin and I hope that it will grow and perhaps do some of the things I outline above.

I’m looking forward to seeing a GA version of 5.5.

Published by

Simon J Mudd

Born in England, I now live in Spain, but spent a few years living in the Netherlands. I previously worked in banking (financial markets) both in IT and as a broker, but IT has always had a stronger influence. Now working at booking.com as a Senior Database Administrator. Other interests include photography and travel. Simon is married, with two children, and lives in Madrid.

2 thoughts on “Query analysis plugin for 5.5!”

  1. Simon!

    My code in that blog is really meant as an example. But most of what you ask for is certainly possible without too much effort. Using configuration variables for this or that is not too difficult, but it still lacks proper documentation, so I decided to skip that.
    Warnings, errors and row counters are simple enough. There is an error counter right now, but no warning counter. One thing I know is missing is the current database name. All in all, some assembly is required, but no more than for your average Airfix scale-model Spitfire.


  2. Hi Anders. Thanks for replying.

    Your code may be an example, but I use Merlin (MySQL Enterprise Monitor) on a daily basis to collect data such as this, and it's extremely useful. However, until now it has not been possible to get this information out of the server directly, and tools such as Merlin can sometimes cause problems when run on production systems: they are not fully transparent to the clients, and the analysis overhead may be enough to cause performance problems.

    Analysing the query in the server itself means you don't need a second copy of the data somewhere else, nor is the query "proxied" through some other system before arriving at the mysqld server.

    If you do not plan on working on this further that is fine. I just hope that someone does as this functionality inside mysqld is something a lot of people would really like.