MariaDB 10.1 Brings Compound Statements

October 19th, 2014

A very old post of mine from 2009, “MySQL’s stored procedure language could be so much more Useful”, suggested that it would be nice if MySQL could use compound statements directly from the command line, in a similar way to the language used for stored procedures. I’ve just seen that this now seems to be possible in MariaDB 10.1. See the release notes.

I now need to look at this. So thanks: it looks like this feature request has now been implemented.

Making MySQL Better More Quickly

September 13th, 2014

With the upcoming release of MySQL 5.7 I begin to see a problem which I think needs attention at least for 5.8 or whatever comes next.

  • The GA release cycle is too long, being about 2 years and that means 3 years between upgrades in a production environment
  • More people use MySQL and the data it holds becomes more important. So playing with development versions while possible becomes harder.  This is bad for Oracle as they do not get the feedback they need to adjust the development of new features and have to best guess the right choices.
  • Production DBAs do want new features and crave them if it makes our life easier, if performance improves, but we also have to live in an environment which is sufficiently stable.  This is a hard mixture of requirements to work with.
  • In larger environments the transition from one major version to another, even when automated can take time. If any gotcha comes along then it may interrupt that process and leave us with a mixed environment of old and new, or simply in the state of not being able to upgrade at all.  Usually that pause may not be long but even new minor versions of MySQL are not released that frequently so from getting an issue fixed to seeing it released and then upgrading all servers to this new version is again another round of upgrades.

I would like to see Oracle provide new features and make MySQL better. They are doing that and it is clear that since I have been using 5.0 professionally up to the current 5.7 a huge amount has changed. The product is much more stable and performs much better, but my workload has also increased so I am still looking for more features and an easier life. I am an optimist that is for sure.

One issue that I believe holds back earlier experimentation is that MySQL is not modular. Even the engines that you can use in it, if built as plugins, do not seem to be switchable from one minor version to another.

This leads to 2 issues:

  • any breakage or bug (and all software has bugs, that is inevitable) requires you when it is fixed to upgrade to a new version. That new version has changes in many different components. Sometimes that is fine but sometimes that may bring in new bugs which cause their own problems
  • potentially the developers of MySQL could replace a “GA module” with a more experimental version of that module which maybe has more features, could perform better but maybe breaks. Changing a single module is hopefully much safer than changing a full binary for a development version, and that should be much easier to do on spare machines. A module such as this would be something I could much more easily test than installing 5.7.4 on lots of machines.

However, the problem is that MySQL is not modular and that is where several people have explained to me my madness and how hard it is to achieve things like this. My current employer likes to push out changes in small chunks, look at the result of those small changes and then if they seem good, go ahead and do more. If something goes wrong, back it out and look elsewhere to do things. Doing the same on a database server not designed that way may well be hard, but making small changes along these lines would I think longer term help improve things and give the people that use a GA MySQL the opportunity to try out new ideas, give feedback quickly and allow things to evolve.

Inevitably when you start to build interfaces like this some interfaces need to change to allow for a larger redesign of the innards of a system. That is fine: when it happens we’ll move over to that, a DEV version will have these new much improved features, and we may have to wait longer for that.

What modules might I be talking about when I talk about modularising MySQL?  I’ll agree I do not know the code other than having glanced at it on several occasions but there are some quite clear functional parts to MySQL:

  • the engines have often been plugins, though now InnoDB is a bit of an exception. I still wonder if that is necessary whatever MySQL’s design.  However these plugins do not seem to have a completely clear interface with MySQL as I have seen plugins, for example for something like Spider or TokuDB, which work only for a specific MySQL or MariaDB version. That just shows that whatever this interface is it is not designed to be stable and swappable between different MySQL minor versions.  Doing something to make that better would mean that people who build a new engine can build it once for a major version and know that on binaries built the same way the files they produce should just plug in without issue unchanged. Me dreaming? Perhaps, but no-one worries when I upgrade my db4 rpm from 4.7.25 to 4.7.29 that all the applications that use it will break: the expectation is clear that it should not make any difference at all. Why does something like this not work with MySQL engine code?
  • logging has been rather inconsistent for a long time. I think it may improve in 5.7, but however it’s built, build it as a module. If I want to replace that module with something new that stores all my log data in a Sybase or DB2 database MySQL should not care, assuming the module does the right thing and there are settings to configure this appropriately.  The point being also that if there is a bug in the logging, the bug can be fixed and the module replaced with a bug-free version, without necessarily requiring me to upgrade the whole server.
  • Replication is generally split into 2 parts: the writing to binlogs and the reading of those binlogs from a master, storing them locally and reloading the relay logs and processing them.
    • I have seen bugs in replication, mainly in the more complex SQL thread component where the same change could potentially apply. Swap out the module for a fixed one.
    • MySQL 5.6 was supposed to make life great with replication and we would not get stuck in a situation where a crashed server would come up, out of sync with its master, and because of that we would need to reclone the server again. Even when moving over to using the master_info_repository and relay_log_info_repository settings to TABLE you can have issues. The quick fix implemented by Oracle of relay_log_recovery = 1 sounds great. It is a quick, cheap and cheerful solution which works assuming you never have delayed slaves.  Different environments I maintain do not follow this pattern and I have servers with a deliberate multi-hour delay, which can be useful for recovering from issues. Also copying large databases between datacentres may take several days, triggering after starting the system a need to pull logs and process them for several days. A mistaken restart would lose all that data and require it to be downloaded again which is costly. So I have discussed with colleagues a theoretical improved behaviour of the I/O thread should MySQL crash but there is no way to test it on boxes I currently use. Making the I/O thread into a module would make it much easier to try out different ideas on GA boxes to show whether these ideas are really workable or not.
  • The query parser and optimiser in MySQL is supposed to be a horrendous beast that everyone must keep clear of.  Improvements are happening and posts like this are an indication of progress. My understanding is that this beast is spread all over the server code and thus hard to untangle but certainly from a theoretical point of view doing so would allow alternative optimisers to be usable/pluggable, and for example different optimisers might be better at handling workloads such as batch based workloads with sub queries and such which MySQL is known not to handle well, but which for certain workloads could potentially make a great deal of difference to us all.  The MySQL of 5.0 is quite different from the MySQL of today and sharding is the norm, but that requires help from the app to do all the dirty work. Other options are to use something like Vitess, ScaleBase, or Spider, or some built-in new module which knows about this type of thing better and can do this sort of stuff transparently to the application. MySQL Fabric tries to do this at the application level and that’s fine, but it adds much more complexity for the application developers who probably should not really have to worry (too much) about this type of detail.  So solving the problem is not the issue here, it’s providing hooks to let others try, or simply to swap out version 1 with version 10, and see if version 10 is better and faster, with everything else unchanged.
  • The handling of memory in MySQL has always been interesting to us all. Each engine has traditionally managed the memory it needs itself and there is no concept of sharing, or memory pressure, all of which can lead to sudden memory explosions due to a changing workload which may kill mysqld (Linux OOM) or trigger swapping (database servers should never swap…). I have seen in 5.7 that there is now some memory instrumentation and this at least allows looking to see where memory is used. The next step would be to use the same memory management routines, and finally perhaps to add this concept of memory pressure allowing a large query if needed to page out or reduce the size of the innodb buffer pool while it is running, or the heavy use of some MyISAM or Aria tables could do the same.  Doing that is hard, but we are no longer using a MySQL “toy” database. Many large billion $ companies depend on MySQL so this sort of functionality would be most welcome there I am sure.  Changes in this area would certainly need to be done cautiously but I can envisage swapping out the default 5.8 memory manager for a “new feature” 5.9 version with all the “if it breaks you keep the bits” warnings attached, allowing us to see if indeed problematic memory behaviour is resolved by this new module.
  • The event scheduler is in theory a small component which does its thing.  An early version of 5.5 had some bugs and I had to wait a long time to upgrade the server just to fix this pesky event_scheduler module, when all it does for me is send out heartbeat changes used for measuring replication delay.  Had this been a module I could have installed a fixed version and not had to use a workaround for several months.

I am sure there are lots of other components of MySQL which could receive the same treatment.

Making these sorts of changes is of course a huge project and most managers do not see the gain of this, certainly not short term.  However, if care is taken as different subsystems are modified, there is an opportunity for making progress and allowing the sort of experimentation I describe.  Also, and while Oracle may not see it this way, having a clearer interface and more modular framework would allow others to perhaps try different things, and replace a module with their own.  Oracle do seem to be putting a lot of resources into MySQL and that is good, but they do not have infinite resources and they cannot solve every need, specialised or otherwise, that we might see. Making it easier, for those who can, to use this hypothetical modular framework provides an opportunity for some things to be done which cannot be done now.  Add a bounty feature and let people pay for that, and where something is modularised it will be much easier for them to try to solve problems that may come up. In any case, later testing will be easier if these interfaces exist.

This is the way I would like to see MySQL improve, notice I do not actually talk about functional improvements, but how to make it potentially easier to experiment and test these new features. This sort of design change would allow those of us that need new features now to test and perhaps include them in our GA versions. Maybe then the definition of GA will become rather vague if I am using 5.7.10 + innodb 5.8.1 + io_thread_5.8.3 + sql_thread_5.8.6 + event_scheduler….. Support will probably hate the suggestion I have just made as it would potentially make their life more challenging, but then again I do not see most people playing this game. It is meant for those of us who need it, and if not needed at all bug fixing specific issues should be much easier than now, where you need to do a full new test on a new version to make sure you do not catch another set of new bugs.

If you have got to the end of this thanks for reading. I need to learn to write less but I do believe that the reasoning I make above makes a lot of sense. This can only be done with small changes and with people seeing the idea and trying it out, and at least initially doing it on parts of the system which are easy to do. If they work further progress can be made.

Oracle and MariaDB both want feedback and ideas of where we want MySQL / MariaDB to go.  Independently of some of the technical aspects of new features and improvements this is my 2 cents of one thing I would like to see and why.

Does it make sense?

MariaDB 10.0 upgrade goes smoothly

August 7th, 2014

I have been meaning to update some systems to MariaDB 10.0 and finally had a bit of time to get around to that.  The documentation of specifics of what’s needed to go from MariaDB 5.5 to 10.0 can be found here and while it’s not very long it seems there’s little to actually do.

Having already upgraded some servers from MySQL 5.5 to 5.6 the process and appropriate configuration changes were very similar so all in all a rather non event.

One thing which is always a concern if systems can not be down for long is the time to do the upgrade. While you see many blog posts talking about taking a backup via mysqldump and then loading it all back this is not really an option on many systems I manage and a replacement of binaries, adjustment of /etc/my.cnf  and restart of the server with the new binaries followed by running mysql_upgrade is what I usually do.  That usually works fine.

One server I had to upgrade had quite a few files (15,000) and the database occupied 2 TB. The run time of mysql_upgrade on such a system takes a couple of hours of checking tables after which the actual system changes take almost no time at all. So that is something to be aware of if your dataset is similar.

It seems I was confused by Colin’s post on performance_schema being disabled in MariaDB 10.0.12 and later. It seems that actually it’s disabled on startup, so can be easily configured if desired by setting performance_schema = 1 in /etc/my.cnf prior to starting mysqld. I had thought that it was not compiled into the binaries at all, which is what’s been done in WebScaleSQL. That is not the case.

There’s a lot of talk about performance_schema overhead, even recently from colleagues, so this is a subject which needs looking at in more detail and if there is indeed an unacceptable overhead then that needs looking at. I’m sure Oracle or MariaDB would appreciate reports of specific issues as otherwise there’s just too much fud out there.

Anyway the MariaDB 10.0 servers I upgraded did have p_s and my configuration enabled it. That’s nice as now I can see where some of the load and performance points are, and had thought that would not be possible. I also tried Mark Leith’s mysql_sys and was not sure if it would work in MariaDB 10.0 but a quick look seems to indicate it does which is helpful.

The views in this sys schema are very useful but some care is needed when using them as several tables are joined in performance_schema and if the number of rows involved is high, and given performance_schema has no indexes, this can make queries be very costly, taking minutes to run. That’s not been mentioned, so take a little care. Under normal usage this does not seem to be an issue, it really depends on different use cases. Longer term I think that P_S will need indexes, even if these are only “memory tables” …
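To see why the lack of indexes matters for joins, the sketch below counts the work needed to join two row sets with and without a lookup structure. It is a toy model in Python, not how performance_schema or the sys views are actually implemented; the "threads" and "events" names and the row counts are made up for illustration.

```python
def nested_loop_join(left, right):
    """Join on the first tuple field with no index: every left row scans all of right."""
    comparisons = 0
    out = []
    for l in left:
        for r in right:
            comparisons += 1
            if l[0] == r[0]:
                out.append((l, r))
    return out, comparisons


def indexed_join(left, right):
    """Same join, but build a lookup (an 'index') on right first."""
    index = {}
    for r in right:
        index.setdefault(r[0], []).append(r)
    lookups = 0
    out = []
    for l in left:
        lookups += 1
        for r in index.get(l[0], []):
            out.append((l, r))
    return out, lookups


# Hypothetical data: 200 "threads" and 1000 "events", each event belonging to one thread.
threads = [(i, "thread%d" % i) for i in range(200)]
events = [(i % 200, "event%d" % i) for i in range(1000)]

rows_a, cost_a = nested_loop_join(threads, events)
rows_b, cost_b = indexed_join(threads, events)
print(cost_a, cost_b)  # row comparisons without an index vs. lookups with one
```

Both joins return the same rows, but the unindexed version does work proportional to the product of the two row counts, which is why a join over several large unindexed tables can take minutes.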

The upgrade went smoothly, so now it’s time for me to check the new MariaDB 10.0 features and see how they fare.

 

MySQL Slave Scaling and more

July 16th, 2014

Jean-François talks about binlog servers. Take a look here: http://blog.booking.com/mysql_slave_scaling_and_more.html

Time to get some 128-bit types into MySQL?

June 30th, 2014

I think that getting 128-bit types into MySQL would be good. There are a few use cases for this and right now we have to work around them. That should not be necessary.  While not essential they would make things easier.

The headline is easy to understand, but is this really needed?

First we need to look to see where this might be used. I can think of three different 128-bit types which are missing at the moment:

  • IPv6 addresses
  • uuid values
  • a bigger value than (signed) bigint [64-bit numbers]

IPv6 Addresses

IPv6 addresses are 128-bit numbers, and having a native way to store them would be really helpful. Given this also includes an IPv4 representation then for those people who store IP addresses (client connections and other things) such a native type would be much better than the typical unsigned int or binary(4) which you might be using now. Is this an IPv4 address? Well it might be, but it also might not be.  The same applies to IPv6, and having a real IPv6 type makes this knowledge more explicit.

MySQL already provides support routines for IPv4 (even if the type does not exist) such as INET_ATON(), INET_NTOA() so a similar set of routines would be needed to support this type, converting between their text and numeric representation and also for converting between IPv4 and IPv6.
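As a rough illustration of what such conversion routines would do, here is a minimal Python sketch using the standard ipaddress module. It treats every address as a 128-bit number and stores IPv4 addresses in the IPv4-mapped form (::ffff:a.b.c.d). The function names ip_to_int128 and int128_to_ip are hypothetical and are not MySQL functions; the addresses are documentation examples.

```python
import ipaddress


def ip_to_int128(text):
    """Return (value, is_ipv4) where value fits in 128 bits.

    IPv4 input is stored as an IPv4-mapped IPv6 address (::ffff:a.b.c.d).
    """
    addr = ipaddress.ip_address(text)
    if addr.version == 4:
        # 80 zero bits, 16 one bits, then the 32-bit IPv4 value.
        return (0xFFFF << 32) | int(addr), True
    return int(addr), False


def int128_to_ip(value):
    """Convert a 128-bit integer back to text, unwrapping IPv4-mapped values."""
    addr = ipaddress.IPv6Address(value)
    mapped = addr.ipv4_mapped
    return str(mapped) if mapped is not None else str(addr)


v4, _ = ip_to_int128("192.0.2.1")
v6, _ = ip_to_int128("2001:db8::1")
print(hex(v4))           # 0xffffc0000201
print(int128_to_ip(v4))  # 192.0.2.1
print(int128_to_ip(v6))  # 2001:db8::1
```

With a single 128-bit representation the question "is this an IPv4 address?" is answered by the value itself rather than by the column type you happened to choose.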

UUID Values

MySQL itself uses UUID values in 5.6 and above as the server_uuid, but it’s stored or seems to be as a string. Other software (MEM is a good example) also uses UUID values in various places.

Have a look on search engines for MySQL and UUID and you see lots of questions on how to best store these values in MySQL. So there is already a demand for this, and no good answers as far as I can see.

One common concern I have currently when storing such values as binary(16) is that the values are hard to visualise, especially if used as a primary key, and also from the DBA’s point of view who may want to “manually” access or modify data it is not possible to do something similar to SELECT name FROM servers WHERE uuid = 'cd2180ae-9b94-11e2-b407-e83935c12500', as this just does not work. Casting could make this work magically but right now it’s much harder than it should be.  There is not a single UUID format but the basics are the same and if we had a uuid format any supporting routines (which would be needed) would be able to convert as needed.
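The conversion that currently has to be done by hand can be sketched in Python with the standard uuid module. This only illustrates the text-to-16-bytes round trip that a native type would hide; the helper names and the toy "servers" mapping are hypothetical.

```python
import uuid


def uuid_text_to_bin(text):
    """Text form (with dashes) -> 16 raw bytes, as you might store in BINARY(16)."""
    return uuid.UUID(text).bytes


def uuid_bin_to_text(raw):
    """16 raw bytes -> canonical lowercase text form."""
    return str(uuid.UUID(bytes=raw))


# A toy "servers" table keyed by the binary value (made-up data).
servers = {uuid_text_to_bin("cd2180ae-9b94-11e2-b407-e83935c12500"): "db1"}

# Looking a row up by the readable form means converting first.
key = uuid_text_to_bin("cd2180ae-9b94-11e2-b407-e83935c12500")
print(len(key))               # 16
print(servers[key])           # db1
print(uuid_bin_to_text(key))  # cd2180ae-9b94-11e2-b407-e83935c12500
```

A native uuid type would let the server do this conversion implicitly, so the readable form could be used directly in a WHERE clause.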

Signed or unsigned integers

Yes, the (signed or unsigned) bigint type gives us 64-bits and that allows for huge numbers but one size bigger matches the use cases above, so it’s good to be able to convert between them depending on the usage.  That is if we’re going to have IPv6 and UUID type values, it makes sense to allow an integer equivalent representation and sometimes this might be needed when stripping out parts of a uuid, or parts of an IPv6 address.  The name of this type should be something a little better than we’ve seen before so hugeint (unsigned) would not be what I would suggest. Something as simple as int128 (unsigned) would be much easier to understand.

Conversion routines

Each of the three types above needs routines to support its “native” usage and probably converting from / to numeric or text representations of the value.  Given the three types have the same size then it may also be useful to convert from one format to another. The actual content would not change, just its representation. Included with this would be BINARY(16), so that people who have had to use other MySQL types to represent these values have an easy way to convert more explicitly to the new ones, and if for any reason a conversion back is needed this is also possible.
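The point that only the representation changes can be shown with a short Python sketch: the same 16 bytes are viewed as an unsigned 128-bit integer, as UUID text and as IPv6 text. The function name views_of is hypothetical and the byte value is an arbitrary example.

```python
import ipaddress
import uuid


def views_of(raw):
    """Return the int128, UUID text and IPv6 text views of the same 16 bytes."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes")
    return {
        "int128": int.from_bytes(raw, "big"),
        "uuid": str(uuid.UUID(bytes=raw)),
        "ipv6": str(ipaddress.IPv6Address(raw)),
    }


raw = bytes.fromhex("20010db8000000000000000000000001")
v = views_of(raw)
print(v["uuid"])  # 20010db8-0000-0000-0000-000000000001
print(v["ipv6"])  # 2001:db8::1

# Converting back from the integer view yields the original bytes unchanged.
print(v["int128"].to_bytes(16, "big") == raw)  # True
```

Because the stored bytes never change, conversions between these views are lossless in both directions.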

ALTER TABLE should be aware of these equivalents too so if I have a table defined with a BINARY(16) I can convert it to an IPv6 address/type as a no-op operation (definition only change), in a similar way as can be done with some other conversions (ENUM being a common type that changes but if you add a new value there’s no need to check the table for existing values as the old definition was a subset of the new one).

No incompatible changes in minor versions please

A change such as this can not reasonably be added as a minor version change as it would break many things.  Minor versions should really, really only include bug fixes or performance improvements, and if a new feature really has to be added, by default it must be disabled (for compatibility) and enabled with some sort of special option. Given there’s no agreed way to do this and it is likely to cause all sorts of issues, just do not do it.

That means that a feature such as this can only be added in a new version such as MySQL 5.7 or MariaDB 10.1 both of which are DEV versions, and so allowed to change in any way their authors deem reasonable. I have seen no indication of 5.7 including this functionality and given the time that 5.7 has been about I am inclined to think that an extra change such as this is unlikely to make it there. So MySQL 5.8 then? MariaDB 10.1 development has not been ongoing for that long so maybe such a feature might be considered there.

In the end we do need these new features, and the long lead times to make them available are a considerable source of frustration for those of us who have a number of systems to upgrade.  One thing is a new version going GA, but it’s something else to have all systems upgraded to use that version and thus make it available to developers.

Whatever happens it would be really helpful if the different “MySQL vendors” talk to each other, if they agree that this is a sensible path to take. Having various different interpretations of how these new types should be stored, converted and which associated functions etc are needed would be a user or developer’s nightmare. I understand there is competition, but for something like this it is really important to get it right.  The first implementor of such a feature would potentially have an advantage over the others but I would expect these data types to be quite popular, so agreeing generally on what to do should not be that hard and would stop the different forks from drifting further apart, something which I think is bad for everyone concerned.

Conclusion

Some people I have spoken to share the opinion that having such a set of 128-bit types would be good. It is something else of course to implement that.  For those looking for new features to develop in MySQL this is one which in theory is not absolutely necessary but which I think would not only be popular but would be used.  In the end MySQL is there to store data and make it easy to retrieve, and it seems clear to me that this type of data is one such usage which, while it can be handled differently, would really welcome “native” support. I hope that this will happen sometime soon.

Update 2014-07-03

MariaDB seems to have some support on its way for this. Referenced on maria-developers on 1st July, details can be found here:

If a plugin type is available for IPv4 that might be good as well.

This looks like work in progress and there’s no mention of a 128bit (unsigned) int, or how to convert between different values, but this looks like a good start. In fact if it’s possible to make these types available via a plugin interface this does seem to add the possibility of adding new special types even once MariaDB is working, so it makes it easier to expand functionality later.

In terms of routines that probably should be available in MySQL to support some of these types the following stand out:

  • INET_PTON() and INET_NTOP() to supplement the existing INET_ATON() and INET_NTOA() functions.
  • GETADDRINFO() and GETNAMEINFO() to convert between IPv4 or IPv6 addresses and names.
  • Something like UUID_LONG() to generate a 128-bit numeric equivalent of UUID(), and functions to convert a text-based uuid into a number and to convert back again, STRING_TO_UUID() and UUID_TO_STRING(), unless there already exists some standard function name for these tasks.

I think all of these look like useful routines to go with the types above. I’ll add more as I think of them.

Update 2014-07-11

I also see this very old bug referenced in the mysql bug list: http://bugs.mysql.com/bug.php?id=15940

webscalesql-5.6.17.69 RPMs available for CentOS 6

May 12th, 2014

A new commit b955fd46ee60b134c6935badb43eb838872cfbbf was pushed out to the webscalesql-5.6 so I’ve built some updated RPMs using my webscalesql-rpm scripts.  The new binaries if you want to try them can be found at http://ftp.wl0.org/webscalesql/.

The rpms are:

Again these packages are work in progress, but feedback is welcome.

MMUG7: Madrid MySQL Users Group meeting to take place on 24th April 2014

April 22nd, 2014

Madrid MySQL Users Group will have its next meeting on the 24th of April. Details can be found on the group’s Meetup page.

We plan to talk about WebScaleSQL and I will give a short presentation on how to build WebScaleSQL RPMs on CentOS 6.  The meeting will be in Spanish.

We’ve changed the place that we’ll be holding the meeting. See the Meetup URL for details. Looking forward to seeing you there.

La próxima reunión de Madrid MySQL Users Group tendrá lugar el jueves 24 de abril. Se puede encontrar más detalles en la página del grupo.  Hablaremos sobre WebScaleSQL y ofreceré una breve presentación sobre como construir RPMS de WebScaleSQL para CentOS 6.  La reunión será en español.

Hemos cambiado el lugar donde se ubicará la reunión. Mirar la URL del Meetup para más detalles. Esperamos veros allí.

WebScaleSQL RPMs for CentOS 6

April 1st, 2014

Looks like this post was rather unclear. See the bottom for how to build the rpms quickly.

WebScaleSQL was announced last week. This looks like a good thing for MySQL as it provides a buildable version of MySQL which includes multiple patches from Facebook, Google, LinkedIn, and Twitter needed by large users of MySQL, patches which have not been incorporated into the upstream source tree.  Making this more visible will possibly encourage more of these patches to be brought into the code sooner.

The source is provided as a git repo at https://github.com/webscalesql/webscalesql-5.6 and as detailed at http://webscalesql.org/faq.html the documentation says there is currently no intention to provide binaries.

Instructions on building the binaries and the build requirements for WebScaleSQL can be found at http://webscalesql.org/faq.html and do not look too hard. However, I prefer to install my software as rpms as this makes upgrading or removing it later much easier.

With that in mind I thought I’d try and build some webscalesql rpms.

As I’m currently using MySQL-5.6 rpms, downloaded from http://dev.mysql.com/downloads/mysql/, I wanted to build WebScaleSQL rpms which were compatible with these.  I’m aware of the Oracle-built “community” rpms which are downloadable directly from their yum repo (https://dev.mysql.com/downloads/repo/) and will probably use these when upgrading to MySQL 5.7, but moving over to that now requires changing internal infrastructure and is currently not worth the effort.

In order to build the WebScaleSQL rpms I did the following:

  • download the latest source rpm, MySQL-5.6.17-1.el6.src.rpm from http://dev.mysql.com/downloads/file.php?id=451516
  • extract the spec file: rpm -ivh MySQL-5.6.17-1.el6.src.rpm and look in the directory specified by rpm --eval '%{_specdir}' for the spec file (mysql.spec)
  • install the devtools package in order to use GCC 4.7:

  •  clone the webscalesql.git repo to any directory, let’s call it  $WSS_HOME
  • create a webscale-5.6.tar.gz tar ball and put it in the RPM SRC_DIR:
  • create a minimally changed webscalesql.spec file, based on mysql.spec, to include the new path for the compiler toolchain
  • build the rpm by cd’ing into the directory with the webscalesql.spec file and running:

Warning: no explicit check is made for the devtools chain to be installed and compilation will break if you try to use the native gcc compiler.

  •  this built the following rpms:

These rpms should work for CentOS 6, RHEL 6 and other equivalent distributions.  I have not actually tried to use any of the packages except the webscalesql-server.

I have had very little time so far to play with this, but did replace the MySQL-server package with webscalesql-server on a development server and let it run for a few hours.

One thing I did notice is that the performance_schema* settings I had in /etc/my.cnf were not recognised by webscalesql-server and had to be commented out. That said performance_schema still seemed to be there.

I need to check further but guess that this may be due to differences between MySQL and webscalesql or potentially something I have not done correctly when building.

Other than that the server replicated fine and I saw no issues.

This has given me some basic rpms for testing.  I have not tested the package on anything other than CentOS 6 and it is likely that other changes are needed. I probably need to do a few other things like:

  • Clean up the package further maybe adjusting copyrights or other messages about the packages.
  • Obsolete the installed MySQL-server, … rpms so I can just do rpm -Uvh webscalesql-server …. rather than remove the MySQL-server package first.
  • Add a bit of scripting to incorporate the date of the latest webscalesql commit into the version/release settings in the spec file. This avoids having to manually change the different values and as updates happen a rerun of the build script should just build a new package transparently.
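The last item could be scripted along these lines. This is only a sketch in Python under stated assumptions: the spec file is assumed to contain a plain "Release:" line, the commit date is assumed to come from something like git log -1 --format=%ci run in the source checkout, and the function names and the example date are hypothetical.

```python
import re
from datetime import datetime


def release_from_commit_date(commit_date):
    """Turn a commit date such as '2014-05-12 10:30:00 +0200' into '20140512'."""
    return datetime.strptime(commit_date[:10], "%Y-%m-%d").strftime("%Y%m%d")


def stamp_spec(spec_text, release):
    """Replace the value on the first 'Release:' line, leaving the rest untouched."""
    pattern = re.compile(r"^(Release:[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + release, spec_text, count=1)
    if count == 0:
        raise ValueError("no Release: line found in spec")
    return new_text


# Illustrative spec fragment and date, not taken from the real spec file.
spec = "Name: webscalesql\nVersion: 5.6.17\nRelease: 1\n"
release = release_from_commit_date("2014-05-12 10:30:00 +0200")
print(stamp_spec(spec, release))
```

Rerunning the build after new commits then only needs the date to be read again; nothing in the spec file has to be edited by hand.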

I have not yet had time to look at the patches that have been applied to WebScaleSQL. It would certainly be nice to have some sort of list of functional changes (such as the performance_schema difference I noted earlier, assuming this is not a build error) of WebScaleSQL compared to the upstream source and any new configuration settings. Perhaps that will happen later?

At least for those of you who want to run a quick test of the binaries, or look at my spec file, you can find them on my website: http://ftp.wl0.org/webscalesql/.  No guarantees of any kind as you can imagine but feedback and improvements to the current spec file or build procedure would be most welcome.

2014-04-03 Update

See: https://github.com/sjmudd/webscalesql-rpm/ which I have created as a quick helper script to do the build. It still probably needs quite a bit of work but avoids copying instructions and doing stuff by hand.

2014-04-22 Update 2

Basically all you need to do is:

MySQL 5.6 GTIDs: Evaluation and Online Migration

March 28th, 2014

A colleague and I have been looking at GTID on MySQL recently and you may be interested in the blog post that resulted from that. You can see it here: http://blog.booking.com/mysql-5.6-gtids-evaluation-and-online-migration.html

 

MMUG6: Madrid MySQL Users Group meeting to take place on 20th March 2014

March 1st, 2014

Madrid MySQL Users Group will have its next meeting on 20th March. Details can be found on the group’s Meetup page.

I will be giving a presentation on MySQL replication hopefully aimed at all levels, but covering some details relevant to larger setups. The meeting will be in Spanish.

Look forward to seeing you there.

La próxima reunión de Madrid MySQL Users Group tendrá lugar el jueves 20 de marzo. Se puede encontrar más detalles en la página del grupo.  Ofreceré una presentación sobre replicación de MySQL dirigido a gente de todos los niveles, pero incluirá información relevante a entornos más grandes.  La presentación será en español.

Espero veros allí.

MySQL 5.6 GA one year – What is next?

February 9th, 2014

MySQL 5.6 has been GA for just over a year now. See MySQL 5.6.10 Release Notes.  Congratulations on your birthday! That is quite a long time. I was using it earlier in production because it worked and could do things that 5.5 could not do, but earlier versions were to use at your own risk, and indeed if prodded incorrectly would fall on the floor. That is fair enough because they were work in progress, yet if you poked them the right way they did a very good job.  Those dev versions have been long since upgraded which is good so they do not need quite as much care and attention.

So from where I see 5.6 it works very well. One big change that has made a large difference but which I think a lot of people may not really understand or use is the performance_schema database. That provides a huge amount of great and detailed information about what the server is doing, where it is busy and why, and allows you to know these things rather than twiddle and guess which in the past we have been forced to do. So a big thanks for that. It does require quite a lot of study and Mark Leith’s dbahelper goes a long way to making PS useful to the average DBA.  A recent issue came up and I was asked (by Oracle support) to use it to find out the causes: a few SELECTs and hey presto the answers were staring me in the face.  So Oracle is eating its own dog food and it tastes nice…  If you have not had time to look at performance_schema or dbahelper yet take a poke.

The current MySQL 5.6 GA version (5.6.16) is pretty good now. I am aware of a few bugs which I am waiting to get fixed, but all in all the version works pretty well and I am happy with it. I have filed quite a few feature requests, often to make my day-to-day life a bit easier: when MySQL breaks, which happens more often as you watch over more servers, things which might not bother other people become more of a problem. The requests include better logging and additional counters to measure more things, which help when problems occur or help diagnose how frequently a problem might be happening, where that information is currently missing.

I am still working on getting systems upgraded to MySQL 5.6 and have some work to go. It is surprising just how long this can take, but that is often due to changes in a growing business which means that new systems appear, changes happen to existing ones and the time needed to get the MySQL 5.6 upgrade tasks done gets a lower priority than ideally I might like. That said MySQL 5.5 works and works well but 5.6 does have new things I want and shortly those remaining systems will have been upgraded and the job will have been done.

Well not quite.  Recently I have been looking at 5.7 DEV. A lot of work is going on there. Some of this is to give us new features that I have wanted for some time:

  • multi-source replication comes to mind, but that is a parallel branch to 5.7 which I think is unfortunate:
    • I still think that pulling this out into a separate process would have made life much easier as I blogged about previously, and indeed is how Sybase replicates, but I guess that is too large a change to be considered.
    • It is not clear if this feature will make it into 5.7 GA but I hope it will (Oracle please add it, MariaDB 10 has this feature, and for some use cases it will be very attractive)
  • more dynamic replication configuration is a much wanted new feature. A number of times I have been unable to reconfigure a slave without restarting the server and this has been most frustrating. It may have meant that I could not do the required change at all and/or had to find an alternative workaround to solve the problem. So this new feature is clearly helpful.
  • more replication parallelisation, again one of the bottlenecks that many DBAs have come across, promises to make life better
  • improved P_S will carry on and give us more of the information that we need.

So a lot of these new features look quite juicy and more stuff will probably arrive in 5.7.4, so the GA version, when it arrives, is going to be good.

One thing that I have been looking at recently is GTIDs in MySQL. This is quite a topic in itself. Oracle has implemented this in 5.6 a certain way and in MariaDB 10 (currently DEV version) the implementation is approached differently, so that should prove quite interesting as the two implementations are not compatible with each other. The end result that DBAs want is to ensure that it is easier to synchronise the state of master and slaves, and to prevent the re-application of transactions that have already been applied to a system. As it will soon be possible to replicate from multiple sources in MySQL and MariaDB, this topic will become more tricky and help from the “system” is much appreciated.
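To make the “do not re-apply a transaction” goal concrete, here is a toy model in Python. It is only a sketch of the idea and not how either server implements it: it assumes identifiers of the MySQL 5.6 form `source_uuid:transaction_id`, and the class and method names are hypothetical.

```python
from collections import defaultdict


class AppliedGtids:
    """Remember which transactions from which source have been applied."""

    def __init__(self):
        # Maps a source identifier to the set of transaction numbers
        # already applied from that source.
        self._seen = defaultdict(set)

    def apply(self, gtid: str) -> bool:
        """Return True if the transaction should run, False if it is a repeat."""
        # Split on the last ':' so the source part is kept intact.
        source, _, number = gtid.rpartition(":")
        txn = int(number)
        if txn in self._seen[source]:
            return False  # already applied: skip it
        self._seen[source].add(txn)
        return True
```

Because the state is kept per source, the same check still works when a slave receives transactions from more than one master, which is where I expect this kind of help from the “system” to matter most.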

That said while I have used GTID a little on a few systems and it does seem to work and make life easier, it is not quite as nice and easy to use as one would like. Work seems to be ongoing in that direction to improve things and in helping us make the transition from a non-GTID environment to one which is GTID aware.  All that help is most appreciated.

For a replicated MySQL environment that has to run 24 x 7 x 365 downtime is painful to a DBA, and expensive to a business. Things do break. We expect that, whether that is due to bugs, hardware or even human failure.  We then need to be able to recover these systems as best as we can, as quickly as we can and once we have done that be confident that we understand what broke, what damage was done and what further work may or may not be required to complete the job.  So if replication uses GTID that focus has to be understood and taken into account by the developers, and support for GTID should cover those needs.  That is vitally important.

All in all I am looking forward to the preparation for the next GA version of MySQL even if it is still maybe more than 6 months away. It is a bit like planning your summer holiday in the cold winter. Lots of things to look forward to, plenty of interest in trying them, and the slight frustration that you have to wait just a little bit longer before that time comes. I’m looking forward to the sun already…

postfix 2.10.2 RPMs available

December 29th, 2013

I’ve just updated my RPMs to now support postfix-2.10. The Source and binary RPMs (for RHEL5 and RHEL6 x86_64) can be found at the following location http://ftp.WL0.org/official/2.10.

As you probably are aware I am not really following postfix development any longer, but I was recently asked about an issue for an older version, so if you still use these packages and see any issues please let me know and I will try to address them.

Next Madrid MySQL Users Group meeting to take place on 16th January 2014

November 27th, 2013

Yesterday we had our third Madrid MySQL users group meeting. That was quite interesting.  Thanks go to Juan for his presentation.

We plan the next meeting on January 16th after the New Year is out of the way. If you are interested in MySQL and happen to be in Madrid please consider coming to see us.

More information about the next meeting can be found on the group’s web page. Note: The meeting will be in Spanish. I look forward to seeing you.

MySQL RPMS and the new yum repository

November 7th, 2013

I was really pleased to see Oracle’s announcement, MySQL yum repositories, that they have now produced a yum repository from which the MySQL RPMs they provide can be downloaded. This makes keeping up to date much easier. Many companies set up internal yum repositories with the software they need, as updating servers is then much easier and can be done with a simple command. For many people at home that means you set this up once and don’t need to check for updates and do manual downloads, but can do a quick yum update xxxx and you get the latest version. Great! This new yum repository only covers RHEL6 and does not include RHEL5, which is not yet end of life and is still used by me and probably quite a lot of other people. I filed bug#70773 to ask for RHEL5 support to be considered, especially as it was being provided already, and asked for them to also consider including 5.7, but have been told that RHEL5 support will not be added. As you will see later, perhaps it is now clear why.

I was somewhat surprised then to read a later post Updating MySQL using official repositories which showed that actually the rpms in this new repository are completely different to the previous ones (MySQL community rpms) that we have been downloading from http://dev.mysql.com/downloads/mysql/#downloads. The package names differ being MySQL-server…rpm, MySQL-client…rpm on the latest page whilst the package names in the new yum repository are named mysql-community-client…rpm, mysql-community-server…rpm, etc.  It looks like these packages are incompatible, thus for those of us already using the first set of RPMs the new yum repository is rather useless, and switching from one set of RPMs to another is quite a nuisance if you have a number of servers.

To avoid confusion in any later comments in this article I will refer to the ORIGINAL MySQL packages as the ones provided by Oracle previously and the NEW MySQL packages as these new ones provided in the new yum repo.

No mention of these differences was made by Oracle in their announcements, so obviously no reasons were given for why they might have chosen to name packages differently and not simply provided a yum repository containing the existing packages they have already built. That is what I had assumed they’d done and would have seemed to have been the most logical step to take.

Independently of the reasons that Oracle has decided to build two sets of RPMs, one thing is clear: they now have to maintain 2 different sets of packages for the same distribution (RHEL6) in parallel. It also brings up the question of what will happen with MySQL 5.7, which is out there already being offered in the ORIGINAL packaging format, and whether new 5.7 packages will be provided in ORIGINAL or NEW versions or both…. I have been trying out the current DEV version of 5.7 and indeed I know that 5.7 can change any way Oracle chooses until they declare the version GA, so fair enough. The only thing it does is raise doubt about how to prepare for that moment, as MySQL 5.7 has many interesting features and some of us may want to use those as soon as possible.

Anyway I digress. I thought that I would see what differences there may be between the src rpms of the ORIGINAL and NEW packages. As the src rpms are available I was able to download these packages to see. One would expect them to be pretty much the same, as after all the software is the same. It is also true that the NEW packages must provide support for Fedora releases so some changes would have been necessary to make that work correctly. Other than that I would not expect many differences.

I downloaded the src rpms from both locations, the original sets of MySQL community rpms has a package named MySQL-5.6.14-1.el6.src.rpm, and the new yum repo version is called mysql-community-5.6.14-3.el6.src.rpm. There’s a minor release difference here but that difference should be easy to see in the spec file.

I installed the src rpm which basically unpacks it ready for building. Given I have set up my rpm directory structure to install the src packages into a directory named after the package, it is quite easy to then compare the resulting files contained in the source rpm. The source rpm normally contains the spec file which defines how the package should be built, the original source tarball files and any patches that may need applying to the original source files to build the final binaries.

The results can be seen below with the first listing being from the ORIGINAL src rpm and the second from the NEW src rpm:

As you see there are several differences just in the number of files that are being provided.  It is nice to see that the NEW package actually contains an older version of mysql (5.1.70) which I’m assuming is used to build the older “shared-compat” libraries. That was missing from the ORIGINAL package and actually meant that it is not completely possible to build the ORIGINAL package binaries only from the src.rpm.

Some of the files in the NEW package are clearly for Fedora’s new systemd, which is the replacement for the traditional init scripts.

I also diff’d the spec files, and there you see several differences. One surprising one is there’s a difference in the licensing: The ORIGINAL rpms have a GPL license specified, whilst the NEW rpms have a GPLv2 license. Oracle never changes things without a reason and I know there have been many discussions on the topic of GPLv2 vs GPLv3 so I wonder why this change exists.

There are quite a lot of other differences in the 2 spec files. Some changes seem insignificant but others may not be. I have not had the time to investigate further.

Many of you may remember that we have had bugs related to Oracle releasing multiple tarballs of the same name but having different checksums. Unlike some other package managers, RPM is not good at noticing or recording these differences so they can go unnoticed. I actually filed a feature request to RH to see if this may get fixed. That is their bug#995822.  If you think this may be useful please tell RedHat.

That feature request refers to bug#69512 where this issue came up and I noticed it again in bug#69987 in September. Oracle said they would ensure they would not distribute source tar balls with the same name and different contents to avoid the confusion that can result from this. It seems they have already forgotten as I see:

So yes, it seems they have done this again. I know that they build from a VCS but if they tag something as 5.6.14 it really helps to have a single tar ball that refers to that tagged version. I haven’t bothered to do a diff between the 2 tarballs but can imagine the new tarball is somewhat more up to date than the last one. However, that is really no excuse. So time to file yet another bug report which I have done with bug#70847.
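Since RPM does not flag this for you, a small check of your own can. The following is a minimal Python sketch, not part of any existing tool: the function names are hypothetical, and it assumes you have collected a (file name, digest) pair for each tarball from each place you downloaded it.

```python
import hashlib
from collections import defaultdict


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so large tarballs need not fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def conflicting_names(checksums):
    """Return the file names that appear with more than one digest.

    checksums: iterable of (file_name, hex_digest) pairs, one pair per
               copy of a file (for example one per src rpm it came from).
    """
    by_name = defaultdict(set)
    for name, digest in checksums:
        by_name[name].add(digest)
    return sorted(name for name, digests in by_name.items() if len(digests) > 1)
```

Feeding it the digests of the tarballs unpacked from both src rpms would list any name that covers two different sets of contents.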

In the end it seems that Oracle is trying to do the right thing and make life easier for us. That is welcome. However, they seem to trip up making that effort. Things need not be this complicated.

2013/11/07 update

I had a quick check of the differences in the 2 tarballs which are hugely different in size:

Despite what looks like a large change set it seems most of the changes are in the documentation (mysql.info and man files) and these seem to be due to the fact that the source files have been regenerated between the tar balls, most of the changes are thus spacing or generated timestamp differences. Also mysql-5.1.70.tar.gz is included in the new mysql-5.6.14.tar ball so really there’s no need for it to be included separately in the mysql-community 5.6.14 src.rpm. That is likely to be an oversight and will no doubt get corrected.

My conclusion is that the mysql-community version of the mysql-5.6.14.tar.gz tar ball is simply more up to date than the one provided in the MySQL-5.6.14 src rpm. So perhaps much ado about nothing. That said, a few questions remain unanswered so it will be interesting to see how things develop.

Second Madrid MySQL Users Group taking place tomorrow Thursday, 12th September

September 11th, 2013

If you happen to have some free time tomorrow and are in Madrid please come along to the second Madrid MySQL Users Group.

Details can be found here. The meeting will be in Spanish. I look forward to seeing you.

Google Chrome and CentOS 6

August 27th, 2013

I have been meaning to sit down and write about this for a while now but with the summer upon us have not had time.

I use CentOS 6 on my home PC for many reasons, but mainly because I have been using RedHat versions of Linux for quite a long time and am comfortable with it.  CentOS, like its upstream parent RHEL, is good because it is well tested and stable and it is not absolutely necessary to perform a major version update every few months like its cousin Fedora. I do not have time for that and most security issues will be fixed with normal updates so this works well.

The downside of course is that RedHat can not afford to run bleeding edge versions of software which may be rather unstable and that can be a nuisance, if you need or want to use newer more recent versions.

An example of that, from my DBA perspective, is the version of MySQL which is shipped with RHEL 6.4: MySQL 5.1.66 is very old and I would not really want to run that on a production machine taking anything but toy load.  However, in this case, others provide drop-in replacements: Oracle with MySQL 5.6 (current GA version) and both MariaDB and Percona versions are other good alternatives.

In the browser world I moved over some time ago to use Chrome. It seems to work well on my Mac and at the time I moved over it avoided me having to restart Firefox frequently as it was eating all my memory.  There have been versions of Chrome available for CentOS 6 until rather recently, provided by Google and that was really good. It made life easy and I could switch machine with little trouble and use the same browser.

There are versions of Chrome for other Linux versions: Debian, SuSE and Fedora, but I want to keep using CentOS 6 for the reasons outlined above. Judging from other posts I have seen others feel the same.  My understanding is that CentOS 6 support was dropped as newer libraries on which it depended were not available on CentOS 6: Requires: libstdc++.so.6(GLIBCXX_3.4.15)(64bit) says rpm when you try to install the latest version.

Upgrading the OS versions of these base libraries is just troublesome as it may break several things, and while people sometimes do this to try and work around similar issues I do not think that is the way forward. Something will eventually break. Yet if these libraries are needed, why can they not be built and located, like google-chrome itself, in a different location to the system default, with google-chrome then linked using RPATH settings to the expected directory? I have done some stuff like this before in a previous job and unless I am mis-remembering the procedures this should work fine. Rebuilding the required libstdc++ in a different location to the default, such as /opt/google/lib64, and naming it something like google-libstdc++, and then adjusting the google-chrome dependencies to require this package and not the system libstdc++ is not really that hard. Doing so would fix the issue for google-chrome, and potentially make the library usable for others who might also want this newer version on CentOS. Given Google has already set up a yum repo for CentOS 6, the automatic install of these new dependencies would just work out of the box.

Others suggest moving over to Chromium, but I see it is not the same as Chrome and has given me a few issues since switching from Chrome. I would like to use the same browser on my different desktops and not have to worry.

It would also help if RedHat would recognise that some people will need to use newer versions of libraries than those that come with the OS, and if they do not want to make the effort to support this themselves, it would help if they would perhaps make recommendations on how this can be done by others, perhaps along the lines outlined above. In the end for them this avoids people being frustrated that their “stable OS” is out-of-date, and also avoids issues with people hacking things around themselves, or resorting to building from source, when the whole point of a package based system is that you should not need to do that and the packaged builds tend to be more consistent and complete.

So here is to keeping my fingers crossed that perhaps we will see Chrome on CentOS 6 return in the future and make a few people happier.

What do you think?

My MySQL bugs and feature requests

June 25th, 2013

My MySQL bugs is a list I recently created and intend to keep up to date with issues I have seen.

First Madrid MySQL Users Group scheduled for the 4th July

June 19th, 2013

As mentioned here and after talking to a few people we have created a meetup page, http://www.meetup.com/Madrid-MySQL-users-group/ and proposed the first meeting on Thursday, 4th July. If you are interested and in the area come along and say hello. It should be interesting to see a few others who work in the field. If you can, please let us know you are coming so we have an idea of how much interest there is.

On operating system upgrades and a packager’s nightmare

June 18th, 2013

A fairy tale

Once upon a time I did an operating system upgrade, a minor one that should do no harm, but just get me up to date by fixing any bugs in the version I had been using. It seemed like a good idea.

All seemed to be fine. I use a package provided by an external vendor and not the one produced by the operating system provider as this vendor provides a newer version of the package and I need that. The vendor has to make his package fit in the os environment his package is built for and normally does a pretty good job.

I use automation to build my systems and when I built a new one some issues appeared. Related to the new version of the OS the provider had enhanced one of his packages and the installation pulled in new dependencies. The install of the external package I use then broke as it conflicted with the new dependency provided by the OS.  While a workaround is possible: uninstall this, install that, … the nice world the package manager provides you to make these problems a non-issue was suddenly broken. Horrible.

To make matters worse the external vendor can not easily fix this. If he builds a package which obsoletes this new dependency then in earlier versions of the OS he changes stuff that is not needed. If he does not do this the breakage remains in this latest version. Of course in the real world many people use a mixture of OS versions and expect a single package to do the right thing at least for the same major version of an operating system. An external vendor’s packager’s life is not a happy one. It’s hard to provide a package that integrates seamlessly without it breaking something.

Back to real life

The story above is a real story relating to RHEL 5.9 versus the previous RHEL 5.8, or CentOS in my case and MySQL 5.5 and 5.6 rpms provided by Oracle. RedHat have enhanced the version of postfix they provide (a core component of the servers we use) and added MySQL library dependencies which were not there in RHEL 5.8. This pulls in the ‘mysql’ package as a dependency, and so installing the MySQL-server, MySQL-client and MySQL-shared-compat libraries I had been using breaks as these do not currently have an rpm “Obsoletes: mysql” entry needed to make the package install go nice and smoothly.

Real life means I do not run the latest version of MySQL 5.5 but a version which works on my systems and has been tested with my software, so if I use that on CentOS 5.9 now it will break. Oracle may not mind having to update their latest MySQL 5.5 rpm (for CentOS 5) but are unlikely to be interested in updating all their previous MySQL 5.5 or 5.6 rpms for this version of CentOS just in case I might be using one of their older versions. I also am not interested in upgrading all these MySQL 5.5 servers just to get a newer version which may have issues, so probably I will end up doing a workaround (“remove this package”, “add that one”) in an ad hoc way to avoid this problem. It should not be necessary.

It has always been stated when building software: do not make functional changes in a minor version upgrade as you will break something somewhere and upset someone. I understand that providing MySQL support to postfix (and perhaps other packages) seems like a good idea but this breaks this “rule”. I also think that having to provide a package within an existing OS framework, and try to avoid breaking any native OS packages is hard, but still think as I have commented in previous posts that coming up with a way of allowing external vendors a way to do this which does not interfere with “native OS” packages would be good. Recommend that these packages get named a certain way, do not get located in “native OS” locations, and do not conflict with “native OS” libraries.
If people want to use these packages show them how to link against them and add dependencies in a way which may favour these vendors’ packages if that’s a deliberate intention. Provide these rules and life will be easier for all of us. One world where this is done is the *BSD ports systems, which are intentionally independent of the “native OS” and located separately and people can use them if needed, or not at their convenience. In my opinion something like that in the rpm world would be nice indeed and external rpm repos would not have to worry so much about adding new functionality and breaking the OS.

I am not fully sure how I am going to work around this issue, but it has been causing problems for a while now and will require some poking. I will try and get in touch with the parties concerned to alert them to this change which has broken something for me.

Am I the only one who has noticed this?

Also likely to be an issue I guess for MariaDB and Percona rpms for RHEL 5/ CentOS 5 as they’ll likely implement the same dependency behaviour.

References:

  • http://blog.wl0.org/2009/08/a-packagers-thoughts-on-rpm8/
  • https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/pdf/5.9_Technical_Notes/Red_Hat_Enterprise_Linux-5-5.9_Technical_Notes-en-US.pdf

Update: 2013-06-19

This is what I saw:

On a server that runs MySQL 5.5.23 (CentOS 5.8) I see:

which actually looks wrong. CentOS 5 does not have a mysql-libs package, but a package named mysql.  CentOS 6 does have such a named package and on early releases of the MySQL 5.5 rpms some packaging issues similar to the one described here did have to be ironed out.

Checking a later package I see things have changed but there still seems to be an issue.

The first two entries need fixing (the first needs no version and the second should not be there) as they cause other issues, but I had not noticed until now the package name mistake in the last line (which mentions a package in CentOS 6, not CentOS 5, and should be just ‘mysql’).

Maybe, given I am not using the latest version of MySQL, I will need to repackage this version myself. A quick change to the rpm spec file and a rebuild should give me the package I need. Given the src rpm is available it is good I can do that. …. but no. Further inspection shows that the MySQL-shared-compat package is not built from the MySQL-X.Y.Z-r.src.rpm but is built independently of it, and I think old library files are “bundled” in manually to make this package. I vaguely remember that maybe the source tar ball has the spec file for this shared-compat package. Let me check as maybe I have to do some deeper magic to roll my own package.

Oracle: please do not do this. Provide enough of the source in the normal tar ball to be able to build the old version libraries, and build _everything_ from that single tar ball.  This would avoid all the magic you are doing now to build this package and others who need to adjust these packages can do the same.

Madrid MySQL Users Group worth creating?

June 6th, 2013

I’m interested in meeting up and sharing experiences about using MySQL and managing MySQL servers with people here locally in Madrid. I had a quick look around and could see no MySQL user groups locally, so it might be nice to create such a group and exchange ideas over a beer, coffee or cola every once in a while. If you’re in Madrid and are interested please let me know. I’ve created a temporary  email address: madrid-mysql-users-2013 AT wl0.org (careful with the domain), which you can contact me on to confirm an interest.  Oh and I’d expect these meet ups to be in Spanish, but that’s not a requirement.

Estoy interesado en reunirme y compartir experiencias sobre el uso de MySQL y administración de servidores de MySQL con la gente aquí en Madrid. He echado un vistazo en Internet y no he visto ningún grupo de usuarios de MySQL a nivel local, por lo que sería bueno crear un grupo e intercambiar ideas mientras tomamos una cerveza, un café o una coca-cola. Si estás en Madrid y estás interesado por favor házmelo saber. He creado una dirección de correo electrónico provisional: madrid-mysql-usuarios-2013 AT wl0.org (cuidado con el dominio), que puedes usar para ponerte en contacto conmigo para confirmar tu interés en asistir.  Estando aquí en España imagino hablaríamos en español.

 Update: 2013-06-19

Progress, slowly but surely. See: http://www.meetup.com/Madrid-MySQL-users-group/