Time to get some 128-bit types into MySQL?

I think that getting 128-bit types into MySQL would be good. There are a few use cases for this and right now we have to work around them. That should not be necessary.  While not essential they would make things easier.

The headline is easy to understand, but is this really needed?

First we need to look to see where this might be used. I can think of three different 128-bit types which are missing at the moment:

  • IPv6 addresses
  • uuid values
  • a bigger value than (signed) bigint [64-bit numbers]

IPv6 Addresses

IPv6 addresses are 128-bit numbers, and having a native way to store them would be really helpful. Given this also includes an IPv4 representation then for those people who store IP addresses (client connections and other things) such a native type would be much better than the typical unsigned int or binary(4) which you might be using now. Is this an IPv4 address? Well it might be, but it also might not be.  The same applies to IPv6, and having a real IPv6 type makes this knowledge more explicit.

MySQL already provides support routines for IPv4 (even if the type does not exist) such as INET_ATON(), INET_NTOA() so a similar set of routines would be needed to support this type, converting between their text and numeric representation and also for converting between IPv4 and IPv6.

UUID Values

MySQL itself uses UUID values in 5.6 and above as the server_uuid, but it’s stored or seems to be as a string. Other software (MEM is a good example) also uses UUID values in various places.

Have a look on search engines for MySQL and UUID and you see lots of questions on how to best store these values in MySQL. So there is already a demand for this, and no good answers as far as I can see.

One common concern I have currently when storing such values as binary(16) is that the values are hard to visualise, especially if used as a primary key, and also from the DBA’s point of view who may want to “manually” access or modify data  it is not possible to do something similar to SELECT name FROM servers WHERE uuid = ‘cd2180ae-9b94-11e2-b407-e83935c12500′, as this just does not work. Casting could make this work magically but right now it’s much harder than it should be.  There is not a single UUID format but the basics are the same and if we had a uuid format any supporting routines (which would be needed) would be able to convert as needed.

Signed or unsigned integers

Yes, the (signed or unsigned) bigint type gives us 64-bits and that allows for huge numbers but one size bigger matches the use cases above, so it’s good to be able to convert between them depending on the usage.  That is if we’re going to have IPv6 and UUID type values, it makes sense to allow an integer equivalent representation and sometimes this might be needed when stripping out parts of a uuid, or parts of an IPv6 address.  The name of this type should be something a little better than we’ve seen before so hugeint (unsigned) would not be what I would suggest. Something as simple as int128 (unsigned) would be much easier to understand.

Conversion routines

Each of the three types above need routines to support their “native” usage and probably converting from / to numeric or text representations of the value.  Given the three types have the same size then it may also be useful to convert from one format to another. The actual content would not change, just it’s representation. Included with this would be a BINARY(16) so that people who might have had to use other MySQL times to represent these values have an easy way to convert more explicitly to them and if for any reason a conversion back is needed this is also possible.

ALTER TABLE should be aware of these equivalents too so if I have a table defined with a BINARY(16) I can convert it to an IPv6 address/type as a no-op operation (definition only change), in a similar way as can be done with some other conversions (ENUM being a common type that changes but if you add a new value there’s no need to check the table for existing values as the old definition was a subset of the new one).

No incompatible changes in minor versions please

A change such as this can not reasonably be added as a minor version change as if we would break many things.  Minor versions should really, really only included bug fixes, or performance improvements, and if a new feature really has to be added by default it must be disabled (for compatibility) and enabled with some sort of special option. Given there’s no agreed way to do this and it is likely to cause all sorts of issues, just do not do it.

That means that a feature such as this can only be added in a new version such as MySQL 5.7 or MariaDB 10.1 both of which are DEV versions, and so allowed to change in any way their authors deem reasonable. I have seen no indication of 5.7 including this functionality and given the time that 5.7 has been about I am inclined to think that an extra change such as this is unlikely to make it there. So MySQL 5.8 then? MariaDB 10.1 development has not been ongoing for that long so maybe such a feature might be considered there.

In the end we do need these new features and long lead times to make them available is a considerable source of frustration for those of us who have a number of systems to upgrade.  One thing is a new version going GA, but it’s something else to have all systems upgraded to use that version and thus make it available to developers.

Whatever happens it would be really helpful if the different “MySQL vendors” talk to each other, if they agree that this is a sensible path to take. Having various different interpretations of how these new types should be stored, converted and which associated functions etc are needed would be a user or developer’s nightmare. I understand there is competition, but for something like this it is really important to get it right.  The first implementor of such a feature would potentially have an advantage over the others but I would expect usage of this type of data types to be quite popular so agreeing generally on what to do should not be that hard and avoids the different forks from drifting off further apart, something which I think is bad for everyone concerned.

Conclusion

Some people I have spoken share the opinion that having such a set of 128-bit types would be good. It is something else of course to implement that.  For those looking for new features to develop in MySQL this is one which in theory is not absolutely necessary but which I think would not only be popular but would be used.  In the end MySQL is there to store data, and make it easy to retrieve and it seems clear to me that this type of data is one such usage which while it can be handled differently would really welcome “native” support. I hope that this will happen sometime soon.

Update 2014-07-03

MariaDB seems to have some support on its way for this. Referenced on maria-developers on 1st July, details can be found here:

If a plugin type is available for IPv4 that might be good as well.

This looks like work in progress and there’s no mention of a 128bit (unsigned) int, or how to convert between different values, but this looks like a good start. In fact if it’s possible to make these types available via a plugin interface this does seem to add the possibility of adding new special types even once MariaDB is working, so it makes it easier to expand functionality later.

In terms of routines that probably should be available in MySQL to support some of these types the following stand out:

  • INET_PTON() and INET_NTOP() to supplement the existing INET_ATON() and INET_NTOA() functions.
  • GETADDRINFO() and GETNAMEINFO() to convert between IPv4 or IPv6 addresses and names. existing INET_ATON() and INET_NTOA() functions.
  • Something like  UUID_LONG() to generate a 128-bit numeric equivalent of  UUID(), and functions to convert a text-based uuid into a number and a something to convert back again, STRING_TO_UUID() and UUID_TO_STRING() unless there already exists some standard function name for these tasks.

I think all of these look like useful routines to go with the types above. I’ll add more as I think of them.

Update 2014-07-11

I also see this very old bug referenced in the mysql bug list: http://bugs.mysql.com/bug.php?id=15940

Tags: , , , ,

3 Responses to “Time to get some 128-bit types into MySQL?”

  1. Justin Swanhart says:

    btw, I have a set of UDF in Swanhart-Tools that allow arbitrary precision arithmetic using the GNU bc library.

    You can find it here: https://github.com/greenlion/swanhart-tools

  2. Wouldn’t there be cases as well where DECIMAL and FLOAT types could benefit from 128 bit algorithms (better precision in FLOAT calculations. DECIMAL/precision math support for very small and very large numbers (as used in various branches of physics like astronomy , atomar physics etc.)?

    • Simon J Mudd says:

      Hi Peter,

      Yes, your point is valid. If we’re going to have a few 128-bit types then it might be worth allowing the extra space for other types too.
      * DECIMAL() is always interesting if you need precision, and often used in banking
      * float and double, while ok, in some cases may not be enough. I see there is a quadruple float size, http://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format, so again supporting a type like that would be good, even if it’s something that many people might need. The only issue I see with a quadruple precision floating point number is the name. A quick look suggests it’s called binary128 by IEEE, long double by some and double double by others. The IEEE name might not really be good as it could confuse things with MySQL binary (in bytes) types.

Leave a Reply