Archive for September, 2012

Checking /proc/pid/numa_maps can be dangerous for mysql client connections

Saturday, September 8th, 2012

I’ve blogged before about the way to use numactl to start up mysqld, and thus to try to better spread the memory usage on larger memory servers. This came from an article by Jeremy Cole and is fine. I recently had some issues with mysqld seeming to run out of memory on a box which appeared to have plenty free, so it seemed like a good idea to adapt a minutely collector script I run to include the numa_maps output so that I could see if the failed memory was related to this. So far so good.

Many of the clients that connect to the database servers I manage have a very short connect timeout, typically 2 seconds. In a normal network and under normal conditions this is more than enough to allow for successful operation and if the connect does not work in that time it’s an indication of some underlying issue, whether that be load on the mysqld server, or something else.

The change I implemented on to collect the numa_maps information wasn’t expected to cause any issues, and typically quite a lot of information is cat’d out of /proc/somewhere to collect configuration or status information.

However, after implementing this change I suddenly noticed more mysql client connection errors than before, in fact a significant change. It wasn’t immediately apparent what the cause was until it was noticed that these errors only occurred a couple of seconds after the change in minute, at hh:mm:02, hh:mm:03 etc. Then it dawned on me that indeed this was due to the change in the collection script looking at /proc/<mysqld_pid>/numa_maps.   Disabling this functionality again removed all issues. This was with servers with 192 GB of RAM.

The information provided by /proc is useful and it would be even better if the information could be collected in a way which doesn’t block the process that’s being “investigated”. As such I have filed a bug report with RedHat, though really this is just a kernel “bug”, or behaviour which while it may be known, in this particular case provides a very unsatisfactory behaviour.

So whilst most people do not configure such a short mysql connection timeout if you do have a setting like this please be aware of the consequences of running numa-maps-summary.pl, or looking at /proc/<mysqld_pid>/numa_maps directly. This was not discussed by Jeremy and I’d guess he was unaware of this, or did not expect people to run this type of script as frequently as I was. Database memory sizes keep increasing so this may not have been noticeable on smaller servers but can now become an issue.