Checking /proc/pid/numa_maps can be dangerous for mysql client connections

I’ve blogged before about using numactl to start mysqld so that memory usage is spread more evenly across NUMA nodes on larger memory servers. The approach comes from an article by Jeremy Cole and works well. I recently had some issues with mysqld seeming to run out of memory on a box which appeared to have plenty free, so it seemed like a good idea to adapt a collector script I run every minute to include the numa_maps output, so that I could see whether the failed memory allocations were related to this. So far so good.

Many of the clients that connect to the database servers I manage have a very short connect timeout, typically 2 seconds. In a normal network and under normal conditions this is more than enough to allow for successful operation and if the connect does not work in that time it’s an indication of some underlying issue, whether that be load on the mysqld server, or something else.

The change I implemented to collect the numa_maps information wasn’t expected to cause any issues: the script already cats quite a lot of information out of /proc/somewhere to collect configuration or status information.

However, after implementing this change I suddenly noticed significantly more mysql client connection errors than before. It wasn’t immediately apparent what the cause was, until I noticed that these errors only occurred a couple of seconds after the minute changed, at hh:mm:02, hh:mm:03 etc. Then it dawned on me that this was indeed due to the collection script reading /proc/<mysqld_pid>/numa_maps. Disabling this functionality again removed all issues. This was on servers with 192 GB of RAM.
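One way to check whether a given /proc read is risky before putting it in a frequent collector is to time it. The following is a minimal Python sketch, not the collector script described here: the function names, the 2 second default and the safety factor are all illustrative assumptions, and the time the read takes is only a rough proxy for how long the target process may be held up.

```python
import time

def timed_read(path):
    """Read a file fully and return (elapsed_seconds, text)."""
    start = time.monotonic()
    with open(path, "r") as handle:
        text = handle.read()
    return time.monotonic() - start, text

def exceeds_budget(elapsed_seconds, connect_timeout=2.0, safety_factor=0.25):
    """Return True if the read took more than a fraction of the connect timeout.

    connect_timeout and safety_factor are hypothetical defaults; pick values
    that match the shortest client timeout in use.
    """
    return elapsed_seconds > connect_timeout * safety_factor
```

Running timed_read once by hand against /proc/<mysqld_pid>/numa_maps, at a quiet time, and passing the elapsed time to exceeds_budget gives a rough idea of whether collecting the file every minute is likely to collide with short connect timeouts.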

The information provided by /proc is useful, and it would be even better if it could be collected in a way which doesn’t block the process that’s being “investigated”. As such I have filed a bug report with Red Hat, though really this is a kernel “bug”, or at least behaviour which, while it may be known, gives a very unsatisfactory result in this particular case.

So whilst most people do not configure such a short mysql connection timeout, if you do have a setting like this please be aware of the consequences of reading /proc/<mysqld_pid>/numa_maps directly, or of running a script that does. This was not discussed by Jeremy and I’d guess he was unaware of this, or did not expect people to run this type of script as frequently as I was. Database memory sizes keep increasing, so something that may not have been noticeable on smaller servers can now become an issue.
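If the file is only read occasionally, the output can be captured once and summarised afterwards rather than re-read. The Python sketch below adds up the per-node page counts (the N0=…, N1=… fields) from numa_maps text that has already been captured. The function name and the sample lines are made up for illustration, and the totals are page counts, so mappings using different page sizes are not directly comparable.

```python
import re
from collections import Counter

# Matches per-node page counts such as "N0=1024" in a numa_maps line.
NODE_TOKEN = re.compile(r"^N(\d+)=(\d+)$")

def pages_per_node(numa_maps_text):
    """Sum the N<node>=<pages> fields over all mappings, keyed by node number."""
    totals = Counter()
    for line in numa_maps_text.splitlines():
        for token in line.split():
            match = NODE_TOKEN.match(token)
            if match:
                totals[int(match.group(1))] += int(match.group(2))
    return dict(totals)

# Illustrative sample only, not output from a real server.
SAMPLE = """\
00400000 default file=/usr/sbin/mysqld mapped=1000 N0=600 N1=400
7f0000000000 interleave:0-1 anon=2048 dirty=2048 N0=1024 N1=1024
"""

if __name__ == "__main__":
    print(pages_per_node(SAMPLE))
```

Note that this does nothing to avoid the cost of the read itself; it only keeps the number of reads down by working from a saved copy.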


4 Responses to “Checking /proc/pid/numa_maps can be dangerous for mysql client connections”

  1. Jeremy Cole says:

    Simon: That’s a good thing to note. You’re right that I didn’t expect anyone to run this so frequently, but I did know it blocks the process. I assume that reading numa_maps blocks all memory allocation to build the map table, which also ends up blocking new connections and causing timeouts as you observed. This may be unavoidable for this data (the numa_maps file), but it could easily be replaced by some simpler counter which summarizes allocations for a process by NUMA node.

    I’ll add a note to my blog post with a warning though!

  2. Simon J Mudd says:

    Well, I’ve configured my DB servers to not overcommit memory (vm.overcommit_ratio = 100, vm.overcommit_memory = 2) and am still investigating why mysqld reports out of memory issues when I seem to have 20GB free. I’m not sure how to see the NUMA memory maps for the whole server on a per process basis so I can see if one of the node’s memory is full which might explain this. Looking just at the info for mysqld is not enough.

  3. Mr House and Mr Heisenberg have Replication Delay…

    Today I received a mail in which a DBA complains about rising replication delay in a particular replication hierarchy. That is bad, because the hierarchy in question is important. As in the ‘If it doesn’t work, people are sleeping under…

  4. MySQL-dump says:

    House and Heisenberg having Replication Delay…

    So I am getting a mail with a complaint about rising replication delays in a certain replication hierarchy. Not good, because said hierarchy is one of the important ones. As in ‘If that breaks, people are sleeping under the bridge’-important. The th…
