MySQL and Linux swappiness

2010.07.29

MySQL’s InnoDB engine is really great. Row-level locking is amazing in tables where there is heavy concurrency. Write buffering is also awesome for cases where a table needs to accept a lot of data. InnoDB’s use of memory to store indexes or sometimes the entire table can also make reads incredibly fast, especially on tables that need to support complex queries where even the best placed indexes do nothing.

However, when tables get large, the innodb_buffer_pool is set to close to amount of memory on the server, Linux has a tendency to remove your data from memory for no good reason. The symptoms are unmistakable: a query that was known to be pretty quick, but hasn’t run in a while, will take long. Too long. Run it again, and it becomes snappy. What’s happening is that when the query initially runs, the necessary data isn’t in memory, so it’s read in from disk, and the query is performed. Once it’s in memory, that second run is quick.

Actually there is a good reason Linux behaves like this:

“My point is that decreasing the tendency of the kernel to swap stuff out is wrong. You really don’t want hundreds of megabytes of BloatyApp’s untouched memory floating about in the machine. Get it out on the disk, use the memory for something useful.”

http://kerneltrap.org/node/3000

This all makes sense, as most systems need to reclaim memory from applications that aren’t doing anything. Except in the case where you have a large dataset in InnoDB that you’d really like to be in memory when you query it. Luckily, there is a tunable that you can change to dictate how aggressive Linux is reclaiming memory from applications. /proc/sys/vm/swappiness stores a number for 0 to 100, where 100 means that Linux will be extremely aggressive in reclaiming memory, and 0 means that memory won’t be reclaimed all that much.

For servers that need to keep datasets in memory all the time, this variable can be extremely helpful. With an InnoDB table / indexes that consume ~80% of memory on the machine, a swappiness value of 30 is sufficient to allow MySQL to keep most of that in memory. Of course, I don’t recommend this for a machine that is not 100% dedicated to a single task. However, on dedicated MySQL machines, tuning this variable can be really helpful.

Categories : MySQL

Out of the Terminal

2010.07.20

Every once and a while I get to leave server-land and get to do some fun projects that involve doing something on the front end. The latest was building an embed script for the Behance Job List. Projects like this, that get me out of the terminal and into a space that requires a bit more interaction between domains, are particularly appealing. As much as I think the Same Origin Policy is reasonable rule for security, I love looking at ways to get around it.

The technique I chose for this was JSONP, or JSON with Padding. I’m a huge fan of JSON as a transport, as I feel it is compact, flexible and stupidly simple to generate and consume. In fact, I’ve sworn to never touch another XML file as long as I live. JSONP is really convenient from a API implementation perspective, because when the request for the data is made (via the script tag), all the client has to do is pass a callback and it can use the data in any way it chooses. The server doesn’t have to be aware of what the callback actually does, although I do recommend checking against a list of pre-approved callbacks, just to make sure.

Like any semi-decent developer, I have dog-fooded my own work, and implemented the Behance Joblist embed code right here.

A little about the Behance Joblist:

Top global companies find and hire talent on Behance, the world’s leading network for creative professionals.