Gearman

2009.07.25

Users have high expectations of web apps: performance, responsiveness and tons of features. Normally, you’re only allowed two of any list of three really cool things, and web apps are no exception. Most settle on some compromise between performance/responsiveness and tons of features. More features usually means less responsiveness, depending on the feature and the scale.

Enter Gearman. Gearman is a queuing system that allows work to be farmed out to other servers. Most importantly, it allows for intense tasks to be queued and performed in the background. This means that when a user performs an action that could potentially take a long time (sending notification emails, updating Full Text indexes, etc), that slow task can be queued to run in the background, and the page can be sent to the user, keeping things snappy.

Gearman is pretty simple to install on Red Hat.

download gearman from server
> wget http://launchpad.net/gearmand/trunk/0.8/+download/gearmand-0.8.tar.gz

unzip and move into the directory
> tar -xvzf gearmand-0.8.tar.gz
> cd gearmand-0.8

Red Hat was missing some dependencies out of the box. The next few steps will vary depending on your *nix distro.

Install the libevent developer library.
> yum install libevent-devel

Install the e2fsprogs developer library
> yum install e2fsprogs-devel

configure and install
> ./configure
> make
> make install
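
Once the server is installed, a quick smoke test shows the queue-and-return idea in miniature. This is only a sketch under assumptions: it relies on the gearman command-line client that ships with gearmand being on the PATH (it may not be present in every release), and on a job server already listening on localhost. The function names queue_wc_job and run_wc_worker are made up for this example.

```shell
# Assumption: the `gearman` CLI from the gearmand package is installed and
# a job server is already running on localhost.
have_gearman() {
    command -v gearman >/dev/null 2>&1
}

# Worker side (hypothetical helper): register a function named "wc" and
# run `wc -l` on the payload of every job that arrives. Blocks until killed.
run_wc_worker() {
    have_gearman || { echo "gearman CLI not found" >&2; return 127; }
    gearman -w -f wc -- wc -l
}

# Client side (hypothetical helper): hand the contents of a file to the
# "wc" function as a background job. With -b the client does not wait for
# the worker to finish, which is the whole point of queuing slow work.
queue_wc_job() {
    have_gearman || { echo "gearman CLI not found" >&2; return 127; }
    gearman -b -f wc < "$1"
}
```

Run run_wc_worker in one terminal and queue_wc_job /etc/passwd in another; the client should come back right away while the worker does the counting.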

/** Net Gearman **/

download php extension from the pecl repo
> wget http://pecl.php.net/get/gearman-0.4.0.tgz

untar and move into the directory
> tar -xvf gearman-0.4.0.tgz
> cd gearman-0.4.0

build the extension
> phpize
> ./configure
> make
> make test
> make install

Add the extension to the php.ini

[gearman]
extension=gearman.so

And you’re all set!

How you integrate it will depend on whether you use the PHP extension and on how well encapsulated your code base is. I highly recommend the pecl extension, as it provides great implementations of the client and worker. Gearman will save you.

Save MySQL

2009.07.16

Runaway queries on MySQL can be a real problem. If a long-running query locks up important tables, other queries that touch those tables will be placed in a queue. Each new query is a new connection to MySQL. Once you hit max_connections, your MySQL connection code will start to fail. Depending on how errors are handled at this stage of the request, this could mean total disaster for a site.

Although there is no way to fix this within the MySQL server itself, a bit of clever scripting run via cron can check for the problem and deal with it. Presenting: save_mysql

/usr/bin/mysql -e 'show full processlist \G;' 2> /dev/null |
grep -A1 -B5 -E "Time: [1-9][0-9][0-9]?" |
grep -E "\bId\:\ |\bState\:\ " |
/usr/bin/perl -n -e 'if( $. % 2 ) { chomp $_;print $_; } else { print $_; }' |
grep -E "\ State\:\ Sending\ data$|\ State\:\ Sorting\ result$" |
awk '{print $2}' |
xargs -iTHREAD -r -n1 /usr/bin/mysqladmin kill THREAD &> /dev/null

/usr/bin/mysql -e 'show full processlist \G;' 2> /dev/null
This line grabs a list of all the currently running queries and commands from the MySQL server, and sends any error output to /dev/null. It produces output like so:

*************************** 1. row ***************************
Id: 842863
User: admin
Host: localhost
db: NULL
Command: Query
Time: 0
State: NULL
Info: show full processlist

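grep -A1 -B5 -E "Time: [1-9][0-9][0-9]?"
The grep here grabs the line directly below and the 5 lines above any Time line that matches, which is exactly the block from Id to State. As written, the trailing ? makes the third digit optional, so the pattern matches any time of 10 seconds or more; drop the ? to require at least 100 seconds. This line can be tweaked to grep for less time. My preference is between 30 seconds and a minute. Note that a bare [3-9][0-9] would skip times from 100 to 299, so keep the three-digit alternative. So instead of
[1-9][0-9][0-9]
you’d have
([3-9][0-9]|[1-9][0-9][0-9]) (30 seconds) OR ([6-9][0-9]|[1-9][0-9][0-9]) (60 seconds)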
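
Threshold patterns like this are easy to get subtly wrong, so it can help to try one against a handful of values before trusting it with a kill command. A minimal sketch, assuming a 30-second cutoff; is_slow is a made-up helper name and the test values are arbitrary.

```shell
# Hypothetical helper: succeeds when the given number of seconds would be
# caught by a 30-second threshold pattern.
# The first alternative covers 30-99; the second covers anything with three
# or more digits (the pattern is unanchored, so 1200 matches on "120").
is_slow() {
    printf 'Time: %s\n' "$1" |
        grep -Eq 'Time: ([3-9][0-9]|[1-9][0-9][0-9])'
}

# Try a few values on either side of the cutoff.
for t in 5 29 30 99 100 250 1200; do
    if is_slow "$t"; then
        echo "$t seconds: would be caught"
    else
        echo "$t seconds: left alone"
    fi
done
```

Only 5 and 29 should come back as left alone.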

grep -E "\bId\:\ |\bState\:\ "
This filters out the other lines from the previous grep and keeps just the MySQL process ID and its State.

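/usr/bin/perl -n -e 'if( $. % 2 ) { chomp $_;print $_; } else { print $_; }'
Quick Perl script to put the Id and State from the step above on the same line: odd-numbered lines (Id) lose their newline, so the State line that follows is appended to them. The single quotes keep the shell from expanding $_ before Perl sees it.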

grep -E "\ State\:\ Sending\ data$|\ State\:\ Sorting\ result$"
This line keeps only the queries in the state ‘Sending data’ or ‘Sorting result’. These are both states where it’s safe to kill the query.

awk '{print $2}'
This line grabs the process ID (the second field) from the output.

xargs -iTHREAD -r -n1 /usr/bin/mysqladmin kill THREAD &> /dev/null
Lastly, xargs passes each ID from above to the mysqladmin kill command, effectively killing the query.
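
Before pointing something like this at a live server, it is worth a dry run that prints the IDs instead of killing them. The sketch below is an assumption-laden stand-in rather than the script itself: the processlist is canned and entirely made up, the threshold is the 100-second form of the pattern (no trailing ?), the Perl join is swapped for an equivalent awk line so only grep and awk are needed, and sample_processlist and ids_to_kill are hypothetical names.

```shell
# Canned stand-in for `mysql -e 'show full processlist \G;'` (made-up rows).
sample_processlist() {
    cat <<'EOF'
*************************** 1. row ***************************
     Id: 101
   User: app
   Host: localhost
     db: site
Command: Query
   Time: 245
  State: Sending data
   Info: SELECT * FROM posts ORDER BY created
*************************** 2. row ***************************
     Id: 102
   User: app
   Host: localhost
     db: site
Command: Query
   Time: 3
  State: Sending data
   Info: SELECT * FROM users WHERE id = 7
*************************** 3. row ***************************
     Id: 103
   User: app
   Host: localhost
     db: site
Command: Query
   Time: 180
  State: Locked
   Info: UPDATE posts SET views = views + 1
*************************** 4. row ***************************
     Id: 104
   User: app
   Host: localhost
     db: site
Command: Query
   Time: 120
  State: Sorting result
   Info: SELECT * FROM comments ORDER BY score
EOF
}

# Same filter chain as the script, minus the final mysqladmin kill.
ids_to_kill() {
    grep -A1 -B5 -E 'Time: [1-9][0-9][0-9]' |           # rows running 100s or more
    grep -E '\b(Id|State): ' |                          # keep only Id and State lines
    awk 'NR % 2 { printf "%s", $0; next } { print }' |  # join each Id with its State
    grep -E ' State: (Sending data|Sorting result)$' |  # states considered safe to kill
    awk '{ print $2 }'                                  # second field is the Id
}

sample_processlist | ids_to_kill   # prints 101 and 104, one per line
```

Row 102 is too young and row 103 is in the Locked state, so neither shows up. Swapping sample_processlist for the real mysql command gives a list of what the cron job would have killed.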

Success!

2009.05.30

When making changes to a large site, it’s really helpful to have tools to measure how those changes affect performance. One of my favorite tools is Cacti. This is a graph of the load average of one of our database servers:

[Graph: Database Load Average]

We done good…

A tool to DRY off

2009.05.19

Every developer worth their bits knows that code repeated is a maintenance problem waiting to happen. However, code written by a group of devs under tight deadlines tends to get pretty ugly pretty quick, with lots of snippets being copy/pasted because ‘they work’. The allure of getting things up and running quickly is a siren call that constantly lures us away from the all-important refactoring and integration that makes code maintainable. But once the dust has settled, and there is a spare moment to re-read and consider what should be changed, the task of refactoring seems too daunting to even bother.

Thankfully, Sebastian Bergmann has created a tool that will find every dirty little Ctrl-V. It’s called the PHP Copy/Paste Detector (phpcpd), and it can be installed using PEAR or built from the source on git.

What’s really interesting is when you play with the number of tokens and number of lines that constitutes a copy-paste. For my purposes, I used a minimum of 5 lines. In quite a few cases, the copy/paste turned out to be declarations, or the same style sheets and scripts being included on different pages. But when it was PHP, it was abundantly clear what needed to be refactored, and how.
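
For playing with the thresholds, a small wrapper saves some typing. This is a sketch, assuming phpcpd is already installed and on the PATH and that it accepts a --min-lines option (check phpcpd --help on your version, which should also list the matching option for tokens). The name scan_duplicates is made up for this example.

```shell
# Hypothetical wrapper around phpcpd.
#   $1: directory to scan (defaults to the current directory)
#   $2: minimum number of identical lines that counts as a copy/paste (default 5)
scan_duplicates() {
    dir=${1:-.}
    min_lines=${2:-5}
    if ! command -v phpcpd >/dev/null 2>&1; then
        echo "phpcpd not found on PATH" >&2
        return 127
    fi
    # Assumption: --min-lines is the option name for the line threshold.
    phpcpd --min-lines "$min_lines" "$dir"
}
```

Something like scan_duplicates . 10 versus scan_duplicates . 5 makes it easy to see how much of the noise is short boilerplate and how much is real duplicated logic.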

