WEB Advent 2009 / Dæmonize Your PHP

Despite the haters, I actually really like PHP. Well-written PHP strikes a solid balance between getting things done and getting things done properly. It also lives somewhere comfortably between development time and performance. It’s — at least in my opinion — the best tool for creating web apps.

Sometimes, though, we’re called on to write bits of code that live outside of our arena of the Web: cron jobs, mail processors, Subversion hooks, or even asynchronous workers.

There are often better tools than PHP for these tasks. For example, if Procmail can solve your problem, you can avoid writing a lot of nasty mail parsing logic, even if you do get a little help. Although PHP can perform some tasks normally delegated to systems engineers, most developers with a bit of Python experience will tell you that Python is a better choice for things like forking and creating network dæmons, and I agree.

In my opinion, there are two major reasons to use PHP outside of the Web — and they’re very good reasons when they’re your main reasons:

  1. Familiarity: Many PHP developers will have dabbled in other technologies to pragmatically scratch normal development itches, but (especially at junior and mid levels) they’re most comfortable with PHP, because they’ve been fully immersed in it for their whole professional careers. They can crank out PHP code faster than they could even remember another language’s syntax for operating on files.
  2. Code re-use: A side effect of full immersion in PHP is that you’ll have plenty of PHP code lying around, ready to be employed in new and unexpected ways, especially if it’s well designed. There’s no need to reinvent the wheel for most non-web tasks, and doing so is often detrimental to maintenance and predictability. You wouldn’t want to implement password hashing code in two separate languages, in two places.

One unintended place that I’ve been using PHP lately is in long-running processes. PHP 5.3’s improved garbage collection helps in certain situations, but what I really needed was something that would not only keep my processes running, but would also keep track of logging, concurrency, and crash recovery. When an Apache process crashes, or reaches its maximum number of requests, another one spins up in its place with little negative side effect to users. I wanted something like this, but I certainly didn’t want to write it.

Luckily, I’m not the first person who envisioned such a thing. After I toyed with the ambiguously named Daemon (which I wasn’t able to compile and execute properly on OS X) for a work project, my friend Andrei Zmievski inadvertently pointed me in the direction of Supervisord, which was created in part by Mike Naberezny, whose name you might recognize from his past gig as the primary maintainer of Zend Framework.

Supervisord, through a very simple configuration system, allows me to create simple long-running processes without having to worry about the support infrastructure. It handles the things I mentioned above with ease, and keeps my scripts running without worry.

One such script is a fun little service I’ve attached to Twitter called @beerscore. This bot allows me to DM beer names via Twitter, and within a few seconds, I’ll receive a reply with the percentile rating for the beer I’ve requested. This is particularly useful when I’m traveling and don’t necessarily have web access, but I’ve almost always got SMS coverage. (If you’d like to play with it, you must follow @beerscore in order to send a direct message.)

The code behind @beerscore runs constantly to follow-back users and to serve scores. If it crashes for some reason, a new copy is spawned. Its output is logged, systematically. It automatically launches when my server boots, and I don’t have to worry about remembering which screen it’s running in. Supervisord handles all of this, so I don’t have to.

I won’t go into the process of installing Supervisord, because that would likely make for a boring article. I had to install it manually on my Ubuntu server, so that was the only tricky part, but once it’s running (and set up in /etc/init.d with an appropriate script), it’s very easy to use.

I also won’t get into all of Supervisord’s configuration details, which might help cover some of our missing Advent articles, but would certainly turn off the majority of readers due to boredom.

My requirements for @beerscore — as a long-running process, or “dæmon” — were simple: Keep it running, pass in some environment variables, and keep a log that I can review in case something goes wrong. Here’s a copy of my configuration file, beerscore.ini:

[program:beerscore]
command=/usr/local/bin/php beerscore_bot.php
numprocs=1
directory=/home/sean/findpint/twitterbot
stdout_logfile=/home/sean/findpint/twitterbot/supervisord.log
environment=BEERSCORE_USER=beerscore,BEERSCORE_PASS=[redacted]
autostart=true
autorestart=true
user=beerscore

Hopefully, this is simple enough for you to grok at first glance, but if not, the manual should cover everything you need to know. The command is essentially what I would execute on the command line. I’m running a single process (numprocs); it’s launched automatically when Supervisord starts (autostart), and if it dies for whatever reason, it’s automatically respawned (autorestart). I’ve specified a directory and some environment variables, and it’s even running as its own user. Anything that @beerscore outputs (via a simple echo in PHP) is logged to the stdout_logfile.
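To make the script’s side of that arrangement concrete, here’s a rough sketch of what a worker run under this configuration might look like. This is not the real @beerscore code: the function names, the job limit, and the placeholder “work” are all made up for illustration. The only pieces taken from the configuration above are the two environment variable names and the fact that echoed output lands in the stdout_logfile.

```php
<?php
// Hypothetical worker skeleton. log_line(), run_worker(), the job limit, and
// the placeholder work are illustrative; they are not from the real bot.

/**
 * Format one log line. Anything echoed to stdout is captured by
 * Supervisord and written to stdout_logfile.
 */
function log_line($message, $timestamp)
{
    return '[' . gmdate('Y-m-d H:i:s', $timestamp) . '] ' . $message . "\n";
}

/**
 * Run at most $maxJobs iterations, then return so the process can exit
 * and be respawned with a clean slate (autorestart=true).
 */
function run_worker($user, $maxJobs, $sleepSeconds)
{
    for ($done = 0; $done < $maxJobs; $done++) {
        // Placeholder for real work, e.g. polling for direct messages.
        echo log_line("{$user}: poll #" . ($done + 1), time());
        sleep($sleepSeconds);
    }
    return $done;
}

// Start only when the credentials from the environment= line are present.
if (getenv('BEERSCORE_USER') !== false && getenv('BEERSCORE_PASS') !== false) {
    run_worker(getenv('BEERSCORE_USER'), 1000, 30);
} else {
    error_log('BEERSCORE_USER and BEERSCORE_PASS must be set');
}
```

Capping the loop and letting the process exit is a design choice, not a requirement: it borrows the Apache maximum-requests idea, and leans on autorestart to bring up a fresh copy instead of trusting one PHP process to run forever.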

Supervisord handles the hard parts and lets me focus on the logic of actually getting top-quality beer from menu to belly. Hopefully, if you’ve got a similar need (handling long-running processes, that is; everyone needs beer), you, too, can employ this tool. If my above description didn’t quite whet your appetite, there’s even a shiny web interface that will let me keep track of my cluster of @beerscore raters when it needs to serve a few million scores per day.
