Friday, January 23, 2009

Automatic Updates: Frying Pan vs Fire

Ah, yet another IT-related disaster. They make such great blog posts! This time, though, unlike my last one, which was a clear cautionary tale about backups worthy of the Brothers Grimm, the story of the poor boffins at Sheffield Teaching Hospitals Trust has no such simple answer.

Search through any kind of security-related checklist, column, book, blog, or chewing gum wrapper, and one of the constants will be to make sure that all of your security patches are applied. After all, if a hole is well known enough to have a patch, you can be pretty sure the bad guys are ready to exploit it. The simplest, most effective, and most efficient way to do this is to simply enable whatever automated mechanism your OS has, whether it's up2date on Red Hat, updatesd on Fedora, or Automatic Updates on Windows.
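
For what it's worth, it's also cheap to double-check that a box really is staying patched, rather than just trusting that the mechanism is on. Here's a rough Python sketch for a RHEL/Fedora-style machine; it leans on the documented exit codes of "yum check-update" (100 when updates are pending, 0 when current, 1 on error). It's a monitoring aid, not a substitute for the automatic mechanism itself.

    # Rough sketch: flag a host that has outstanding package updates.
    # Relies on "yum check-update" exit codes: 100 = updates pending,
    # 0 = nothing to do, anything else = error.
    import subprocess

    def updates_pending():
        result = subprocess.run(["yum", "-q", "check-update"],
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if result.returncode == 100:
            return True
        if result.returncode == 0:
            return False
        raise RuntimeError("yum check-update failed: " + result.stderr.decode())

    if __name__ == "__main__":
        if updates_pending():
            print("WARNING: patches are available but not yet applied")
        else:
            print("OK: no outstanding updates")
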

In the case of Sheffield, they opted to disable Automatic Updates, and were promptly rewarded with a hospital-wide virus outbreak. While the hospital was at least wise enough to design their workflows such that they were able to maintain an acceptable level of patient care, at a minimum they're throwing money out the window on cleanup efforts like virus removal, plus secondary effects such as having to reschedule non-critical procedures.

At first glance - and in just about any other such tale - the moral would be a simple "Leave Automatic Updates on!" But there's a catch. Why were Automatic Updates disabled, you ask? As a matter of fact, until just a few days prior to the outbreak, they were not only enabled, but governed by a domain policy that ensured they stayed enabled, verified that patches were installed after an internal testing period, and forced a reboot to make the patches take effect, rather than leaving the machine running with the old, vulnerable code still in memory.
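
If you're curious what that kind of policy looks like on the ground, the Group Policy settings for Automatic Updates land as registry values under HKLM\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU. Here's a rough Python sketch that reads them back on a domain-joined machine; the value names (NoAutoUpdate, AUOptions, NoAutoRebootWithLoggedOnUsers) are the documented WSUS/AU policy values, and the interpretation in the output follows Microsoft's documentation for them.

    # Rough sketch: read back the effective Automatic Updates policy.
    # Note: OpenKey raises FileNotFoundError if no AU policy is applied at all.
    import winreg

    AU_KEY = r"SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU"

    def read_value(key, name):
        try:
            value, _ = winreg.QueryValueEx(key, name)
            return value
        except FileNotFoundError:
            return None   # value not set by policy

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, AU_KEY) as key:
        no_auto_update = read_value(key, "NoAutoUpdate")
        au_options = read_value(key, "AUOptions")
        no_forced_reboot = read_value(key, "NoAutoRebootWithLoggedOnUsers")

    print("Automatic Updates disabled by policy:", no_auto_update == 1)
    print("AUOptions (4 = auto download and schedule the install):", au_options)
    print("Forced reboot suppressed while users are logged on:", no_forced_reboot == 1)
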

In exchange for their diligence, Sheffield ended up with a PC deciding to reboot in the middle of a surgery. Can you just imagine being the poor front-line helpdesk schmuck who has to explain to a surgeon why the computer decided to reboot all of a sudden? Trying to tell him or her, with a straight face, that it's really for the best?

Security patches are a critical part of ensuring the security of any computer system. Not applying them entails risk; however, given that applying a patch will by definition change behavior somehow, applying them carries its own risk. For far too many computers (mostly, but by no means limited to, Windows machines), sprinting along on the patch treadmill is the only line of defense against other machines on the same network. The OS itself is brittle, with nearly any intrusion easily leveraged into total control of the entire machine. Progress is being made, such as UAC on Windows or SELinux on Linux, but that doesn't help the millions of legacy machines already out there.
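
As a small aside, it's worth knowing whether those newer defenses are actually switched on, or merely present. Here's a trivial Python sketch that reports whether SELinux is enforcing; the paths are the standard kernel interfaces (/sys/fs/selinux on current kernels, /selinux on older ones).

    # Trivial sketch: is SELinux actually enforcing, or just installed?
    import os

    def selinux_mode():
        for path in ("/sys/fs/selinux/enforce", "/selinux/enforce"):
            if os.path.exists(path):
                with open(path) as f:
                    return "enforcing" if f.read().strip() == "1" else "permissive"
        return "disabled or not present"

    if __name__ == "__main__":
        print("SELinux:", selinux_mode())
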

Patching these days is a nasty catch-22. With every patch release, you have to guess which will be worse: the fallout from applying the patch, or the fallout from not applying it. Admins with strict requirements for both availability and security are stuck walking a narrow path, without any assurance that such a path even exists.

Friday, January 2, 2009

Journalspace Gets Creamed

If you can't be a good example, then you'll just have to serve as a horrible warning. — Catherine Aird

By now, I'm sure just about everyone has heard about the disaster that befell the poor SOBs at journalspace.com. The short story is that the server hosting all of the data for the blog site got hosed, and every bit of that data was lost. Some of the high points:

  • The data were stored on a RAID1 array - a pair of mirrored drives
  • There were no backups, nor any backup system in place at all
  • The drives did not fail, but were both completely overwritten on every block
  • No conclusive root cause was found, but a recently departed sysadmin had already been caught doing "a slash-and-burn" on other systems

So in the end, it looks extremely likely that an incompetent sysadmin set the system up with no meaningful backups, and then graduated to malicious sysadmin by performing a thorough wipe of the only copy of the system data as he was shown out the door. What a wonderful cornucopia of lessons can be gleaned from this one example! This is the kind of thing you expect to see as a hypothetical scenario in security textbooks, not on the front page of Slashdot.

So here's a quick rundown of the lessons learned from our hapless friends.

Backups, backups, backups.
The lack of external backups is what catapulted this from an outage and a headache for the remaining sysadmins into practically a worst-case scenario. In short, mirroring is not the same as backing up.
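
To make the distinction concrete, here's a rough Python sketch of what an actual backup looks like: a dated copy on a separate machine, not a second disk that faithfully replays every destructive write. The host and paths are placeholders, and a pull-based setup (where the backup host reaches out and the primary's credentials can't touch old snapshots) is safer than the push shown here; this is just the general shape.

    # Sketch: push a dated snapshot off-machine. --link-dest hardlinks
    # unchanged files against yesterday's snapshot so daily copies stay cheap.
    import datetime
    import subprocess

    SOURCE = "/var/lib/blogdata/"                  # placeholder data directory
    DEST = "backup.example.com:/backups/blogdata"  # placeholder backup host

    def snapshot():
        today = datetime.date.today()
        yesterday = today - datetime.timedelta(days=1)
        subprocess.check_call([
            "rsync", "-a", "--delete",
            "--link-dest=../" + yesterday.isoformat(),
            SOURCE,
            "%s/%s/" % (DEST, today.isoformat()),
        ])   # raises CalledProcessError if rsync fails

    if __name__ == "__main__":
        snapshot()
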
Trust, but verify.
Just because you implicitly trust your sysadmins (otherwise they can't do their jobs) doesn't mean you shouldn't keep an eye on them. Use sudo to log privileged commands, and monitor and manage configurations with tools like RANCID, Puppet, or Bcfg2.
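
For a flavor of what "keeping an eye on them" can look like, here's a rough Python sketch in the spirit of RANCID-style config watching: it assumes /etc is already a git working tree (a tool like etckeeper sets that up) and reports anything that has changed on disk without being reviewed and committed.

    # Sketch: flag unreviewed configuration drift.
    import subprocess

    CONFIG_DIR = "/etc"   # assumption: this directory is a git repository

    def unreviewed_changes():
        """Return 'git status --porcelain' output for uncommitted changes."""
        result = subprocess.run(
            ["git", "-C", CONFIG_DIR, "status", "--porcelain"],
            stdout=subprocess.PIPE, check=True,
        )
        return result.stdout.decode()

    if __name__ == "__main__":
        drift = unreviewed_changes()
        if drift:
            print("Unreviewed configuration drift:\n" + drift)
        else:
            print("No unreviewed changes in " + CONFIG_DIR)
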
Watch the watchers.
Along the same lines, don't let one person exclusively handle any important project. One bad apple working in isolation will have a much, much easier time planting logic bombs than one who has one or two others working side by side.
Don't give them a chance to pull the trigger.
Going to fire a sysadmin? Any hint of a possibility of a chance it might get ugly? Be prepared to make sure that any and all rights that admin has are completely gone by the time they know they're getting fired. And please note that most sysadmins will take sudden revocation of their rights as a hint they're getting fired, so the chat with HR should probably happen simultaneously with at least two other trusted admins pulling rights and locking accounts.
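
For the local-account side of that lockdown, something like the following Python sketch is the general idea. The username is a placeholder, and in an Active Directory shop the real action is disabling the domain account; this only covers the local Unix side.

    # Sketch: yank a departing admin's local access in one pass.
    import os
    import subprocess

    USER = "departing_admin"   # placeholder

    def revoke_local_access(user):
        subprocess.check_call(["usermod", "-L", user])      # lock the password
        subprocess.check_call(["chage", "-E", "0", user])   # expire the account outright
        keys = os.path.expanduser("~%s/.ssh/authorized_keys" % user)
        if os.path.exists(keys):
            os.rename(keys, keys + ".revoked")              # pull SSH key access
        # Kick any live sessions; pkill exits 1 if there was nothing to kill,
        # so don't treat that as an error.
        subprocess.call(["pkill", "-KILL", "-u", user])

    if __name__ == "__main__":
        revoke_local_access(USER)
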
Clean up after their messes.
Dislike a sysadmin enough to get rid of them? Then that same dislike and mistrust should extend to all of the work they've done for you. As soon as they're out the door, it's time to audit what they did. Make sure the work you didn't know they did is up to standards, and make sure to look for backdoors and time bombs.
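
A first pass at that audit can be as mundane as the Python sketch below: dump the departed admin's crontab, list their authorized_keys, and sweep for setuid binaries, all favorite hiding spots for backdoors and time bombs. The username is a placeholder, and a real audit goes considerably further than this.

    # Sketch: first-pass audit of what a departed admin left behind.
    import os
    import stat
    import subprocess

    USER = "departing_admin"   # placeholder

    def leftover_crontab(user):
        # crontab exits non-zero if the user has no crontab at all.
        result = subprocess.run(["crontab", "-l", "-u", user],
                                stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
        return result.stdout.decode() if result.returncode == 0 else "(none)"

    def authorized_keys(user):
        path = os.path.expanduser("~%s/.ssh/authorized_keys" % user)
        if os.path.exists(path):
            with open(path) as f:
                return f.read()
        return "(none)"

    def setuid_binaries(root="/usr"):
        # Walk the tree and report regular files with the setuid bit set.
        hits = []
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    mode = os.lstat(path).st_mode
                except OSError:
                    continue
                if stat.S_ISREG(mode) and (mode & stat.S_ISUID):
                    hits.append(path)
        return hits

    if __name__ == "__main__":
        print("Leftover crontab for %s:\n%s" % (USER, leftover_crontab(USER)))
        print("authorized_keys for %s:\n%s" % (USER, authorized_keys(USER)))
        print("Setuid binaries under /usr:")
        for path in setuid_binaries():
            print("  " + path)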

It's too late for those poor souls at journalspace, but hopefully they'll at least serve to inspire others to fix something.