Friday, May 1, 2009

Wow.

I hereby nominate this Schneier post for the Understatement Of The Year Award:

Nuclear war is not a suitable response to a cyberattack.

The very fact that such a statement needed to be made is depressing indeed.

Tuesday, March 24, 2009

So long, and thanks for all the transactions...

Well, it's been a long time coming, but it's finally happening. The breakup will take quite a while, and it may never be a complete break. But in the end, after hundreds of tables and countless rows, I begin the process of replacing MySQL with PostgreSQL as our primary database.

But why, you ask? Not for any of the reasons you're probably thinking of. It's not because of the limitations of any of MySQL's foreign key implementations. It's not because of the cases where MySQL's serial data processing nature conflicts with what set theory says you should be able to do, such as modifying a table that's referenced again in a subquery. It's not because PostgreSQL is substantially closer to fully standards-compliant SQL, and therefore to a lot of the commercial big names such as Oracle. And no, we never had any catastrophic data loss due to MySQL's less-than-ACID corners or a crash.

No, overall MySQL has been quite good to us. We're switching for the simple and inescapable reason that, unlike PostgreSQL, MySQL cannot store IPv6 addresses in a usable form.

Now before you come up with some bright idea, take my word for it - I've already looked at it, thought it through, and utterly rejected it. Once you strip away the slightly funky formatting conventions, an IPv6 address is at its core simply a 128-bit integer. Too bad MySQL only supports integers up to 64 bits, meaning that the IPv4 method (store it in a 32-bit integer) doesn't work. You can stick it in a text string, but then you get formatting normalization issues, and you can't efficiently perform any bit masking operations, which are required for things like determining which subnet a given address belongs to. Cramming it into a decimal type is a little bit closer, but sadly the bitwise operators such as '|' and '&' silently truncate their output to 64 bits, making it - and anything built on it - utterly useless.
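
For the morbidly curious, here's roughly what that dead end looks like in practice - a minimal sketch with a made-up table, where the big decimal literal is just 2001:db8::1 written out as an integer:

    -- Store the address as DECIMAL(39,0), which has enough digits
    -- to hold a full 128-bit value. So far, so good.
    CREATE TABLE addrs (ip DECIMAL(39,0) NOT NULL);
    INSERT INTO addrs VALUES (42540766411282592856903984951653826561);  -- 2001:db8::1

    -- Now try to extract the network half of the address. MySQL coerces
    -- the operands of '>>' and '&' to 64-bit integers first, so the top
    -- half of the address is thrown away before the operation even runs.
    SELECT ip >> 64 FROM addrs;  -- wrong answer, and no error raised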

Quite simply, as comfortable as we are with MySQL, IPv6 addresses just don't fit.

And then, like a ray of sunshine, there's PostgreSQL! It actually has a pair of native data types (inet and cidr) that can not only cleanly store either an IPv4 or IPv6 address and subnet mask or prefix length, but also work with a number of functions for some of the most common operations, such as extracting the network address or calculating the broadcast address. What is currently a flat out impossibility in MySQL is not only possible, but trivial in PostgreSQL.
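
To give you a taste, here's a quick sketch against a throwaway table (the names are mine, not anything from our real schema):

    -- inet stores a host address with an optional prefix length;
    -- cidr stores a network specification. Both take IPv4 and IPv6.
    CREATE TABLE addrs (ip inet NOT NULL);
    INSERT INTO addrs VALUES ('2001:db8::1/64'), ('192.168.10.5/24');

    -- The common operations are built-in functions:
    SELECT ip,
           network(ip),    -- the subnet the address belongs to
           broadcast(ip),  -- the broadcast address
           masklen(ip)     -- the prefix length
    FROM addrs;

    -- "Is this address inside that subnet?" is a single operator:
    SELECT ip FROM addrs WHERE ip << cidr '2001:db8::/32';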

My group is responsible for maintaining the network here. We live, breathe, and die by IP addresses. We register them, track them, and shuffle them around between databases, DNS, DHCP, and ACLs a hundred times a day. Imagine trying to sell the phone company a database that can't store phone numbers, and you'll have a pretty decent feel for the position we're in with IPv6 addresses. IPv6 adoption may be slow now, but it's coming sooner or later, and when it does, it's probably going to hit critical mass and come fast and hard.

Sure, I'm not looking forward to the changeover. I need to worry about removing MySQL-specific features from the schema, such as auto-increment fields and set/enum data types (see the sketch below). I have several gigs of absolutely mission-critical data that has to be moved over without getting scrambled. And there are many, many lines of perl and SQL code that have to be tested thoroughly. But in the end, "difficult but possible" beats "not a chance" any day.
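
For a rough idea of what that schema surgery involves (the table below is invented for illustration), the usual substitutions are SERIAL for AUTO_INCREMENT and a CHECK constraint for ENUM:

    -- MySQL original:
    --   CREATE TABLE registrations (
    --       id   INT AUTO_INCREMENT PRIMARY KEY,
    --       kind ENUM('static','dhcp','reserved') NOT NULL
    --   );

    -- PostgreSQL equivalent:
    CREATE TABLE registrations (
        id   SERIAL PRIMARY KEY,    -- stands in for AUTO_INCREMENT
        kind VARCHAR(8) NOT NULL
             CHECK (kind IN ('static','dhcp','reserved'))  -- stands in for ENUM
    );

(PostgreSQL 8.3 also has a native CREATE TYPE ... AS ENUM, if you'd rather keep the enum semantics.)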

Thursday, March 19, 2009

Patents Vs. Innovation

If you follow the tech industry at all these days - and possibly even if you don't - you can't help but hear all about patents. Whether it's Microsoft suing TomTom over ridiculous FAT patents, or Red Hat reluctantly assembling a defensive patent arsenal, the one thing that everyone agrees on is that patent law and practice have a huge impact on both the tech sector and the economy at large.

One aspect that the guys over at Techdirt have continued to hammer at is the difference between invention and innovation. Patents place significant emphasis on protecting the "rights" of the inventor (rights which exist only because of patent law - how's that for circular reasoning?), to the point where innovation, the act of making the invention actually useful, is being harmed. I've come up with a simple analogy that I think helps clarify this argument.

Imagine that you're a manager with a couple dozen employees under you. (If you actually are a manager, this should be pretty easy.) Now, odds are that most of your employees are pretty decent. They're good at their jobs, you can rely on them to produce good quality work, but they don't often come up with game changing ideas.

Except for that one guy. You know, that one guy who, if you had to pick your replacement, you'd name in a heartbeat. He's the one who doesn't just come up with a way to make a work process faster, he shuffles things around and makes that entire process disappear, freeing up everyone's time to work more on things that make money.

Now picture this: he's come up with a new version of some form that everyone has to fill out fifty times a day. The new version is faster, more accurate, and will let everyone increase their client billable hours by 20%. Great! Everyone can start using it, the company makes more money, everyone gets raises, and everyone's happy.

Except that in this case, we're modeling our little company on the patent situation we're in. Now the only person who can use the new form is your one go-to guy. Everyone else is stuck with the old form. The company's bottom line barely moves, no one gets the big raise, and everyone gets to watch the one guy do interesting work while they spend their day filling out the form they're not allowed to use because they didn't come up with it.

So I ask you this: as the manager, which set of rules would you rather try to run a company under?

Friday, January 23, 2009

Automatic Updates: Frying Pan vs Fire

Ah, yet another IT-related disaster. They make such great blog posts! This time, though, unlike my last one - a clear cautionary tale about backups worthy of the Brothers Grimm - the story of the poor boffins at Sheffield Teaching Hospitals Trust has no such simple answer.

Search through any security-related checklist, column, book, blog, or chewing gum wrapper, and one of the constants will be to make sure that all of your security patches are applied. After all, if a hole is well known enough to have a patch, you can be pretty sure the bad guys are ready to exploit it. The simplest, most effective, and most efficient way to do this is to enable whatever automated mechanism your OS has, whether it's up2date on RedHat, updatesd on Fedora, or Automatic Updates on Windows.

In the case of Sheffield, they opted to disable Automatic Updates, and were promptly rewarded with a hospital-wide virus outbreak. While the hospital was at least wise enough to design their workflows such that they were able to maintain an acceptable level of patient care, at a minimum they're throwing money out the window on cleanup efforts, including virus removal, and on secondary effects such as having to reschedule non-critical procedures.

At first glance - and in just about any other such tale - the moral would be a simple "Leave Automatic Updates on!" But there's a catch. Why were Automatic Updates disabled, you ask? As a matter of fact, until just a few days before the outbreak, they were not only enabled, but backed by a domain policy that ensured they stayed enabled, verified that patches were installed after an internal testing period, and forced a reboot to make the patches take effect, rather than leaving each machine running with the old, vulnerable code still in memory.

In exchange for their diligence, Sheffield ended up with a PC deciding to reboot in the middle of a surgery. Can you just imagine being the poor front-line helpdesk schmuck who has to explain to a surgeon why his computer suddenly decided to reboot? Trying to tell him or her, with a straight face, that it's really for the best?

Security patches are a critical part of ensuring the security of any computer system. Not applying them entails risk; however, since applying a patch will by definition change behavior somehow, applying them carries its own risk. For far too many computers (mostly, but by no means limited to, Windows boxes), sprinting along on the patch treadmill is the only line of defense against the other machines on the same network. The OS itself is brittle, with nearly any intrusion easily leveraged into total control of the entire machine. Progress is being made, such as UAC on Windows or SELinux, but that doesn't help the millions of legacy machines already out there.

Patching these days is a nasty catch-22. With every patch release, you have to guess which will be worse - the fallout from applying the patch, or the fallout from not applying it. Admins with strict requirements for both availability and security are stuck walking a narrow path, with no assurance that the path even exists.

Friday, January 2, 2009

Journalspace Gets Creamed

If you can't be a good example, then you'll just have to serve as a horrible warning. — Catherine Aird

By now, I'm sure just about everyone has heard about the disaster that has befallen the poor SOBs at journalspace.com. The short story is that the server that hosted all of the data for the blog site got hosed, losing everything. Some of the high points:

  • The data were stored on a RAID1 array - a pair of mirrored drives
  • There were no backups, or any backup system in place at all
  • The drives did not fail, but were both completely overwritten on every block
  • No conclusive root cause was found, but a recently departed sysadmin had already been caught doing "a slash-and-burn" on other systems

So in the end, it looks extremely likely that an incompetent sysadmin set the system up with no meaningful backups, then graduated to malicious sysadmin by performing a thorough wipe of the only copy of the system data as he was shown out the door. What a wonderful cornucopia of lessons can be gleaned from this one example! This is the kind of thing you expect to see as a hypothetical scenario in security textbooks, not on the front page of Slashdot.

So let's take a quick rundown of lessons learned from our hapless friends.

Backups, backups, backups.
The lack of external backups is what catapulted this from an outage and a headache for the remaining sysadmins into a practically worst-case scenario. In short, mirroring is not the same as backing up: RAID1 protects you from a drive failing, not from bad data being faithfully written to both drives.
Trust, but verify.
Just because you implicitly trust your sysadmins (otherwise they can't do their jobs) doesn't mean you shouldn't keep an eye on them. Use sudo to log commands, and monitor configurations with tools like RANCID, Puppet, or Bcfg2.
Watch the watchers.
Along the same lines, don't let one person exclusively handle any important project. One bad apple working in isolation will have a much, much easier time planting logic bombs than one who has one or two others working side by side.
Don't give them a chance to pull the trigger.
Going to fire a sysadmin? Any hint of a possibility of a chance it might get ugly? Be prepared to make sure that any and all rights that admin has are completely gone by the time they know they're getting fired. And please note that most sysadmins will take sudden revocation of their rights as a hint they're getting fired, so the chat with HR should probably happen simultaneously with at least two other trusted admins pulling rights and locking accounts.
Clean up after their messes.
Dislike a sysadmin enough to get rid of them? Then that same dislike and mistrust should extend to all of the work they've done for you. As soon as they're out the door, it's time to audit what they did. Make sure the work you didn't know they did is up to standards, and make sure to look for backdoors and time bombs.

It's too late for those poor souls at journalspace, but hopefully they'll at least serve to inspire others to fix something.