Wednesday, December 19, 2007

Coding and Complexity

First off, let me just make a quick confession - while my undergraduate degree was stamped with "Computer Science" as my major, I don't really consider myself to primarily be a programmer. Sure, I do actually spend a good number of my days mucking around with writing code (usually Perl, occasionally Ruby), but my job is really IT support, specifically networking. I deal with switches, routers, wireless, VPN, and a handful of Linux servers supporting the network with DNS, DHCP, etc. The code writing that I do is almost exclusively to support everything else, such as working on a host registration system or device monitoring scripts. The software I write is to directly address a need, rather than to be sold to address someone else's need.

That said, when I read Steve Yegge's latest rant, Code's Worst Enemy, it struck a chord with me.

I happen to hold a hard-won minority opinion about code bases. In particular I believe, quite staunchly I might add, that the worst thing that can happen to a code base is size.

Now, as someone who does not consider himself a spectacular coder by any means, I would certainly feel quite daunted by tackling a 500k line codebase by myself. On the other hand, as a professional coder, Stevey ought to be able to casually fling around great swaths of code, using advanced software repositories and indexing tools, right? But no - he feels that, all other things being equal, less is more.

One feature of large code bases that I think he gave short shrift to was the idea of complexity. He talks a little bit about how complexity certainly makes a given code base harder to work on, and how some of the automated tools that try to deal with it, such as refactoring tools, just make the problem worse by bloating the code base even more.

This is something significant in his argument, I think. In this example, we have two code bases, before and after being run through the automatic refactoring tool. The initial state has a given level of functionality, size, and (for lack of a better word) "goodness". The final state has greater size, and therefore less goodness, but identical functionality! This mirrors his stated goal of taking his existing game and rewriting it with identical functionality but less than half the lines of code.

I think the explanation boils down to this: we can only fit so much in our brains at a time. Great programmers can mentally swap in more of the big picture at once, but everyone has their limit. This limit is why we decompose programs into manageable subroutines, each of which can be understood (at least partially) in isolation from the rest. It is why we hide massive chunks of functionality behind a handful of calls into a library. The smaller the chunk we're working on, the more likely we are to fully understand it and not screw up.

From here, the trick to making sense of Stevey's size argument is realizing that there are two completely different kinds of complexity at play. If you're writing code to do, say, an FFT, you've got to know the math behind it and how it works. That's a fair bit of complexity that you've got to hold in your head, and it's going to remain constant regardless of whether you're developing in Java, Ruby, C++, Assembly or BF.

This invariant portion of the complexity is what I call inherent complexity. (Please don't tell me if that term isn't original; I know it probably isn't, but I like to pretend.) It's the piece that you can't get away from, since it's what defines the actual problem you're trying to get that hunk of copper and silicon to solve for you. It's the tax code embodied in Quicken, the rules of mathematics in Mathematica, the graph theory in Garmin and TomTom. Remove the inherent complexity from a problem, and all you've got left is a very complex, boring video game with executables instead of high scores and compiler errors instead of health damage.

If the inherent complexity were all there was to it, then knowledge of the problem domain would be all that's required. You wouldn't need a programmer to write Mathematica, just a mathematician to sit down and tell the computer everything she knows about math. Easy, right?

Sadly (or fortunately, if you make a living as a programmer) this is not the case. The person coding has to know extra details that are outside of the problem domain, like the fact that the number 0.1 cannot be represented with absolute precision in a floating point number. Or that if you accidentally tell a computer to loop forever, it will do so. Or that three different sort routines will each produce the same final product, but the memory and time requirements can vary by an order of magnitude or more - and not always in the same order, depending on the data set. Not to mention nitty-gritty language details, like dealing with pointers in C or "bless" in Perl.
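To make that first gotcha concrete, here's a quick Perl illustration (mine, not anything from Stevey's post) of what 0.1 actually looks like once it lands in a floating point variable:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # 0.1 has no exact binary floating point representation, so the stored
    # value is only an approximation of what we typed.
    my $tenth = 0.1;
    printf "0.1 printed with 20 decimal places: %.20f\n", $tenth;

    # Summing ten of them doesn't quite land on 1, either.
    my $sum = 0;
    $sum += $tenth for 1 .. 10;
    printf "0.1 summed ten times:               %.20f\n", $sum;
    print "Equal to 1? ", ( $sum == 1 ? "yes" : "no" ), "\n";

None of that has anything to do with the problem you're actually trying to solve; it's overhead you have to carry around in your head anyway.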

All of these extra layers of crap that get wrapped around the real problem are just extraneous complexity. I mean, let's be honest - learning object oriented design or unit testing may help you write code faster and with fewer bugs, but it won't help with bullet point one of the design requirements for an ERP (or online order system, or factory automation, or... ). It's all work that is, in the end, unquestionably important to creating a finished product, but any time spent working on that extraneous complexity is time not spent on the inherent complexity.

Or, to put it more bluntly, any time you spend appeasing your programming environment is time that you're not spending on solving the actual problem.

Based on this, the best development languages are ones that are fairly thin, succinct, and in general just get the hell out of your way and let you work. Go back a few decades, and compared to the alternatives of the time, this is what C was. The book that was for many years the definitive guide to C was under 300 pages long, and let the programmer almost completely ignore the messy details of things like programming in assembly. Loops and conditionals were suddenly a simple, easy mnemonic syntax.

More recently, I think this "thinness" is a huge portion of the success of Ruby on Rails. Starting from a database schema, you can literally create a functional skeleton application in minutes with just a few commands, with all of the components neatly laid out and organized, and slots already created for niceties such as porting to different databases, unit testing, and version control.

Sure, it's all stuff that any competent programmer can easily handle, but automating it frees up that many more brain cells to do whatever it is the client or employer wants to give you money for.

Monday, December 17, 2007

Frank's Law of Foreign Key Constraints

While bouncing around between a handful of typical LAMP style applications, I've come to a harsh realization of a brutal truth:

Those who do not learn proper foreign key constraints are doomed to create an incomplete, buggy implementation of them in their application.

Minus 50 million points to MySQL for creating an entire generation of web programmers with only a vague, fuzzy idea of what constraints are, by shipping versions that, for so long, either didn't have them at all or defaulted to a table type that silently ignored them.
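To make the point concrete, here's a rough sketch using Perl's DBI against a hypothetical test database (database name, user, and password are placeholders) showing the difference between letting the database enforce the relationship and quietly not doing so:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Hypothetical connection details - substitute your own database and user.
    my $dbh = DBI->connect( 'DBI:mysql:database=test', 'testuser', 'testpass',
        { RaiseError => 1 } );

    # InnoDB tables actually enforce the constraint...
    $dbh->do(q{
        CREATE TABLE customers (
            id INT NOT NULL PRIMARY KEY
        ) ENGINE=InnoDB
    });
    $dbh->do(q{
        CREATE TABLE orders (
            id          INT NOT NULL PRIMARY KEY,
            customer_id INT NOT NULL,
            FOREIGN KEY (customer_id) REFERENCES customers (id)
        ) ENGINE=InnoDB
    });

    # ...so this insert dies with a constraint violation, as it should.
    # With ENGINE=MyISAM, the FOREIGN KEY clause is parsed and silently
    # discarded, the insert succeeds, and the orphaned row becomes your
    # application's problem to notice.
    $dbh->do('INSERT INTO orders (id, customer_id) VALUES (1, 42)');

    $dbh->disconnect;

That one ENGINE= keyword is the difference between the database catching the bug for you and your application growing its own half-baked referential integrity checks.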

Wednesday, December 12, 2007

Blacklists and You

Blacklists. Whether they're for virus signatures, firewall rules, or spam filters, every security guy who's spent more than 15 minutes in the business knows them, loves them, and hates them. Coding Horror has a mostly-right article up summing it all up, titled, quite simply, Blacklists Don't Work.

On the one hand, all of the downsides he lists are dead on. Most of the reasons that we frantically run around installing anti-virus software on Windows boxes are directly traceable to horribly shortsighted design decisions made as far back as MS-DOS. (Heck, search around, and you'll still occasionally find people having problems due to 8.3 filename restrictions!) And yes, blacklists are horribly inefficient, a royal pain to maintain, and often easily bypassed. After all, there's nothing whatsoever stopping our Evil Virus Author from taking his latest malware and running it through the dozen most popular virus scanners to make sure it slips by all of them.

But really, what are the other options? Are we to truly believe that there is some magic silver bullet waiting in the wings, parked next to the car that runs on water and an Eclipse plugin that can tell when you typed ">" but meant ">="? Jeff puts forward the same idea that Microsoft has been painfully pushing for years - forcing users to run as regular users instead of as administrators all of the time. Now, to be sure, this is absolutely something worth pursuing, both for security and for general reliability. Ask anyone who maintains an open lab on a college campus how much fun it is trying to keep the right printer drivers installed and working when anyone can do anything they want on the machines!

Even this idea falls short, though. Most of those lab computers and corporate desktops, where you have site administrators who can hoard admin privs to themselves, aren't the real problem. Those computers are the ones with people babying them already, making sure passwords are strong, patches are up to date, and virus scanners are running. Sadly, it falls short when applied to Aunt Millie. She will gleefully open that email from her anonymous new best friend, follow the directions to open the encrypted zip virus, and do whatever is necessary to firmly embed the virus deep in her computer.

Even if you take away administrative rights, within a few months those same hackers will start installing programs in My Documents and using the same startup mechanisms that legit apps do. After all, it's not like you really need full system control to send spam or participate in a DoS attack. And if you do, once you get a program running on the computer, there are usually plenty of privilege escalation bugs and attacks that can get you the rest of the way, regardless of what level the user launched the program at.

The problem is that, as bad as they are, it's not quite fair to say unconditionally that blacklists don't work. They're slow, annoying, have lots of holes - in other words, they work quite horribly - and, like democracy, also happen to work better than any other workable solution out there right now. I'll agree 100% that we need to start building systems where security is just as important a design goal as reliability and profitability, but until we figure out a way to divine the intent of a given program, some form of blacklisting will always be with us.

Sunday, December 2, 2007

Shared dedicated or dedicated shared?

I like having internet at home. Sure, it's not quite the same as multiple 30M+ pipes at work, but it's plenty fast enough to waste time on youtube and settle arguments with wikipedia. These days, most people have pretty much two options for home connections with decent speed: DSL over phone lines, or cable modem over CATV lines. (At this point, I'm not really counting FIOS yet.)

Now, the primary thing that you want from an ISP is a reliable, fast internet connection. All of the other fluffy, feel-good benefits like free email addresses, little bits of web storage, etc. don't really count for much if your web pages take minutes to load. One of the little canards that DSL providers love to throw around that really, really bugs me is "DSL is dedicated! Cable is shared!"

I'm a network guy. I build and maintain 'em for a living. Now, it's true that with cable modems, the bandwidth is shared per coaxial segment among all of the customers on that segment, while each DSL customer gets to use all available bandwidth on that particular dedicated pair of lines. But guess what all those dedicated lines do? That's right, they go into a set of equipment (routers and uplinks) that are - horrors! - shared.

There isn't a network on this planet that doesn't do some level of oversubscription. Cable modem providers simply have to allocate enough bandwidth to each neighborhood loop to satisfy the actual demands, just like DSL providers have to do with their aggregation points.
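To put some back-of-the-envelope numbers on it (every figure below is invented purely for illustration, not taken from any real provider), here's the shape of the math:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # All numbers here are made up, just to show how oversubscription works.
    # Cable: 500 subscribers sold 6 Mbps each, sharing one ~38 Mbps
    # DOCSIS downstream channel on the coax segment.
    my $cable_ratio = ( 500 * 6 ) / 38;

    # DSL: 500 subscribers sold 3 Mbps each, every one with a "dedicated"
    # copper pair - all funneling into one 155 Mbps uplink at the DSLAM.
    my $dsl_ratio = ( 500 * 3 ) / 155;

    printf "Cable oversubscription on the segment: %.0f:1\n", $cable_ratio;
    printf "DSL oversubscription at the uplink:    %.0f:1\n", $dsl_ratio;

The exact ratios don't matter, and real deployments juggle them constantly; the point is that neither one is anywhere near 1:1, and neither needs to be, as long as the provider keeps capacity ahead of actual demand.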

Now, when an ISP starts advertising with promises of no hidden BitTorrent or P2P filters and no anti-criticism termination clauses - in short, the things that the Net Neutrality people have been lobbying for - then I'll care.

Wednesday, November 21, 2007

Swatting (computer security) flies with a bazooka

In every field, there are some bad ideas that just won't die. In medicine, you have ideas such as curing cancer by pushing bones around with chiropractic. In the audio world, people get suckered into searching for "oxygen free" cables to somehow make that song, recorded in 1973 onto an 8-track, sound perfect. In the field of computer security, one of my personal favorite bad ideas is the magic virus that, instead of going around doing bad things, will slip in through existing security holes and clean out other viruses, install patches, and turn on firewall rules for the poor uneducated users. Dan Geer is the latest one to tie some strings onto this zombie and make it dance around.

Now, to someone who's only vaguely familiar with computers, or is used to dealing with one or two systems, this may sound like a good idea. I mean, who wouldn't love to have a magic program swoop in and clean house like an uber-l33t Mary Poppins with a keyboard? For added fun, Dan has added in an extra twist. Rather than the usual infection vector of scanning around the network (just like the viruses that people don't like), Dan proposes that secure web sites should ask users if they want security. If they say "yes", pretend that they're obviously competent and should be trusted, and run as normal. If they say "no", pretend that they're obviously idiots and fling the magic pixie dust back at their computer that keeps the Big Bad Scary Hackers hiding under the bed.

Let's start with us poor slobs stuck actually supporting the reality of computers, often in the hundreds or thousands, rather than in the idealistic realm pundits love to live in. To us, the idea of some random local bank or knick-knack vendor actually running arbitrary code on machines we have to keep going is downright terrifying. Writing this kind of code is hard - really hard. Don't believe me? Just ask Microsoft, who managed to release a silent, unblockable patch to the automatic update system that in some cases stopped updates from being installed. And that was an update applied only to Windows XP machines at a minimum patch level - imagine trying to make something so complex perfectly reliable and secure on all patch levels of Windows 98, 2000, XP, and Vista, not to mention Mac OS and Linux!

What shall we poke at next? I know! How about the assumption that this code that gets downloaded to the poor computer is somehow safe itself? I mean, the whole purpose of this magic program is to make things safe on already infected machines - easy, right? Hah! Just ask the folks who spent millions creating the content protection scheme used in Blu-Ray about the impenetrability of BD+. (I'll give you a hint - it's been cracked.) Fundamental computer security 101 - once the OS is compromised, it's pretty much game over for any other programs running on it. Half the viruses out there disable the most popular virus scanners; if this magical security bit becomes at all popular, there's no reason to think it won't be targeted as well.

Okay, I think we have time for one more, so let's make it a good one. Let's assume for a moment that Mr. Geer manages to hire Tinkerbell, ensuring an adequate supply of pixie dust to make the magical program work as designed. How much would that actually accomplish?

  • That single transaction - secured.
  • Any time the user logs into another site with the same password - unsecured!
  • Executables sent in email or instant message links - unsecured!
  • Phishing emails telling users to type their passwords into malicious sites - unsecured!
  • Malicious sites lurking on common typos of legitimate domains - unsecured!
  • Users picking bad passwords - unsecured!

I could go on listing other things that this idea wouldn't protect against, but I think you can see the pattern. Even if this idea could somehow be made to work completely properly, it's pretty doubtful that it would make a substantial dent in the problem of securing the computers of unskilled end users.

Well, it's time for me to head off to bed, so I'll leave you with this closing thought. Let's stick with the assumption for a moment that someone does come up with some magic <make-it-secure> HTML tag you can stick onto any web page. Instead of trying to use a yes/no dialog box as an ouija board to guess whether the computer is secure or not, why not just use the damn thing on every sensitive page?

Thursday, November 15, 2007

Too much logging is almost enough

Not too long ago, I stumbled across Log Everything All the Time on the site High Scalability. The short version is that the classic method of turning on debug logging when needed to track down a problem is useless, because it implies that logging was off when the problem actually happened. Anyone who's dealt with an intermittent problem (only happens two or three times a day, or one out of a thousand transactions, or if at least six people are wearing blue jackets) in a complex system can attest to the futility of trying to divine which combination of the tens or hundreds of thousands of events flowing by per day actually blew up. This is the deadly and elusive Heisenbug, an easily startled creature which runs away from the debug switch like a cockroach from a light switch.

That particular article was written from the perspective of building a large scale application. Building a system that way has the huge advantage that you (where "you" is the programmer) get to drop debug and analysis code in at any point in the program flow. Suspect that data structures aren't getting initialized right? Print out a dump of the data structure right before the initialization routine sends it back. Garbage strings showing up in the database? Trace the entire lifecycle of the data that goes into it, with a dump before and after every piece of code that touches it.

Now that's all well and good for you lucky guys who are busy actually building systems on general purpose computers from the code up. But what about us poor schmucks who are running the networks that let those web 2.0 apps fling AJAX across the globe? By comparison, we're fighting with one hand tied behind our back while blindfolded.
  • Most network devices are, for all intents and purposes, embedded devices. This means the code running on them is far more rigidly fixed than on, say, a Linux server. Open source apps can be patched and recompiled, PHP pages can have extra print statements thrown in, and even fussy vendor binaries can be spied on with strace and tweaked with library preloading. With your typical network device, though, unless you've got a really good service contract, you're pretty much stuck with whatever you've got.
  • The built-in event logging on most network devices tends to be a bit on the anemic side. At best, you'll typically get reports of whatever the device considers to be unusual events. At worst, you'll get nothing at all. I've worked with at least one switch that taunts you by logging only that an event occurred. No details, not even what event it was - just a timestamp.
  • The state of a network is almost all transitory. Transactions on servers, even without logging, tend to leave various breadcrumbs behind - database entries, file timestamps, emails, etc. Once a state table entry in a stateful firewall expires, though, no trace is left behind.
The best remaining option is to constantly take snapshots of the state of the network and track them. Then, later on, when someone comes to you with a report of a problem from three days ago, you at least stand a fighting chance of piecing together the half dozen otherwise normal events that line up to point at the real problem.

For example, we once had a professor whose machine kept falling off the network over weekends. All of the usual troubleshooting techniques - looking for errors on the port, testing the cabling, etc. - showed everything perfectly normal. Looking through our more extensive state monitoring logs, however, we found two sets of interesting events. On Saturday morning, the port briefly lost link, and a different MAC address started showing up on it. Saturday evening, the link flickered again, and the original MAC address came back. Looking up the new address showed that it belonged to a grad student working for the professor. The professor's own assistant was unplugging his machine! Without logging all of the "normal" state of the network, we'd probably never have found the real cause.

So what information should you log? Here's what we log for network troubleshooting.
Switch FDB
This gives you the record of where all of your machines have been. More importantly, it can tell whether or not a machine was talking on a port at a specific time, rather than relying on a user's guesstimate. (See the collection sketch after this list.)
ARP Tables
Utterly indispensable for catching misconfigurations. We've all had that bozo who set his IP address to the local gateway, or to a nearby server. With this info, you already have a record of the offender, rather than having to track it down manually.
DHCP Leases
If you don't have fixed addresses everywhere, then without tracking DHCP leases, it's much harder to tell what machine a given IP address belonged to at a given time.
DHCP Fingerprints
By capturing extra information about DHCP lease requests, such as the option request list and the VCID, it is possible to do passive OS identification using techniques like those used by PacketFence. This can be particularly useful when tracking down disallowed or problematic devices, such as SOHO NAT routers.
IP Flows
An invaluable record of who talked to whom, and when. You can either grab it from your routers directly with flow exports, capture it using span ports and a package like Argus, or, even better, both. By capturing snapshots of IP flows from multiple points and looking for discrepancies, you can narrow down exactly which one of the three routers and five switches in the data path threw away a few packets. Also, your security guys will love this for helping to track down patient zero in virus outbreaks.
SNMP Traps
These are often your best insight into what the network devices think is going on. BGP session flapping, port security violations, failed login attempts - oftentimes you can find events like these being reported in traps that don't even show up in the device log files.
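To give a flavor of how little code the collection side needs, here's a bare-bones sketch of an FDB snapshot script that walks the standard BRIDGE-MIB forwarding table with Perl's Net::SNMP. The hostname and community string are placeholders, and a real collector would write to a database and deal with vendor quirks (per-VLAN community indexing on some switches, for instance) instead of just printing:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Net::SNMP;

    # Hypothetical switch and community string - substitute your own.
    my ( $session, $error ) = Net::SNMP->session(
        -hostname  => 'switch1.example.com',
        -community => 'public',
        -version   => 'snmpv2c',
    );
    die "SNMP session failed: $error\n" unless defined $session;

    # BRIDGE-MIB dot1dTpFdbPort: maps learned MAC addresses to bridge ports.
    my $fdb_oid = '1.3.6.1.2.1.17.4.3.1.2';
    my $table = $session->get_table( -baseoid => $fdb_oid );
    die "FDB walk failed: ", $session->error, "\n" unless defined $table;

    my $timestamp = scalar localtime;
    for my $oid ( keys %$table ) {
        # The MAC address is encoded as the last six sub-identifiers of the OID.
        my @octets = ( split /\./, $oid )[ -6 .. -1 ];
        my $mac = join ':', map { sprintf '%02x', $_ } @octets;
        print "$timestamp\t$mac\tport $table->{$oid}\n";
    }
    $session->close();

The ARP snapshots follow the same pattern with the ipNetToMediaPhysAddress table, and the DHCP data is mostly a matter of grabbing the lease information on a schedule.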
The final thing to remember is that however you end up collecting this information (along with whatever other data sources you can come up with for your network), make sure it all goes into a repository where you can easily drill down and jump around between data sources. You'll want to be able to look at a switch port, find the MAC address on it, look up its OS and IP in the DHCP data, and get a list of all the other machines it's talked to in the last 12 hours. Make sure you can easily browse through your data like this, and as you start playing around with it, you'll end up finding solutions to problems you didn't even know you had.

Sunday, November 11, 2007

Privacy without anonymity?

So by now, anyone who cares has no doubt heard about the comments from Donald Kerr, deputy director of national intelligence. His claim is that nowadays, the American public has no choice but to abandon the idea of anonymity, and instead simply place their faith in the government and corporations to handle their private data properly.

Give me a break! If there's a more classic example of the foxes guarding the henhouse that doesn't involve an actual farm, I certainly can't think of it.

Let's ignore for a moment the fact that so far, these foxes have proved to be horribly incapable of properly handling data. Forget for a moment that from Choicepoint selling data to thieves, to TJX utterly failing to secure their own systems, to the countless laptops and backup tapes lost or stolen, the IT industry as a whole has not given consumers much reason to sleep soundly.

An essential aspect of free speech is the ability to speak your mind without fear of reprisal. While it's a wonderful ideal to simply declare that you can't be punished for saying something unpopular, the harsh reality is that sometimes the punishment can't be avoided. Look, for example, at the whistleblower protection laws. The fact that you can report illegal activities at a company you work for without that company knowing you did the reporting is absolutely essential. Would you turn in your boss if it was going to be months or years before anything happened, and he would know what you did?

Kerr's statement is the worst kind of lie - one hidden in the middle of a bundle of truths. Do we all need to adjust our expectations and behavior regarding privacy in the age of Google? Clearly. Should people be able to expect that their government and companies will take care of their data properly? Absolutely.

But to suggest that these are an acceptable substitute for anonymity is both foolish and dangerous.

Wednesday, November 7, 2007

DRM vs Consumers...

Ah, so it begins. For some time, consumer rights advocates have been concerned about the power that Digital Rights Management (or, as some more accurately call it, Digital Restrictions Management) grants to content producers over consumers. Now, MLB has apparently decided that it really doesn't care about screwing over customers who bought DRM-locked content.

In short, MLB has decided to change the technical details of how they protect the content they sell (in this case, videos of old games you can download and burn to disc). Who cares, right? Well, it turns out that the old DRM checks back in with a central server every time you go to play a video. And, as part of the change, MLB sorta removed the magic server bits that convinced the player to show you what you paid for. All of those files you downloaded - and paid for! - are now just encrypted noise taking up space on your hard drive.

Oops.

I'd bet that, given the noise that's starting up, and the complete idiocy of what MLB has done, they'll bashfully find some way to at least look like they're making up for it - gift cards for screwed customers toward repurchasing the content (still under DRM, of course), or something equally feeble. A more interesting question is, what lessons will be learned from this on the industry side?

With Vista, Microsoft is shoving DRM features deeper and deeper into the core OS. Heck, the volume licensing arrangement already includes a component that talks to a server every 30 days, and being out of contact for too long will cripple your computer. The issues with performance, stability, application compatibility, and overall quirkiness have been enough to make resellers revolt and continue selling XP.

In both of these cases, the guys selling stuff have a de facto monopoly. No one but Microsoft gets the final say in what happens with new versions of Windows, and hanging out at your local high school baseball games just isn't the same as watching Don and Remy at Fenway. In the end, they'll be able to push through a lot of this kind of crap, and still come out reasonably well. This doesn't mean it's a good thing to build a business model around pissing off your consumers, however. Apple has had enough pressure that they're even starting to sell music without any DRM, and plenty of people have talked about Radiohead and the unencumbered downloads.

So the question now is, which lesson will the industry learn? That if you have enough momentum to not care about your customers, you can get away with DRM, or that if you offer your customers enough value, you won't need to bother with the expense of DRM in the first place?